CiNet Seminar by Dan Yamins (Stanford University) and Ko Nishino (Drexel University)

Date: Jan. 16, 2017
Time: 15:00 – 18:00
Place: CiNet 1F Conference Room
Host:  Izumi Ohzawa (PI)

This seminar is organized and supported by CiNet, with additional support from the Grant-in-Aid for Scientific Research on Innovative Areas, “Innovation in SHITSUKAN Science and Technology”.

Talk-1: 15:00 – 16:30
“Using Artificial-Intelligence-Driven Deep Neural Networks to Uncover Principles of Brain Representation and Organization”
Daniel L. K. Yamins
Assistant Professor of Psychology and Computer Science, Stanford University
Investigator, Stanford Neurosciences Institute
(Abstract below)

Talk-2: 16:30 – 18:00
“Computational Material Perception”
Ko Nishino
Professor
Associate Department Head for Graduate Affairs
Department of Computer Science, Drexel University
(Abstract below)

Abstract for Talk-1 (Yamins):
Human behavior is founded on the ability to identify meaningful entities in complex noisy data streams that constantly bombard the senses. For example, in vision, retinal input is transformed into rich object-based scenes; in audition, sound waves are transformed into words and sentences. In this talk, I will describe my work using computational models to help uncover how sensory cortex accomplishes these enormous computational feats.
The core observation underlying my work is that optimizing neural networks to solve challenging real-world artificial intelligence (AI) tasks can yield predictive models of the cortical neurons that support these tasks. I will first describe how we leveraged recent advances in AI to train a neural network that approaches human-level performance on a challenging visual object recognition task. Critically, even though this network was not explicitly fit to neural data, it is nonetheless predictive of neural response patterns in multiple areas of the visual pathway, including higher cortical areas that have long resisted modeling attempts. Intriguingly, an analogous approach turns out to be helpful for studying audition, where we recently found that neural networks optimized for word recognition and speaker identification tasks naturally predict responses in human auditory cortex to a wide spectrum of natural sound stimuli, and help differentiate poorly understood non-primary auditory cortical regions. Together, these findings suggest the beginnings of a general approach to understanding sensory processing in the brain.
I’ll give an overview of these results, explain how they fit into the historical trajectory of AI and computational neuroscience, and discuss future questions of great interest that may benefit from a similar approach.
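
To make the approach above concrete, here is a minimal sketch, in the spirit of the model-to-brain comparisons described in references 3 and 4, of how a task-optimized network can be related to neural recordings: activations of one network layer are linearly mapped to each neuron's responses with cross-validated ridge regression, and held-out correlation serves as the predictivity score. The placeholder arrays, their shapes, and the ridge readout are illustrative assumptions, not the exact pipeline used in the studies.

```python
# Sketch: linear mapping from task-optimized network features to neural data.
# The network itself is never fit to neural responses; only this readout is.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: in practice, activations of one model layer to the same
# images shown to the animal, and the recorded responses of each neuron.
n_images, n_features, n_neurons = 500, 1024, 100
layer_activations = rng.normal(size=(n_images, n_features))   # model features
neural_responses = rng.normal(size=(n_images, n_neurons))     # firing rates

X_train, X_test, Y_train, Y_test = train_test_split(
    layer_activations, neural_responses, test_size=0.25, random_state=0)

# One cross-validated ridge readout per neuron; held-out correlation is the
# predictivity score for this layer.
predictivity = []
for n in range(n_neurons):
    model = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X_train, Y_train[:, n])
    r = np.corrcoef(model.predict(X_test), Y_test[:, n])[0, 1]
    predictivity.append(r)

print(f"median held-out correlation: {np.median(predictivity):.3f}")
```

Repeating this analysis layer by layer, the question of interest is which model layers best predict which cortical areas along the sensory pathway.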

References:
1. Hong H, Yamins DLK, Majaj NJ, DiCarlo JJ (2016)
Explicit information for category-orthogonal object properties increases along the ventral stream. Nat Neurosci 19:613-622.
2. Yamins DLK, DiCarlo JJ (2016a)
Eight open questions in the computational modeling of higher sensory cortex. Curr Opin Neurobiol 37:114-120.
3. Yamins DLK, DiCarlo JJ (2016b)
Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19:356-365.
4. Yamins DLK, Hong H, Cadieu CF, Solomon EA, Seibert D, DiCarlo JJ (2014)
Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111:8619-8624.

Abstract for Talk-2 (Nishino):
Information about what an object is made of, its material, can provide crucial clues for scene understanding. If a robot, for instance, detects soft dirt or a smooth metal surface ahead, it can adjust its movement in advance. Recognizing materials solely from images, however, has proven to be a difficult problem. In this talk, I will present our research geared towards realizing computational material perception.
I will first introduce a generative approach, in which we aim to decompose the image into its building blocks: geometry, illumination, and reflectance. I will show how the space of real-world reflectance can be faithfully encoded with a novel reflectance model, which lends itself to robust Bayesian estimation of reflectance in complex real-world environments. I will then discuss a discriminative approach, in which we directly try to classify each pixel of an image into different materials. We introduce a novel intermediate representation, called visual material traits, that captures the appearance of material properties such as “smooth” and “shiny,” and use these traits to recognize materials locally, without any knowledge of the object. We further show that these material attributes can be learned from weak human perceptual supervision and, in fact, naturally arise inside a deep neural network trained end-to-end for local material recognition.
Finally, I will show how global image context, such as object and place estimates, can be integrated to modulate local material estimates. I will also discuss parallel findings in neuroscience and psychology, and how our results can provide a sound foundation for a multi-disciplinary effort to unravel the inner workings of human material perception and to realize computational material perception.
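
As a rough illustration of the discriminative, per-pixel route described above, the sketch below shows a small convolutional network that maps a local image patch to material class scores through an intermediate attribute-like layer. The label set, layer sizes, and the trait bottleneck are illustrative assumptions, not the speaker's actual architecture (which learns material traits from weak perceptual supervision).

```python
# Sketch: local material recognition from a small image patch, with an
# intermediate layer where trait-like features ("smooth", "shiny", ...) could
# be supervised or emerge. All names and sizes here are illustrative.
import torch
import torch.nn as nn

MATERIALS = ["fabric", "metal", "wood", "stone"]   # hypothetical label set

class LocalMaterialNet(nn.Module):
    def __init__(self, n_traits=16, n_classes=len(MATERIALS)):
        super().__init__()
        self.features = nn.Sequential(                 # local appearance encoder
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.traits = nn.Linear(64, n_traits)          # trait-like bottleneck
        self.classifier = nn.Linear(n_traits, n_classes)

    def forward(self, patch):
        t = torch.sigmoid(self.traits(self.features(patch)))
        return self.classifier(t), t                   # class scores, trait activations

patch = torch.rand(1, 3, 32, 32)                       # one 32x32 RGB patch
scores, traits = LocalMaterialNet()(patch)
print(MATERIALS[scores.argmax(dim=1).item()], traits.shape)
```

Sliding such a local classifier over an image yields per-pixel material estimates, which global context such as object and place estimates can then modulate, as described in the abstract.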

Ko Nishino is a full professor in the Department of Computer Science at Drexel University. He is also an adjunct professor in the Computer and Information Science Department of the University of Pennsylvania and a visiting professor at Osaka University and the National Institute of Informatics in Japan. He received a B.E. and an M.E. in Information and Communication Engineering in 1997 and 1999, respectively, and a PhD in Computer Science in 2002, all from The University of Tokyo. Before joining Drexel University in 2005, he was a Postdoctoral Research Scientist in the Computer Science Department at Columbia University. His primary research interests lie in computer vision and include appearance modeling and synthesis, geometry processing, and video analysis. He received the NSF CAREER award in 2008.