Haruo Hosoya: “Towards computational understanding of facial coding in the macaque”
12:15 〜 13:00
CiNet 1F Conference Room
Department of Dynamic Brain Imaging
Brain Information Communication Research Laboratory Group
Host PI: Okito Yamashita
Although the primate inferotemporal cortex has long been known as a neural basis of visual object representation, its precise coding seems hopelessly complicated. A potential window into this mystery is the face processing network, for which recent physiological studies have revealed multiple face-selective regions, their tight inter-connections, their region-specific tuning properties, and so on. What is crucially lacking, however, is an understanding of the computational principle underlying the face processing network. To this end, we took a first step by introducing a novel theory called the mixture of sparse coding models, inspired by both category specificity in the inferotemporal cortex and the classical sparse coding theory commonly used for earlier vision. Notably, our concrete hierarchical network with a mixture of two sparse coding submodels, one trained with face images and the other with non-face object images, explained, qualitatively and quantitatively, almost all response properties reported by Freiwald, Tsao, and Livingstone (2009) for the face-processing middle patch, including not only selectivity to the face category but also various types of tuning to detailed facial features. Not all of these could be reproduced with a simple sparse coding model, a multi-layer perceptron, or state-of-the-art convolutional neural networks. Thus, we hypothesize that facial coding in the macaque face-processing middle patch may be closely related to the mixture of sparse coding models, particularly emphasizing the role of its top-down explaining-away effect, formalized as Bayesian inference, which enables recognition of an individual part to depend strongly on the category of the whole input image. We also touch on our next step for modeling the remaining part of the face processing network, as well as the potential of modern deep generative models in this context.
The Friday Lunch Seminar is CiNet's main regular meeting series, held every week at 12:15 in the beautiful main lecture theatre on the ground floor at CiNet. The talks are typically 40 minutes long and orientated towards an inter-disciplinary audience. They are informal and social, and most people bring their own lunch to eat during the talk. They are open to anyone who is curious and wants to come, regardless of where they work.