Friday Lunch Seminar: Kazufumi Hosoda: "Annual update on our group" (for CiNet members only)

Friday Lunch Seminar (English)
November 17, 2023  
12:15 〜 13:00 (JST)
at CiNet Conference Room in the CiNet bldg.

Talk Title: Annual update on our group

Kazufumi Hosoda
Senior Researcher
Neural Information Engineering Laboratory
Center for Information and Neural Networks (CiNet)
National Institutes of Information and Communications Technology (NICT)

Host PI: Kazufumi Hosoda

Abstract:
I will be sharing an update on our group’s progress over the past year. It has been two years since our group was established, and perhaps because my background is not in neuroscience, recognition of our group within CiNet may still be relatively low. I therefore plan to share information about our ongoing activities, including non-research-related ones. Honestly, I would like to present this broadly, but because the discussion involves confidential information and details about what each group member is working on, I would like to limit this seminar to in-person attendance only. I kindly ask for your understanding. Since some of you may not be able to come on the day, I have included a long abstract of the research part below:
・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・・
Our grand objective is to comprehend the essence of life, humanity, and the self by modeling brain functions that are absent in current artificial intelligence, and consequently to pioneer the development of an ultra-energy-efficient brain-like computer. Specifically, we are focusing on the element of insight (Hirameki, or the Eureka effect) in recognizing degraded images. Intriguingly, this insight connects directly to elements of life and the self, and holds potential for contributing to the development of energy-efficient computing.

Let us consider a task involving the recognition of an image of an apple so degraded that it is barely identifiable. Before recognizing the whole image as an apple, we cannot discern what each part of the image depicts. Consequently, this is not about computing the likelihood across a thousand categories, as deep learning classifiers do, and selecting the most probable one. Amidst abundant irrelevant data, we must navigate an endless array of choices to identify the “apple.” The task thus encompasses defining the problem’s framework itself. At the same time, it is not a matter of arbitrary invention like art; there is a definitive answer. Additionally, because there are no prior similar tasks, it demands a statistical or logical leap, requiring the application of a different kind of “experience.” It necessitates the integration of seemingly unrelated past tasks or activities to address this unlearned challenge. Notably, this is not the same as transfer learning, precisely because the task is unlearned; it is about employing knowledge without prior learning specific to the task. Here, “experience” can be understood as information stored so as to be applicable in any unlearned situation. This ability to cope with unexpected situations, known as “adaptability” in the integrated field of information theory, complex systems theory, and computational biology, is regarded as the essence of living systems. A similar concept was later advocated in the free-energy principle. Our group feels that this “experience” closely resembles our impression of the “self” (what are your thoughts?). Consequently, by developing artificial intelligence capable of using experience to gain insight in unlearned tasks, we believe we can directly approach the creation of an AI that possesses a sense of self and can act without task-specific learning.

What is the underlying mechanism? Past research (Murata 2014, PLoS ONE) has revealed that it involves generating missing information and creatively supplementing degraded images. It is crucial to note that understanding what is missing from, for instance, an apple is not apparent until the object is recognized as an apple. In other words, the recognition of something as an apple and the completion of the information necessary for that recognition occur simultaneously. Studies using fMRI have shown that when human creativity is at work, both the brain’s default mode network and the executive network are engaged concurrently. However, measuring insight with fMRI comes with challenges. After all, an insight is a momentary phenomenon. Moreover, as mentioned, it involves not just the brain’s visual system but potentially the entire brain. This suggests that there could be a dramatic, holistic change in the brain’s state, akin to a phase transition, just before and after that moment. Additionally, while it is feasible in experiments for participants to press a button when they have “got it,” signifying the completion of conscious thought, other brain areas might have arrived at the answer just before that. Naturally, each part of the brain functions in synchrony with the rest, implying an order within that moment, even when examined closely. Unraveling these dynamics is undoubtedly key, but it demands sophisticated techniques to achieve high temporal resolution with fMRI. We are tackling this question by further advancing the analytical methods we have previously developed (Murata 2022, J Neurosci).

In parallel with such experimental elucidation, we are also adopting an approach of reconstructing both the software and hardware aspects (in fact, this is our main approach). We started by creating a simplistic model that simulates the statistical characteristics observed in human insight into degraded images. This simulation was achieved through a probabilistic model using deep learning, with input image data identical to that used in the human experiments (Murata 2014, PLoS ONE). Since the input images used in the human experiments include categories that are not present in typical deep learning systems, we complemented them with one-shot learning based on Hebbian theory (Hosoda 2022, arXiv). As a result, the statistical characteristics were reproduced, and when we employed a Vision Transformer (ViT) instead of a Convolutional Neural Network (CNN), we observed a correlation with humans regarding the difficulty level of the degraded images (Hosoda 2023, KICSS). However, this model only simulated the statistical characteristics of the time it takes for insight to occur; it does not mean that the answer was reached through insight. Moreover, the model is entirely probabilistic and does not utilize experience. To incorporate experience, we are currently taking on the challenge using high-degree-of-freedom chaos.
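As a loose illustration of the one-shot Hebbian extension mentioned above, the sketch below adds a new category to a frozen feature extractor with a single Hebbian weight update. This is only our assumption of what such a mechanism might look like; the backbone, feature dimensionality, and function names are hypothetical placeholders, not the actual model of Hosoda 2022.

```python
# Minimal sketch of one-shot Hebbian class addition on top of a frozen
# feature extractor. Illustrative assumption only; names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

FEATURE_DIM = 512          # assumed dimensionality of backbone features
N_BASE_CLASSES = 1000      # classes the existing readout already knows

# Stand-in for a frozen pretrained backbone (e.g. a ViT or CNN encoder):
# here simply a fixed random projection of a flattened 32x32 image.
W_backbone = rng.standard_normal((FEATURE_DIM, 32 * 32))

def extract_features(image_flat):
    """Return an L2-normalised feature vector for one flattened image."""
    f = W_backbone @ image_flat
    return f / (np.linalg.norm(f) + 1e-12)

# Existing readout weights for the base classes (one row per class).
W_readout = rng.standard_normal((N_BASE_CLASSES, FEATURE_DIM)) * 0.01

def add_class_one_shot(W, example_image_flat, lr=1.0):
    """Hebbian one-shot addition: the new class's weight row is a scaled
    copy of the single example's feature vector, i.e. delta_w = lr * post * pre
    with the post-synaptic unit clamped to 1."""
    new_row = lr * extract_features(example_image_flat)
    return np.vstack([W, new_row])

def classify(W, image_flat):
    """Return the index of the highest-scoring class."""
    return int(np.argmax(W @ extract_features(image_flat)))

# Usage: add the category of a single (degraded) example, then classify it.
example = rng.random(32 * 32)            # placeholder "degraded image"
W_readout = add_class_one_shot(W_readout, example)
print("predicted class index:", classify(W_readout, example))
print("index of the newly added class:", W_readout.shape[0] - 1)
```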

To avoid any misunderstanding, let us clarify that our focus is not on the “degraded image recognition task” per se. We are interested in “difficult tasks that are unlearned.” This is because, if the degraded-image recognition task were learned, we believe current deep learning systems would already outperform humans. To test this, we first degraded a simple image library (CIFAR-100) through binarization and trained deep learning models on it. Although we have not yet directly compared the models with human performance, they answered correctly even in classifications that are presumably impossible for humans. Furthermore, when using a ViT instead of a CNN, we found that merely by learning the binarized images, the system could respond with a higher accuracy rate to the original color images (Lim 2023, presentation planned for SfN). While these studies are not directly related to insight, understanding them is essential when researching the insight involved in the degraded image recognition task.
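A rough, self-contained sketch of the binarized CIFAR-100 training setup described above is given below. The binarization threshold, grayscale conversion, small CNN, and one-epoch schedule are our own illustrative assumptions, not the actual configuration used in Lim 2023.

```python
# Minimal sketch: binarise CIFAR-100 and train a small classifier on it.
# Threshold, model, and schedule are assumptions for illustration only.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

THRESHOLD = 0.5  # assumed grey-level threshold for binarisation

binarise = transforms.Compose([
    transforms.Grayscale(),                                 # RGB -> 1 channel
    transforms.ToTensor(),                                  # values in [0, 1]
    transforms.Lambda(lambda x: (x > THRESHOLD).float()),   # hard binarisation
])

train_set = datasets.CIFAR100(root="./data", train=True, download=True,
                              transform=binarise)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)

# Deliberately small CNN; the actual study compares ViT and CNN backbones.
model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 100),
)

optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(1):                    # single epoch for illustration
    for images, labels in train_loader:
        optimiser.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimiser.step()
```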

We aspire toward a hardware implementation of this computational modeling. Suppose we managed to implement insight using self-attention, the foundation of current deep learning; if deployed at scale, it could consume vast amounts of energy worldwide, which would be unsustainable for our planet and ecosystems. We therefore explored principles of information processing through high-degree-of-freedom chaos that exploits noise or fluctuation. Specifically, we constructed the simplest encoder-decoder model as a basis of information processing. As a result, we discovered a principle that allows robust information processing with signals only about ten times the thermal noise (international patent applied for). Hypothetically, if the noise is 0.1 mV, there is potential for operation at 1 mV. Since contemporary silicon-transistor-based computers require approximately 1 V, creating a 1 mV computer could ideally yield an efficiency gain of a million times (the square of 1000). Of course, numerous challenges remain, but we have not yet encountered opinions deeming it impossible, making it arguably a dream-like brain-type computer.
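The factor-of-a-million figure follows from the common assumption that switching energy scales roughly with the square of the operating voltage; a quick back-of-the-envelope check using only the voltages quoted above:

```python
# Back-of-the-envelope check of the voltage-scaling argument, assuming
# switching energy scales roughly as C * V^2 (a standard CMOS estimate).
V_CURRENT = 1.0    # typical supply voltage of silicon transistors [V]
V_TARGET = 1e-3    # hypothetical operating voltage, ~10x a 0.1 mV noise floor [V]

energy_ratio = (V_CURRENT / V_TARGET) ** 2
print(f"Ideal energy-per-operation reduction: {energy_ratio:.0e}x")  # 1e+06x
```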

We know that all of you are very busy, but we would appreciate it if you could come if at all possible and enjoy the discussion!