Machine learning

We aim to understand the emergence of "intelligence" through a deep understanding of "learning" and "inference" in machine learning. In particular, we focus on the three processes that constitute human memory: "encoding," "retention," and "retrieval (decoding)." We explore the concepts in machine learning that correspond to these processes, conducting empirical research and constructing theories about them.

"Without A Theory The Facts Are Silent"
            —Friedrich August von Hayek.


In machine learning, data that humans find concrete and understandable, such as images and text, is transformed into real-valued vectors that machines can handle. Using these vectors, machines perform prediction, judgment, inference, and generation. We regard this process of converting "data" into "information" as "encoding." To "retain" the information carried by these vectors across vast amounts of training data, a machine learning model typically stores it in its internal parameters through a learning algorithm. Finally, the "retained" "information" must be flexibly "retrieved" in a form appropriate to the task at hand.


It is crucial to analyze what properties these real-valued vectors should possess as feature vectors representing the data. This field of study is known as "representation learning." For example, if each training example is represented as a 512-dimensional real vector, the arrangement of the data in 512-dimensional real space can be viewed as the arrangement (the geometric structure of retention) of the training data in the AI's memory. In that case, it may be desirable for training data with similar properties to be located in nearby memory regions.
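As a toy illustration of this geometric view, the sketch below (assuming numpy and synthetic 512-dimensional feature vectors; the "cat" and "airplane" names are purely hypothetical) measures cosine similarity to check that a slightly perturbed copy of an example lies closer to it than an unrelated example does.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
# Hypothetical 512-dimensional feature vectors for three training examples.
cat = rng.normal(size=512)
# A "similar" example: the cat vector plus a small perturbation.
cat_photo = cat + 0.1 * rng.normal(size=512)
# An unrelated example: an independent random vector.
airplane = rng.normal(size=512)

# Similar data should lie close together in the 512-dimensional space.
print(cosine_similarity(cat, cat_photo))
print(cosine_similarity(cat, airplane))
```

In a learned representation space, such similarity structure is what would let "nearby memory regions" hold semantically related training data.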

If appropriate information can be recalled from the training data for unknown inputs, accurate inference on unseen data becomes feasible. Furthermore, if the features of various real-world data formats can be captured accurately and learned from, accurate predictions for new categories or new tasks may become possible with only a small amount of data (known as few-shot learning, zero-shot learning, or in-context learning). Additionally, if the information can be retrieved in a flexible form, generating high-quality text, images, and other data may also become possible.
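One minimal way to picture "recalling appropriate information for unknown data" is nearest-neighbor prediction: the query retrieves the closest "memories" (stored feature vectors) and inherits their labels. The sketch below uses synthetic two-cluster data and a hypothetical `knn_predict` helper; it stands in for retrieval-based inference in general, not for any specific method of ours.

```python
import numpy as np

def knn_predict(query, memory_vectors, memory_labels, k=3):
    """Predict by "recalling" the k stored examples closest to the query."""
    dists = np.linalg.norm(memory_vectors - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(memory_labels[nearest], return_counts=True)
    return int(labels[np.argmax(counts)])

rng = np.random.default_rng(1)
# Hypothetical feature vectors: two clusters standing in for two classes.
class0 = rng.normal(loc=0.0, size=(20, 8))
class1 = rng.normal(loc=4.0, size=(20, 8))
memory = np.vstack([class0, class1])
labels = np.array([0] * 20 + [1] * 20)

# An unseen query near the second cluster should recall class-1 memories.
query = np.full(8, 4.0) + rng.normal(scale=0.5, size=8)
print(knn_predict(query, memory, labels))
```

Few-shot learning can be read through the same lens: with only a handful of stored examples per new class, prediction quality depends almost entirely on how well the representation places similar data near each other.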


Robustness is also an important property in biology, essential for the survival of a species: organisms have evolved by changing their morphology and traits in response to environmental change. In machine learning, robustness is the property of learning and predicting appropriately even when the training data or the data to be predicted differ from what is expected. This property is important for real-world applications of machine learning systems and is closely related to generalization. It is also connected to how deep learning responds to perturbations of the input data and to what it memorizes. We consider that analyzing robustness to such perturbations will lead to a deeper understanding of the structure and generalization of memory in deep neural networks.
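A simple way to probe robustness empirically is to measure how a model's accuracy degrades as input perturbations grow. The sketch below (a toy setup with a synthetic binary task and a hand-set linear classifier, not any particular deep network) illustrates the kind of perturbation analysis described above.

```python
import numpy as np

def accuracy(w, X, y):
    """Accuracy of the linear classifier sign(X @ w) against labels in {-1, +1}."""
    return float(np.mean(np.sign(X @ w) == y))

rng = np.random.default_rng(2)
# Synthetic binary task: the label is the sign of the first feature.
X = rng.normal(size=(200, 16))
y = np.sign(X[:, 0])
w = np.zeros(16)
w[0] = 1.0  # a classifier perfectly aligned with the true rule

# Robustness probe: accuracy under increasingly large input perturbations.
for sigma in (0.0, 0.5, 2.0):
    X_perturbed = X + rng.normal(scale=sigma, size=X.shape)
    print(sigma, accuracy(w, X_perturbed, y))
```

For a deep network the same curve can be traced with adversarial rather than random perturbations, and how quickly it falls is one window onto what the network has memorized versus generalized.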