牧野 浩史 (マキノ ヒロシ)

Makino, Hiroshi

Affiliation (Campus)

School of Medicine, Department of Physiology (Shinanomachi)

Position

Professor

 

Papers

  • Distributed representations of temporally accumulated reward prediction errors in the mouse cortex

    Makino H., Suhaimi A.

    Science Advances 11 (4) eadi4782, January 2025

    ISSN  2375-2548

     Abstract

    Reward prediction errors (RPEs) quantify the difference between expected and actual rewards, serving to refine future actions. Although reinforcement learning (RL) provides ample theoretical evidence suggesting that the long-term accumulation of these error signals improves learning efficiency, it remains unclear whether the brain uses similar mechanisms. To explore this, we constructed RL-based theoretical models and used multiregional two-photon calcium imaging in the mouse dorsal cortex. We identified a population of neurons whose activity was modulated by varying degrees of RPE accumulation. Consequently, RPE-encoding neurons were sequentially activated within each trial, forming a distributed assembly. RPE representations in mice aligned with theoretical predictions of RL, emerging during learning and being subject to manipulations of the reward function. Interareal comparisons revealed a region-specific code, with higher-order cortical regions exhibiting long-term encoding of RPE accumulation. These results present an additional layer of complexity in cortical RPE computation, potentially augmenting learning efficiency in animals.
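
    For reference, the temporal-difference form of the reward prediction error discussed above, and its accumulation over a trial, can be sketched in a few lines of Python. This is a generic tabular illustration under assumed parameters (discount factor, learning rate, trial structure), not a reproduction of the paper's models.

        # Minimal sketch of a temporal-difference reward prediction error (RPE)
        # and its accumulation across time steps within a trial.
        # Generic tabular formulation; the paper's RL models are not reproduced here.

        gamma = 0.9          # discount factor (assumed)
        alpha = 0.1          # learning rate (assumed)
        V = [0.0] * 5        # state values for a 5-state trial (assumed)

        def run_trial(states, rewards):
            """states: sequence of state indices; rewards: reward at each transition."""
            accumulated_rpe = 0.0
            for t in range(len(states) - 1):
                s, s_next = states[t], states[t + 1]
                # classic TD error: delta = r + gamma * V(s') - V(s)
                delta = rewards[t] + gamma * V[s_next] - V[s]
                V[s] += alpha * delta          # value update
                accumulated_rpe += delta       # accumulation of error signals over the trial
            return accumulated_rpe

        # Example: reward delivered only at the end of the trial.
        print(run_trial(states=[0, 1, 2, 3, 4], rewards=[0, 0, 0, 1]))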

  • Emergence of cortical network motifs for short-term memory during learning

    Chia X.W., Tan J.K., Ang L.F., Kamigaki T., Makino H.

    Nature Communications 14 (1), December 2023

     Abstract

    Learning of adaptive behaviors requires the refinement of coordinated activity across multiple brain regions. However, how neural communications develop during learning remains poorly understood. Here, using two-photon calcium imaging, we simultaneously recorded the activity of layer 2/3 excitatory neurons in eight regions of the mouse dorsal cortex during learning of a delayed-response task. Across learning, while global functional connectivity became sparser, there emerged a subnetwork comprising neurons in the anterior lateral motor cortex (ALM) and posterior parietal cortex (PPC). Neurons in this subnetwork shared a similar choice code during action preparation and formed recurrent functional connectivity across learning. Suppression of PPC activity disrupted choice selectivity in ALM and impaired task performance. Recurrent neural networks reconstructed from ALM activity revealed that PPC-ALM interactions rendered choice-related attractor dynamics more stable. Thus, learning constructs cortical network motifs by recruiting specific inter-areal communication channels to promote efficient and robust sensorimotor transformation.
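
    For readers unfamiliar with this style of analysis, a common way to estimate functional connectivity from simultaneously recorded activity is to compute pairwise correlations and threshold them into a sparse network. The short Python sketch below illustrates that generic approach on simulated data; the correlation-and-threshold choice and all numbers are assumptions, not the paper's pipeline.

        # Minimal sketch: pairwise-correlation functional connectivity and a thresholded subnetwork.
        # Illustrative only; the paper's connectivity estimation and statistics are not reproduced here.
        import numpy as np

        rng = np.random.default_rng(0)
        activity = rng.standard_normal((100, 1000))    # 100 neurons x 1000 time points (simulated)

        # Functional connectivity as the neuron-by-neuron correlation matrix.
        fc = np.corrcoef(activity)
        np.fill_diagonal(fc, 0.0)

        # Keep only strong positive couplings to define a sparse subnetwork.
        threshold = 0.1                                 # assumed cutoff
        adjacency = fc > threshold
        degree = adjacency.sum(axis=1)
        subnetwork = np.where(degree > degree.mean())[0]   # densely coupled neurons

        print(f"{adjacency.sum() // 2} edges above threshold; {subnetwork.size} hub-like neurons")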

  • Arithmetic value representation for hierarchical behavior composition

    Makino H.

    Nature Neuroscience 26 (1) 140-149, January 2023

    ISSN  1097-6256

     Abstract

    The ability to compose new skills from a preacquired behavior repertoire is a hallmark of biological intelligence. Although artificial agents extract reusable skills from past experience and recombine them in a hierarchical manner, whether the brain similarly composes a novel behavior is largely unknown. In the present study, I show that deep reinforcement learning agents learn to solve a novel composite task by additively combining representations of prelearned action values of constituent subtasks. Learning efficacy in the composite task was further augmented by the introduction of stochasticity in behavior during pretraining. These theoretical predictions were empirically tested in mice, where subtask pretraining enhanced learning of the composite task. Cortex-wide, two-photon calcium imaging revealed analogous neural representations of combined action values, with improved learning when the behavior variability was amplified. Together, these results suggest that the brain composes a novel behavior with a simple arithmetic operation of preacquired action-value representations with stochastic policies.
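
    The additive composition described above amounts to summing pre-learned action values and acting on the combined values with a stochastic policy. A toy Python sketch is given below; the actions, value numbers, and softmax temperature are all invented for illustration and do not come from the paper.

        # Toy sketch of arithmetic (additive) composition of pre-learned action values:
        # Q_composite(s, a) = Q_subtask1(s, a) + Q_subtask2(s, a). Values are invented.
        import numpy as np

        actions = ["left", "right", "lick", "hold"]

        # Pre-learned action values for two subtasks in a shared state (assumed numbers).
        q_subtask1 = np.array([0.2, 0.7, 0.1, 0.0])
        q_subtask2 = np.array([0.1, 0.0, 0.8, 0.1])

        # Composite values by simple addition of the constituent value representations.
        q_composite = q_subtask1 + q_subtask2

        # Stochastic (softmax) policy over the composite values; temperature is an assumption.
        temperature = 0.5
        prefs = q_composite / temperature
        policy = np.exp(prefs - prefs.max())
        policy /= policy.sum()

        best = actions[int(np.argmax(q_composite))]
        print(f"composite-optimal action: {best}; policy: {dict(zip(actions, policy.round(2)))}")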

  • Representation learning in the artificial and biological neural networks underlying sensorimotor integration

    Suhaimi A., Lim A.W.H., Chia X.W., Li C., Makino H.

    Science Advances 8 (22), June 2022

     Abstract

    The integration of deep learning and theories of reinforcement learning (RL) is a promising avenue to explore novel hypotheses on reward-based learning and decision-making in humans and other animals. Here, we trained deep RL agents and mice in the same sensorimotor task with high-dimensional state and action space and studied representation learning in their respective neural networks. Evaluation of thousands of neural network models with extensive hyperparameter search revealed that learning-dependent enrichment of state-value and policy representations of the task-performance-optimized deep RL agent closely resembled neural activity of the posterior parietal cortex (PPC). These representations were critical for the task performance in both systems. PPC neurons also exhibited representations of the internally defined subgoal, a feature of deep RL algorithms postulated to improve sample efficiency. Such striking resemblance between the artificial and biological networks and their functional convergence in sensorimotor integration offers new opportunities to better understand respective intelligent systems.
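
    One common way to compare artificial and biological representations, in the spirit of the comparison described above, is a cross-validated linear mapping from agent hidden activity to recorded neural activity. The Python sketch below uses ridge regression on simulated stand-in data; it is an assumed illustration of the general technique, not the paper's analysis pipeline.

        # Minimal sketch: compare artificial-network and neural representations with a
        # cross-validated linear mapping (ridge regression). Data here are simulated stand-ins.
        import numpy as np
        from sklearn.linear_model import Ridge
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(1)
        agent_units = rng.standard_normal((500, 64))    # 500 time points x 64 agent hidden units

        # Simulated "PPC" activity partially driven by the agent representation (assumed for the demo).
        weights = rng.standard_normal((64, 1))
        ppc_neuron = agent_units @ weights + 0.5 * rng.standard_normal((500, 1))

        model = Ridge(alpha=1.0)
        scores = cross_val_score(model, agent_units, ppc_neuron.ravel(), cv=5, scoring="r2")
        print(f"cross-validated R^2: {scores.mean():.2f}")    # similarity of the two representations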