Timing and Value Signals in the Anterior Cingulate Cortex under a Reinforcement Learning Framework
Abstract
During foraging, animals must decide whether to stay in a depleting resource or leave it for a potentially better source of reward. Prior studies implicate the anterior cingulate cortex (ACC) in foraging decisions: within single trials, ACC activity increases immediately preceding foraging decisions, and across trials these dynamics are modulated as the value of staying in the patch declines toward the average reward rate, consistent with the general predictions of optimal foraging theory (the marginal value theorem, MVT). Additionally, the ACC has been proposed to track action-value space and mediate adaptive behavior. However, animal behavior often deviates from MVT-predicted optimality, and a class of learning algorithms, temporal difference (TD) reinforcement learning, has been proposed to describe the neural computations underlying intertemporal decision making. Yet neural activity proposed to represent value has not been rigorously contrasted with the values predicted by TD learning operating over a conventional time-state space. This work aims to characterize temporal decision-making behavior in a novel paradigm that allows a continuously evolving decision to be investigated in the ACC under TD-based algorithms. More broadly, it would afford a means of distinguishing among the timing algorithms and representational architectures the brain uses to learn the values of opposing actions as they evolve through time.
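The two quantities contrasted in this work can be sketched concretely. The minimal illustration below assumes a patch whose reward depletes exponentially with time in patch, a discretized within-patch time-state space, and tabular TD(0); it learns the value of staying at each time step and computes an MVT-style leaving time (when the instantaneous reward falls to an assumed average reward rate). All parameter values and names (e.g., avg_rate, decay) are illustrative assumptions, not quantities estimated in this work.

```python
# Illustrative sketch: tabular TD(0) over a discretized time-in-patch state
# space, with an MVT-style leave rule. Parameters are placeholders.
import numpy as np

n_states = 20          # discretized time-in-patch states
gamma = 0.95           # temporal discount factor
alpha = 0.1            # learning rate
r0, decay = 1.0, 0.8   # initial patch reward and per-step depletion
avg_rate = 0.3         # assumed long-run average reward rate of the environment

V = np.zeros(n_states)                        # value of staying, by time in patch
rewards = r0 * decay ** np.arange(n_states)   # depleting reward schedule

for episode in range(500):
    for t in range(n_states - 1):
        # TD(0) update: V(t) <- V(t) + alpha * [r(t) + gamma * V(t+1) - V(t)]
        td_error = rewards[t] + gamma * V[t + 1] - V[t]
        V[t] += alpha * td_error

# MVT-style leaving time: first state at which the instantaneous reward
# drops below the assumed average reward rate of the environment.
leave_mvt = int(np.argmax(rewards < avg_rate))
print("TD value of staying, by time in patch:", np.round(V, 2))
print("MVT-predicted leave time (state index):", leave_mvt)
```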
Neuroscience PhD candidate at the Johns Hopkins School of Medicine
© 2023 Ziyi Guo