Exploitation-exploration dilemma: toward an algorithmic mapping of adaptive behavior in the human prefrontal cortex

Abstract: Adverse choice outcomes may be used either to continuously improve an ongoing behavioral strategy or to trigger its reset, allowing for the active exploration of novel environments. Resolving this so-called exploitation-exploration dilemma is critical for cognitive flexibility and depends on the prefrontal cortex. Little is known, however, about the neuro-computational mechanisms implementing these explore/exploit decisions. In this talk, I will present a series of recent studies from our group that combine intracranial electrophysiological recordings (iEEG), stroke lesion-symptom mapping and computational modelling of decision-making to investigate the role of the anterior dorsal insula (AI) and the medial prefrontal cortex (mPFC) in strategic explore/exploit decisions. Our results reveal that the ventral mPFC continuously tracks the reliability of the current strategy, proactively shaping how upcoming outcomes are encoded, either as learning signals or as triggers for exploration. In contrast, the dorsal mPFC encodes choice outcomes according to this reliability signal, mediating the tradeoff between adapting and resetting the ongoing strategy. We further show that the processing of negative outcomes during explore/exploit decisions relies on a distinct neural circuit that is selectively disrupted by insular lesions, highlighting the insula's specific contribution: patients with AI lesions fail to learn from negative outcomes, and instead misinterpret them as a signal to bypass, rather than adapt, the current strategy. Finally, I will introduce the strategy inference model, a novel computational framework accounting for human adaptability in volatile environments. It outperforms previous models by forgoing the gradual adaptation of strategies through state-action reinforcement learning and instead using decision outcomes solely as signals for updating strategy reliability.
I will present clear behavioral markers and early human iEEG PFC recordings supporting the broad use of inference over strategies. Taken together, our results establish the importance of direct inference over abstract strategy spaces for flexible adaptation in humans.