ePoster

Neural Representations of Opponent Strategy Support the Adaptive Behavior of Recurrent Actor-Critics in a Competitive Game

Mitchell Ostrow, Guangyu Robert Yang, Hyojung Seo
COSYNE 2022 (2022)
Lisbon, Portugal

Abstract

In social interactions, theory of mind (ToM) is postulated to explain why humans can successfully interact with novel people, given that each individual has their own beliefs, desires, and goals. Most ToM studies design explicit models to represent other agents' mental states and use these models to inform behavior. However, it is unclear whether or how ToM could emerge through learning to act in a social environment. We sought to identify network mechanisms of ToM in a recurrent neural network trained to play a competitive neuroscience task that requires ToM. The network, trained with deep reinforcement learning, is pitted against a set of opponents who play according to a distribution of algorithmic strategies. Importantly, the network has no hand-crafted ToM mechanism and must therefore learn one in order to perform successfully. Surprisingly, our network plays adaptively against unseen strategies. We hypothesized that a structured representation of opponent strategy would emerge within the network's neural activity, which we postulated would constitute a rudimentary form of ToM. We fit a linear classifier to the network's recurrent activity and found that we could predict the opponent with 96% accuracy, even after a period in which the agent played against various randomly selected opponents. To establish that this subspace functions as a putative representation, we examined its relationship to behavior. We found that the emergence of adaptive behavior and of a robust opponent representation were significantly correlated across training. We then perturbed the neural activity in this subspace from one opponent's region to another and found that reward after the perturbation dropped almost to the level of random play, indicating that the representation is necessary for adaptive behavior. Our work demonstrates that learning adaptive social behavior is sufficient to develop a basic ToM, and it additionally provides an explanation for how neural networks implement ToM that could inspire future experiments.
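As a concrete illustration of the decoding analysis described above, the following is a minimal sketch (not the authors' code) of fitting a linear classifier to recurrent hidden states to predict opponent identity, and of shifting activity from one opponent's region of that subspace toward another's. All variable names, shapes, and the placeholder data are assumptions made for illustration; real hidden states would be recorded from the trained recurrent actor-critic.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder data standing in for recorded recurrent activity and opponent labels
# (hypothetical shapes: n_trials x hidden_dim activity, one opponent label per trial).
rng = np.random.default_rng(0)
n_trials, hidden_dim, n_opponents = 1000, 128, 4
hidden_states = rng.normal(size=(n_trials, hidden_dim))
opponent_ids = rng.integers(0, n_opponents, size=n_trials)

# Linear decoder of opponent strategy from the recurrent state.
decoder = LogisticRegression(max_iter=1000)
accuracy = cross_val_score(decoder, hidden_states, opponent_ids, cv=5).mean()
print(f"cross-validated decoding accuracy: {accuracy:.2f}")

# One simple way to perturb activity from one opponent's region toward another's:
# add the difference between the two classes' mean hidden states.
mean_a = hidden_states[opponent_ids == 0].mean(axis=0)
mean_b = hidden_states[opponent_ids == 1].mean(axis=0)
perturbed = hidden_states[opponent_ids == 0] + (mean_b - mean_a)

In the perturbation analysis described in the abstract, such shifted states would be fed back into the playing network so that the resulting change in reward can be measured.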

Unique ID: cosyne-22/neural-representations-opponent-strategy-998bbc58