Reward Learning

Discover seminars, jobs, and research tagged with reward learning across World Wide.
28 curated items: 26 seminars, 2 ePosters
Updated 11 months ago

Seminar · Neuroscience

The Neurobiology of the Addicted Brain

Panayotis K. Thanos
Department of Pharmacology & Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo
Jan 8, 2025

Seminar · Neuroscience

Decomposing motivation into value and salience

Philippe Tobler
University of Zurich
Oct 31, 2024

Humans and other animals approach reward and avoid punishment and pay attention to cues predicting these events. Such motivated behavior thus appears to be guided by value, which directs behavior towards or away from positively or negatively valenced outcomes. Moreover, it is facilitated by (top-down) salience, which enhances attention to behaviorally relevant learned cues predicting the occurrence of valenced outcomes. Using human neuroimaging, we recently separated value (ventral striatum, posterior ventromedial prefrontal cortex) from salience (anterior ventromedial cortex, occipital cortex) in the domain of liquid reward and punishment. Moreover, we investigated potential drivers of learned salience: the probability and uncertainty with which valenced and non-valenced outcomes occur. We find that the brain dissociates valenced from non-valenced probability and uncertainty, which indicates that reinforcement matters for the brain, in addition to information provided by probability and uncertainty alone, regardless of valence. Finally, we assessed learning signals (unsigned prediction errors) that may underpin the acquisition of salience. Particularly the insula appears to be central for this function, encoding a subjective salience prediction error, similarly at the time of positively and negatively valenced outcomes. However, it appears to employ domain-specific time constants, leading to stronger salience signals in the aversive than the appetitive domain at the time of cues. These findings explain why previous research associated the insula with both valence-independent salience processing and with preferential encoding of the aversive domain. More generally, the distinction of value and salience appears to provide a useful framework for capturing the neural basis of motivated behavior.

Seminar · Neuroscience

Decoding mental conflict between reward and curiosity in decision-making

Naoki Honda
Hiroshima University
Jul 9, 2023

Humans and animals are not always rational. They not only rationally exploit rewards but also explore an environment owing to their curiosity. However, the mechanism of such curiosity-driven irrational behavior is largely unknown. Here, we developed a decision-making model for a two-choice task based on the free energy principle, which is a theory integrating recognition and action selection. The model describes irrational behaviors depending on the curiosity level. We also proposed a machine learning method to decode temporal curiosity from behavioral data. By applying it to rat behavioral data, we found that the rat had negative curiosity, reflecting conservative selection sticking to more certain options and that the level of curiosity was upregulated by the expected future information obtained from an uncertain environment. Our decoding approach can be a fundamental tool for identifying the neural basis for reward–curiosity conflicts. Furthermore, it could be effective in diagnosing mental disorders.

Seminar · Neuroscience

Richly structured reward predictions in dopaminergic learning circuits

Angela J. Langdon
National Institute of Mental Health at National Institutes of Health (NIH)
May 16, 2023

Theories from reinforcement learning have been highly influential for interpreting neural activity in the biological circuits critical for animal and human learning. Central among these is the identification of phasic activity in dopamine neurons as a reward prediction error signal that drives learning in basal ganglia and prefrontal circuits. However, recent findings suggest that dopaminergic prediction error signals have access to complex, structured reward predictions and are sensitive to more properties of outcomes than learning theories with simple scalar value predictions might suggest. Here, I will present recent work in which we probed the identity-specific structure of reward prediction errors in an odor-guided choice task and found evidence for multiple predictive “threads” that segregate reward predictions, and reward prediction errors, according to the specific sensory features of anticipated outcomes. Our results point to an expanded class of neural reinforcement learning algorithms in which biological agents learn rich associative structure from their environment and leverage it to build reward predictions that include information about the specific, and perhaps idiosyncratic, features of available outcomes, using these to guide behavior in even quite simple reward learning tasks.

Seminar · Neuroscience

Obesity and Brain – Bidirectional Influences

Alain Dagher
McGill University
Apr 10, 2023

The regulation of body weight relies on homeostatic mechanisms that use a combination of internal signals and external cues to initiate and terminate food intake. Homeostasis depends on intricate communication between the body and the hypothalamus involving numerous neural and hormonal signals. However, there is growing evidence that higher-level cognitive function may also influence energy balance. For instance, research has shown that BMI is consistently linked to various brain, cognitive, and personality measures, implicating executive, reward, and attentional systems. Moreover, the rise in obesity rates over the past half-century is attributed to the affordability and widespread availability of highly processed foods, a phenomenon that contradicts the idea that food intake is solely regulated by homeostasis. I will suggest that prefrontal systems involved in value computation and motivation act to limit food overconsumption when food is scarce or expensive, but promote over-eating when food is abundant, an optimum strategy from an economic standpoint. I will review the genetic and neuroscience literature on the CNS control of body weight. I will present recent studies supporting a role of prefrontal systems in weight control. I will also present contradictory evidence showing that frontal executive and cognitive findings in obesity may be a consequence, not a cause, of increased hunger. Finally, I will review the effects of obesity on brain anatomy and function. Chronic adiposity leads to cerebrovascular dysfunction, cortical thinning, and cognitive impairment. As the most common preventable risk factor for dementia, obesity poses a significant threat to brain health. I will conclude by reviewing evidence for treatment of obesity in adults to prevent brain disease.

Seminar · Neuroscience · Recording

Dissecting the neural circuits underlying prefrontal regulation of reward and threat responsivity in a primate

Angela Roberts
Department of Physiology, Development and Neuroscience, University of Cambridge
Feb 14, 2022

Gaining insight into the overlapping neural circuits that regulate positive and negative emotion is an important step towards understanding the heterogeneity in the aetiology of anxiety and depression and developing new treatment targets. Determining the core contributions of the functionally heterogeneous prefrontal cortex to these circuits is especially illuminating given its marked dysregulation in affective disorders. This presentation will review a series of studies in a New World monkey, the common marmoset, employing pathway-specific chemogenetics, neuroimaging, neuropharmacology and behavioural and cardiovascular analysis to dissect out prefrontal involvement in the regulation of both positive and negative emotion. Highlights will include the profound shift of sensitivity away from reward and towards threat induced by localised activations within distinct regions of vmPFC, namely areas 25 and 14, as well as the opposing contributions of this region, compared to orbitofrontal and dorsolateral prefrontal cortex, in the overall responsivity to threat. Ongoing follow-up studies are identifying the distinct downstream pathways that mediate some of these effects as well as their differential sensitivity to rapidly acting anti-depressants.

Seminar · Neuroscience · Recording

Astrocytes encode complex behaviorally relevant information

Katharina Merten
Nimmerjahn Lab, Salk Institute
Jan 25, 2022

While it is generally accepted that neurons control complex behavior and brain computation, the role of non-neuronal cells in this context remains unclear. Astrocytes, glial cells of the central nervous system, exhibit complex forms of chemical excitation, most prominently calcium transients, evoked by local and projection neuron activity. In this talk, I will provide mechanistic links between astrocytes’ spatiotemporally complex activity patterns, neuronal molecular signaling, and behavior. Using a visual detection task, in vivo calcium imaging, robust statistical analyses, and machine learning approaches, my work shows that cortical astrocytes encode the animal's decision, reward, performance level, and sensory properties. Behavioral context and motor activity-related parameters strongly impact astrocyte responses. Error analysis confirms that astrocytes carry behaviorally relevant information, supporting astrocytes' complementary role to neuronal coding beyond their established homeostatic and metabolic roles.

Seminar · Neuroscience · Recording

Roles of attention and consciousness in perceptual learning

Kazuhisa Shibata
RIKEN Center for Brain Science
Dec 12, 2021

Visual perceptual learning (VPL) is defined as improved performance on a visual task due to visual experience. It was once argued that attention to a visual feature is necessary for VPL of the feature to occur. Contrary to this view, a phenomenon called task-irrelevant VPL demonstrated that VPL can occur due to exposure to a feature which is sub-threshold and task-irrelevant, and therefore unattended. A series of findings based on task-irrelevant VPL has indicated the following two mechanisms. First, attention to a feature facilitates VPL of the feature while inhibiting VPL of unattended and supra-threshold features. Second, reward paired with a feature enables VPL of the feature irrespective of whether the feature is attended or not. However, we recently found an additional twist: VPL of a task-irrelevant and supra-threshold feature embedded in a natural scene is not subject to the inhibition of attention. This new finding suggests a need to revise the current view or add a new mechanism as to how VPL occurs.

Seminar · Neuroscience · Recording

Striatal circuitry for reward learning and decision-making

Ilana Witten
Princeton University
Oct 18, 2021

Seminar · Psychology

What are the consequences of directing attention within working memory?

Evie Vergauwe
University of Geneva
Oct 7, 2021

The role of attention in working memory remains controversial, but there is some agreement on the notion that the focus of attention holds mnemonic representations in a privileged state of heightened accessibility in working memory, resulting in better memory performance for items that receive focused attention during retention. Closely related, representations held in the focus of attention are often observed to be robust and protected from degradation caused by either perceptual interference (e.g., Makovski & Jiang, 2007; van Moorselaar et al., 2015) or decay (e.g., Barrouillet et al., 2007). Recent findings indicate, however, that representations held in the focus of attention are particularly vulnerable to degradation, and thus appear to be particularly fragile rather than robust (e.g., Hitch et al., 2018; Hu et al., 2014). The present set of experiments aims at understanding the apparent paradox of information in the focus of attention having a protected vs. vulnerable status in working memory. To that end, we examined the effect of perceptual interference on memory performance for information that was held within vs. outside the focus of attention, across different ways of bringing items into the focus of attention and across different time scales.

Seminar · Neuroscience

Behavioral and neurobiological mechanisms of social cooperation

Yina Ma
Beijing Normal University
Jun 29, 2021

Human society operates on large-scale cooperation and shared norms of fairness. However, individual differences in cooperation and incentives to free-ride on others’ cooperation make large-scale cooperation fragile and can lead to reduced social welfare. Deciphering the neural codes representing potential rewards/costs for self and others is crucial for understanding social decision-making and cooperation. I will first talk about how we integrate computational modeling with functional magnetic resonance imaging to investigate the neural representation of social value and the modulation by oxytocin, a nine-amino-acid neuropeptide, in participants evaluating monetary allocations to self and other (self-other allocations). Then I will introduce our recent studies examining the neurobiological mechanisms underlying intergroup decision-making using hyper-scanning, and share with you how we alter intergroup decisions using psychological manipulations and pharmacological challenge. Finally, I will share with you our ongoing project that reveals how individual cooperation spreads through human social networks. Our results help to better understand the neurocomputational mechanism underlying interpersonal and intergroup decision-making.

Seminar · Neuroscience

Dopaminergic modulation of synaptic plasticity in learning and psychiatric disorders

Sho Yagishita
University of Tokyo
Jun 27, 2021

Transient changes in dopamine activity in response to reward and punishment are known to regulate reward-related learning. However, the cellular basis for detecting transient dopamine signaling has long been unclear. Using two-photon microscopy and optogenetics, I have shown that transient increases and decreases of dopamine modulate plasticity of dopamine D1 and D2 receptor-expressing cells in the nucleus accumbens, respectively. At the behavioral level, I found that these D1 and D2 cells cooperatively tune learning through generalization and discrimination learning. Interestingly, disturbance of dopamine signaling impaired D2 cell plasticity and discrimination learning, which was analogous to the salience misattribution seen in subjects with schizophrenia.

Seminar · Neuroscience · Recording

A reward-learning framework of knowledge acquisition: How we can integrate the concepts of curiosity, interest, and intrinsic-extrinsic rewards

Kou Murayama
Tübingen University
Jun 10, 2021

Recent years have seen a considerable surge of research on interest-based engagement, examining how and why people are engaged in activities without relying on extrinsic rewards. However, the field of inquiry has been somewhat segregated into three different research traditions which have been developed relatively independently: research on curiosity, interest, and trait curiosity/interest. The current talk sets out an integrative perspective: the reward-learning framework of knowledge acquisition. This conceptual framework takes on the basic premise of existing reward-learning models of information seeking: that knowledge acquisition serves as an inherent reward, which reinforces people’s information-seeking behavior through a reward-learning process. However, the framework reveals how the knowledge-acquisition process is sustained and boosted over a long period of time in real-life settings, allowing us to integrate the different research traditions within reward-learning models. The framework also characterizes the knowledge-acquisition process with four distinct features that are not present in the reward-learning process with extrinsic rewards: (1) cumulativeness, (2) selectivity, (3) vulnerability, and (4) under-appreciation. The talk describes some evidence from our lab supporting these claims.

Seminar · Neuroscience · Recording

Acetylcholine dynamics in the basolateral amygdala during reward learning

Marina Picciotto
Yale School of Medicine
May 26, 2021

Seminar · Neuroscience · Recording

Dynamic reward signaling in ventral basal ganglia circuits

Patricia Janak
Johns Hopkins University
Mar 3, 2021

Seminar · Neuroscience · Recording

Conflict or complement: Parallel memories control behaviour in Drosophila

Scott Waddell
University of Oxford
Feb 25, 2021

Drosophila can learn to associate odours with reward or punishment and the resulting memories direct odour-specific approach or avoidance behaviours. Recent progress has revealed a straightforward model for learning in which reinforcing dopaminergic neurons assign valence to odour representations in the neural ensemble of the mushroom bodies. Dopamine-directed synaptic depression alters the route of odour-driven activity through the mushroom body output network. This circuit configuration and the influence of internal state guide the expression of appropriate behaviour. Importantly, learned behaviour is flexible and can be updated as the fly accumulates additional experience. Our latest studies demonstrate that well-informed behaviour is guided by combining parallel conflicting and complementary memories of opposite valence.

Seminar · Neuroscience

Reward processing in psychosis: adding meanings to the findings

Suzana Kazanova
Neuroscience, Research Group Psychiatry, Center for Contextual Psychiatry, University of Leuven, Belgium
Dec 7, 2020

Much of our daily behavior is driven by rewards. The ability to learn to pursue rewarding experiences is, in fact, an essential metric of mental health. Conversely, reduced capacity to engage in adaptive goal-oriented behavior is the hallmark of apathy, and is present in psychotic disorders. The search for its underlying mechanisms has resulted in findings of profound impairments in learning from rewards and the associated blunted activation in key reward areas of the brain of patients with psychosis. An emerging research field has been relying on digital phenotyping tools and ecological momentary assessments (EMA) that map patients’ current mood, behavior and context in the flow of their daily lives. Using these tools, we have started to see a different picture of apathy, one that is exquisitely driven by the environment. First, reward sensitivity appears to be blunted by stressors, and exposure to undue chronic stress in daily life may result in apathy in those predisposed to psychosis. Second, even patients with psychosis who exhibit clinically elevated levels of apathy are perfectly capable of seeking out and enjoying social interactions in their daily life, if their environment allows them to do so. The use of digital phenotyping tools in combination with neuroimaging of apathy not only allows us to add meanings to the neurobiological findings, but could also help design rational interventions.

Seminar · Neuroscience · Recording

An inference perspective on meta-learning

Kate Rakelly
University of California Berkeley
Nov 25, 2020

While meta-learning algorithms are often viewed as algorithms that learn to learn, an alternative viewpoint frames meta-learning as inferring a hidden task variable from experience consisting of observations and rewards. From this perspective, learning to learn is learning to infer. This viewpoint can be useful in solving problems in meta-RL, which I’ll demonstrate through two examples: (1) enabling off-policy meta-learning, and (2) performing efficient meta-RL from image observations. I’ll also discuss how this perspective leads to an algorithm for few-shot image segmentation.

Seminar · Neuroscience · Recording

The role of spatiotemporal waves in coordinating regional dopamine decision signals

Arif Hamid
Howard Hughes Medical Institute
Oct 14, 2020

The neurotransmitter dopamine is essential for normal reward learning and motivational arousal processes. Indeed, these core functions are implicated in the major neurological and psychiatric dopamine disorders such as schizophrenia, substance abuse disorders/addiction and Parkinson's disease. Over the years, we have made significant strides in understanding the dopamine system across multiple levels of description, and I will focus on our recent advances in the computational description and brain circuit mechanisms that facilitate the dual role of dopamine in learning and performance. I will specifically describe our recent work imaging the activity of dopamine axons and measuring dopamine release in mice performing various behavioural tasks. We discovered wave-like spatiotemporal activity of dopamine in the striatal region, and I will argue that this pattern of activation supports a critical computational operation: spatiotemporal credit assignment to regional striatal subexperts. Our findings provide a mechanistic description for vectorizing reward prediction error signals relayed by dopamine.

Seminar · Neuroscience

The Dopamine Synapse and Learning

David Sulzer
Columbia University
Sep 28, 2020

The actions of dopamine within the striatum are central to the selection of cortical and perhaps thalamic inputs that mediate learning throughout life, including during operant conditioning, reward and avoidance learning and the establishment of motor patterns. Dysfunction of these synaptic circuits during maturation or aging underlies many neurological, psychiatric and neurodevelopmental disorders. We will discuss the biological sequences by which these synapses are altered as an animal interacts with the environment.

Seminar · Neuroscience

Corticolimbic Circuitry in Reward Learning and Pursuit

Kate Wassum
University of California, Los Angeles
Sep 22, 2020

Seminar · Neuroscience

Mechanisms of Perceptual Learning

Takeo Watanabe
Brown University
Sep 14, 2020

Perceptual learning (PL) is defined as long-term performance improvement on a perceptual task as a result of perceptual experience (Sasaki, Nanez & Watanabe, 2011, Nat Rev Neurosci). We first found that PL occurs for task-irrelevant and subthreshold features and that pairing task-irrelevant features with rewards is the key to forming task-irrelevant PL (TIPL) (Watanabe, Nanez & Sasaki, Nature, 2001; Watanabe et al, 2002, Nature Neuroscience; Seitz & Watanabe, Nature, 2003; Seitz, Kim & Watanabe, 2009, Neuron; Shibata et al, 2011, Science). These results suggest that PL occurs as a result of interactions between reinforcement and bottom-up stimulus signals (Seitz & Watanabe, 2005, TICS). On the other hand, fMRI study results indicate that lateral prefrontal cortex fails to detect and thus to suppress subthreshold task-irrelevant signals. This leads to the paradoxical effect that a signal that is below, but close to, one’s discrimination threshold ends up being stronger than suprathreshold signals (Tsushima, Sasaki & Watanabe, 2006, Science). We confirmed this mechanism with the following results: task-irrelevant learning occurs only when a presented feature is under and close to the threshold with younger individuals (Tsushima et al, 2009, Current Biol), whereas with older individuals, who tend to have less inhibitory control, task-irrelevant learning occurs with a feature whose signal is much greater than the threshold (Chang et al, 2014, Current Biol). From all of these results, we conclude that attention and reward play important but different roles in PL. I will further discuss different stages and phases in mechanisms of PL (Seitz et al, 2005, PNAS; Yotsumoto, Watanabe & Sasaki, Neuron, 2008; Yotsumoto et al, Curr Biol, 2009; Watanabe & Sasaki, 2015, Ann Rev Psychol; Shibata et al, 2017, Nat Neurosci; Tamaki et al, 2020, Nat Neurosci).

Seminar · Neuroscience

Social reward learning: basic mechanisms and therapeutic opportunities

Gul Dolen
Johns Hopkins University, School of Medicine
Sep 7, 2020

Seminar · Neuroscience

Delineating Reward/Avoidance Decision Process in the Impulsive-compulsive Spectrum Disorders through a Probabilistic Reversal Learning Task

Xiaoliu Zhang
Monash University
Jul 18, 2020

Impulsivity and compulsivity are behavioural traits that underlie many aspects of decision-making and form the characteristic symptoms of Obsessive Compulsive Disorder (OCD) and Gambling Disorder (GD). The neural underpinnings of aspects of reward and avoidance learning under the expression of these traits and symptoms are only partially understood.

The present study combined behavioural modelling and neuroimaging techniques to examine brain activity associated with critical phases of reward and loss processing in OCD and GD.

Forty-two healthy controls (HC), forty OCD and twenty-three GD participants were recruited to complete a two-session reinforcement learning (RL) task featuring a “probability switch (PS)” during scanning. In the final sample, 39 HC (20F/19M, 34 yrs ± 9.47), 28 OCD (14F/14M, 32.11 yrs ± 9.53) and 16 GD (4F/12M, 35.53 yrs ± 12.20) were included with both behavioural and imaging data available. Functional imaging was conducted using a 3.0-T SIEMENS MAGNETOM Skyra syngo MR D13C at Monash Biomedical Imaging. Each volume comprised 34 coronal slices of 3 mm thickness with 2000 ms TR and 30 ms TE. A total of 479 volumes were acquired for each participant in each session in an interleaved-ascending manner.

The standard Q-learning model was fitted to the observed behavioural data and a Bayesian model was used for parameter estimation. Imaging analysis was conducted using SPM12 (Wellcome Department of Imaging Neuroscience, London, United Kingdom) in the Matlab (R2015b) environment. Pre-processing comprised slice timing, realignment, normalization to MNI space according to the T1-weighted image, and smoothing with an 8 mm Gaussian kernel.

The frontostriatal brain circuit including the putamen and medial orbitofrontal cortex (mOFC) was significantly more active in response to receiving reward and avoiding punishment compared to receiving an aversive outcome and missing reward at 0.001 with FWE correction at cluster level, while the right insula showed greater activation in response to missing rewards and receiving punishment. Compared to healthy participants, GD patients showed significantly lower activation in the left superior frontal and posterior cingulum at 0.001 for the gain omission.

The reward prediction error (PE) signal was positively correlated with activation in several clusters extending across cortical and subcortical regions including the striatum, cingulate, bilateral insula, thalamus and superior frontal at 0.001 with FWE correction at cluster level. The GD patients showed a trend of decreased reward PE response in the right precentral extending to left posterior cingulate compared to controls at 0.05 with FWE correction.

The aversive PE signal was negatively correlated with brain activity in regions including bilateral thalamus, hippocampus, insula and striatum at 0.001 with FWE correction. Compared with the control group, the GD group showed an increased aversive PE activation in the cluster encompassing right thalamus and right hippocampus, and also the right middle frontal extending to the right anterior cingulum at 0.005 with FWE correction.

Through the reversal learning task, the study provided further support for dissociable brain circuits for distinct phases of reward and avoidance learning. Also, OCD and GD are characterised by aberrant patterns of reward and avoidance processing.
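
As a rough illustration of what fitting a standard Q-learning model to choice data can involve, the following Python sketch computes the negative log-likelihood of a choice sequence. The softmax choice rule, the parameter names `alpha` (learning rate) and `beta` (inverse temperature), and the function name are assumptions made for this example; the abstract does not specify the study's parameterisation or its Bayesian estimation procedure.

```python
import math


def q_learning_nll(choices, rewards, alpha, beta, n_options=2):
    """Return (negative log-likelihood, final Q-values) for a choice sequence.

    Hypothetical helper: assumes a softmax choice rule, which the abstract
    does not state. `choices` holds option indices, `rewards` the outcomes.
    """
    q = [0.0] * n_options
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        # Softmax probability of the observed choice (max-shifted for stability).
        scaled = [beta * value for value in q]
        top = max(scaled)
        exps = [math.exp(s - top) for s in scaled]
        p_choice = exps[choice] / sum(exps)
        nll -= math.log(p_choice)
        # Prediction error and Q-value update for the chosen option only.
        delta = reward - q[choice]
        q[choice] += alpha * delta
    return nll, q


if __name__ == "__main__":
    nll, q = q_learning_nll([0, 1, 0], [1.0, 0.0, 1.0], alpha=0.5, beta=3.0)
    print(nll, q)
```

A parameter-estimation routine (Bayesian or otherwise) would evaluate a likelihood of this kind for candidate values of `alpha` and `beta`; the per-trial `delta` is the quantity typically used as a prediction-error regressor in imaging analyses.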

Seminar · Neuroscience

Striatal circuits for reward learning and decision-making

Ilana Witten
Princeton University
Jun 10, 2020

How are actions linked with subsequent outcomes to guide choices? The nucleus accumbens (NAc), which is implicated in this process, receives glutamatergic inputs from the prelimbic cortex (PL) and midline regions of the thalamus (mTH). However, little is known about what is represented in PL or mTH neurons that project to NAc (PL-NAc and mTH-NAc). By comparing these inputs during a reinforcement learning task in mice, we discovered that i) PL-NAc preferentially represents actions and choices, ii) mTH-NAc preferentially represents cues, iii) choice-selective activity in PL-NAc is organized in sequences that persist beyond the outcome. Through computational modelling, we demonstrate that these sequences can support the neural implementation of temporal difference learning, a powerful algorithm to connect actions and outcomes across time. Finally, we test and confirm predictions of our circuit model by direct manipulation of PL-NAc neurons. Thus, we integrate experiment and modelling to suggest a neural solution for credit assignment.
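
For readers unfamiliar with temporal difference learning, here is a minimal Python sketch of tabular TD(0) on a fixed chain of states with a single reward at the end. It is a generic textbook illustration, not the circuit model from the talk; treating each state as one step of a neural sequence, along with the function name and all parameter values, is an assumption made for this example.

```python
def td0_chain(n_states=5, reward=1.0, alpha=0.1, gamma=0.9, n_episodes=500):
    """Learn state values along a fixed chain with a reward at the last state.

    Hypothetical example: each state stands in for one step of a sequence
    that bridges the delay between an action and its outcome.
    """
    values = [0.0] * n_states
    for _ in range(n_episodes):
        for state in range(n_states):
            is_last = state == n_states - 1
            r = reward if is_last else 0.0
            next_value = 0.0 if is_last else values[state + 1]
            # TD error: reward plus discounted next value minus current value.
            delta = r + gamma * next_value - values[state]
            values[state] += alpha * delta
    return values


if __name__ == "__main__":
    print(td0_chain())
```

With enough episodes the learned values approach `gamma ** k`, where `k` is the number of steps remaining before the reward, which is how the reward's value propagates backwards to earlier states.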

ePoster

Neural circuit mechanisms of bottom-up reward learning

Zachary Zeisler, Fred Stoll, Davide Folloni, Matthew G. Perich, Peter Rudebeck

COSYNE 2025

ePoster

Emerging frontal cortical representations during reward learning

Marko Tvrdic, Orsolya Folsz, Blake Russell, Simon Butt, Huriye Atilgan, Armin Lak

FENS Forum 2024