Nature. 2019 Jun;570(7762):509-513.
doi: 10.1038/s41586-019-1261-9. Epub 2019 May 29.

Specialized coding of sensory, motor and cognitive variables in VTA dopamine neurons

Ben Engelhard et al. Nature. 2019 Jun.

Abstract

There is increased appreciation that dopamine neurons in the midbrain respond not only to reward [1] and reward-predicting cues [1,2], but also to other variables such as the distance to reward [3], movements [4-9] and behavioural choices [10,11]. An important question is how the responses to these diverse variables are organized across the population of dopamine neurons. Whether individual dopamine neurons multiplex several variables, or whether there are subsets of neurons that are specialized in encoding specific behavioural variables remains unclear. This fundamental question has been difficult to resolve because recordings from large populations of individual dopamine neurons have not been performed in a behavioural task with sufficient complexity to examine these diverse variables simultaneously. Here, to address this gap, we used two-photon calcium imaging through an implanted lens to record the activity of more than 300 dopamine neurons from the ventral tegmental area of the mouse midbrain during a complex decision-making task. As mice navigated in a virtual-reality environment, dopamine neurons encoded an array of sensory, motor and cognitive variables. These responses were functionally clustered, such that subpopulations of neurons transmitted information about a subset of behavioural variables, in addition to encoding reward. These functional clusters were spatially organized, with neighbouring neurons more likely to be part of the same cluster. Together with the topography between dopamine neurons and their projections, this specialization and anatomical organization may aid downstream circuits in correctly interpreting the wide range of signals transmitted by dopamine neurons.

Conflict of interest statement

Competing interests

The authors declare no competing interests.

Figures

Extended Data Figure 1. Features of the VR task, encoding model predictions, and selection of the encoding model.
a, Example screenshots of the virtual world presented to the mouse in different positions along the maze. b, Activity trace during 6 consecutive trials of an example neuron that was significantly modulated by position in the central stem. The colored strip below the trace describes the trial epochs: cue period (gray), delay period (blue), outcome period (pink). Reward delivery is denoted by a water droplet. c, ΔF/F traces for 10 example neurons during 6 consecutive trials (green). Overlaid are the predictions of the behavioral model for these trials (blue). The colored strip below each trace denotes the trial epochs: cue period (gray), delay period (blue), outcome period (pink). Reward delivery is denoted by a water droplet. d, Mean (across neurons) of percent variance explained (tested on held-out data with 5-fold cross-validation) by the final model (red) and by other models in which a variable was either removed (blue) or added (green). See Methods for descriptions of all variables that were tested. All models for which a variable was removed from the final model performed significantly worse, based on comparing R² for all neurons (p < 2×10⁻⁶, 2-sided paired t-test, n=303, Holm-Bonferroni correction for all model comparisons). For models where variables were added to those in the final model, the performance either did not differ significantly or was degraded. See Methods for a complete description of all models. e, Comparison of performance for all neurons of the final model (x-axis) and all the other models. Each panel shows the comparison with one model; significance of the 2-sided paired t-test (after Holm-Bonferroni correction) is shown in each panel. n=303 in all cases.
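The Holm-Bonferroni correction named in this caption (and in several later ones) can be illustrated with a minimal sketch. The function name and the example p-values are hypothetical and are not taken from the paper; the sketch only shows the standard step-down adjustment.

```python
def holm_bonferroni(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Keep the adjusted values non-decreasing along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for four model comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = holm_bonferroni(raw)
```

A comparison is then called significant when its adjusted p-value falls below the chosen alpha level.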
Extended Data Figure 2. Simultaneous calcium imaging and cell-attached recording in DA neurons in the VTA of Ai148×DAT::Cre mice.
a, Relative change in fluorescence (top) and cell-attached current (bottom) recorded simultaneously. b, Average spike-triggered fluorescence (average over n=126 spikes). c, Zoomed-in spike waveform for the same cell as in (a). d, Examples of bursts from 3 different DA cells, showing cell-attached current (top) and change in fluorescence (bottom). The spike times are shown with black bars under the fluorescence trace. The red horizontal bars under the current traces show the timing of NMDA puffs (see Methods).
Extended Data Figure 3. Motion correction procedure.
We developed a custom motion correction procedure to compensate for both non-rigid slow drift of the field of view (timescale: 10s of min) and non-rigid fast motion (timescale: 10s of ms). Importantly, the procedure avoids any use of interpolation, which can produce artifacts. The procedure consists of the following main steps: 1 (blue box) the entire movie is divided into non-overlapping 50 s chunks; in each chunk we perform rigid motion correction using standard cross-correlation methods (on the red channel). The template for each chunk is calculated by dividing the chunk into non-overlapping sections of 100 frames, calculating the mean image of each section, and obtaining the median of the mean images. 2 (red box) we use a non-rigid algorithm for image registration to align all the templates. The algorithm outputs shift parameters for every pixel and template. Separately, we manually draw patches that include neurons of interest in the first template. For each template, we use the shift parameters of all the pixels in each patch to estimate the average motion of the patch. We use that information to crop the patch from each 50 s chunk of the movie. 3 (orange box) we perform rigid motion correction (as above) on the concatenated patch movies, down-sample by a factor of 2 (to increase the signal strength) and then perform rigid motion correction again. 4 (green box) we extract the patch templates by using the mean projection, and hand-draw ROIs of the objects of interest. See Methods for a detailed explanation of the motion correction algorithm, and see Supplementary Video 2 for an example video before and after correction. Code available at: https://github.com/benengx/Deep-Brain-Motion-Corr.
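The template calculation in step 1 (median of the mean images of non-overlapping 100-frame sections) can be sketched as follows. This is an illustrative pure-Python sketch and not the authors' released code; the function names, the nested-list frame format, and the handling of a trailing partial section are assumptions.

```python
from statistics import median


def mean_image(frames):
    """Pixel-wise mean of a list of equally sized 2-D frames (lists of rows)."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]


def chunk_template(frames, section_len=100):
    """Median, per pixel, of the mean images of non-overlapping sections."""
    sections = [frames[i:i + section_len]
                for i in range(0, len(frames), section_len)]
    # Assumption: a trailing section shorter than section_len is dropped,
    # unless it is the only section available.
    full = [s for s in sections if len(s) == section_len] or sections
    means = [mean_image(s) for s in full]
    rows, cols = len(means[0]), len(means[0][0])
    return [[median(m[r][c] for m in means) for c in range(cols)]
            for r in range(rows)]


# Toy chunk: six 1x2 frames, sections of 2 frames (the caption uses 100).
toy_frames = [[[0, 0]], [[2, 2]], [[10, 10]], [[10, 10]], [[4, 4]], [[6, 6]]]
template = chunk_template(toy_frames, section_len=2)
```

Taking the median across section means keeps a brief motion artifact in one section (the bright middle section in the toy data) from dominating the template.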
Extended Data Figure 4. Recovered neuron locations and validation of the spatial organization of neural responses.
a, Example of lens location recovery. Coronal histological slices stained for tyrosine hydroxylase were aligned to the Allen Brain Atlas using the Wholebrain software package. The center of the lens was marked and its position in common coordinates was recovered using the software. b, Left: recovered centers of GRIN lenses from all mice (black ellipses) are shown on top of the atlas images. Right: recovered locations of all neurons that entered the clustering analysis based on an encoding model R² during the cue period >5% (n=233; see Methods for details on location recovery). Neurons are color-coded according to their cluster identity. c, Relative contributions of each behavioral variable as a function of neuron location along the A/P, M/L and D/V axes. In each row, the relative contribution of a behavioral variable is correlated with the A/P (left), M/L (middle) or D/V (right) locations. The correlation value and significance (after Holm-Bonferroni correction for all tests) are shown in the panel (n=233 in all cases). The linear fit of the entire population is shown by a black line, and linear fits of neurons belonging to individual mice (which had more than 5 neurons) are shown by gray lines. d, Statistical tests of the spatial organization of responses to different behavioral variables that account for individual differences across mice. The table lists the p-values and F-statistics obtained for 3 statistical tests for the spatial organization of the cue-period variables. The first test was a mixed-effects model which included all neurons that had a good fit to the behavioral model during the cue period (R²>5%, n=233). In this model, the relative contribution of a given variable to each neuron was the dependent variable, the A/P, M/L, D/V locations and their pairwise interactions were independent fixed effects, and the mouse identity was a random effect for the intercepts (MATLAB code: model = fitglme(Data,'variable~ml*ap*dv-ml:ap:dv+(1|mouseID)')).
For this test, the degrees of freedom for the numerator and denominator were 6 and 226, respectively. In the Field of View (FOV) tests, for every variable we averaged the relative contributions of all neurons in a given FOV (for mice that had two FOVs, we combined neurons from the two FOVs). A regression was run with the average relative contributions as the dependent variable and the A/P, M/L, D/V lens locations and their pairwise interactions as independent fixed effects (n=19). In the weighted version of the FOV test, we additionally weighted each FOV observation by the number of neurons in that FOV. For these two tests, the degrees of freedom for the numerator and denominator were 6 and 12, respectively. In all cases the listed p-values correspond to the F-test for the fixed effects.
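The degrees of freedom quoted above follow from counting fixed-effect coefficients. The sketch below assumes a residual degrees-of-freedom convention (observations minus fixed-effect coefficients) and an F-test over all fixed effects except the intercept; the function name is hypothetical.

```python
def f_test_dof(n_obs, n_main_effects=3, n_pairwise=3):
    """Numerator and denominator degrees of freedom for the fixed-effects F-test."""
    # Intercept + three main effects (ml, ap, dv) + three pairwise interactions;
    # the three-way interaction is removed by '-ml:ap:dv' in the formula.
    n_coef = 1 + n_main_effects + n_pairwise
    numerator = n_coef - 1          # all fixed effects except the intercept
    denominator = n_obs - n_coef    # residual degrees of freedom (assumed)
    return numerator, denominator


neuron_level = f_test_dof(233)  # mixed-effects model over neurons
fov_level = f_test_dof(19)      # regression over fields of view
```

Under these assumptions the counts reproduce the values stated in the caption: 6 and 226 for the neuron-level model, and 6 and 12 for the FOV-level tests.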
Extended Data Figure 5. Average activity and relative contributions of different behavioral variables for several example cells.
The panels show activity averages time-locked to different behavioral variables for 6 example cells. The percentage of relative contribution of the corresponding behavioral variable to the activity of each cell is displayed in each panel.
Extended Data Figure 6. Additional analyses of neural encoding.
a, Distributions of the negative log-likelihood of the clustering model (Fig. 3) for shuffled (gray) versus real data (red) indicate a significant fit of the clustering model. Top: Shuffling of relative contributions across variables. Bottom: Shuffling across neurons. b, Prediction of choice and accuracy from neurons in each cluster. For each neuron, decoding was performed by logistic regression using the average cue period activity (on a trial-by-trial basis) to predict choice or accuracy. Regression was performed using 10-fold cross-validation (over trials). Separate decoders were trained to predict either choice or accuracy. Weighted decoding was used to control for the different number of trials of each type (left/right choices or correct/incorrect trials; see Methods). Each panel shows a histogram of the decoding performance for a given variable (left column: choice, right column: accuracy) and a given cluster (rows). Gray vertical lines indicate 50% performance (chance level). Vertical yellow lines indicate the median of the distribution. Significance was assessed by a 2-sided Wilcoxon signed rank test and is presented after a Holm-Bonferroni correction for the 10 tests. For clusters 1 through 5, n = 74, 36, 27, 27, and 26 respectively. The predictive power of the different clusters is broadly consistent with their association with the different behavioral variables: choice was significantly predicted by neurons belonging to clusters 1 (associated primarily with kinematics, which contains the view angle component that is strongly related to choice) and 3 (associated primarily with cues, which determine choice for successful trials). The strongest predictive power for the mice’s accuracy is exhibited by cluster 5, which is primarily associated with accuracy. c, Noise correlations estimated by an alternative method. 
Here, noise correlations were estimated by calculating the increase in variance explained by the behavioral-only encoding model when the second neuron's activity was added to it as a predictor. The noise correlation estimate is shown for all neuronal pairs (n=1492) during the cue period (left) and outcome period (right). d, To investigate the possible effect of neuropil contamination on the observed relationship between pairwise correlations and distance (Fig. 4), we systematically varied the neuropil correction factor from 0 to 1 and recalculated the relationship between correlations and interneuronal distance for the different conditions. In all cases, we find a similar pattern to the one presented in the main text: 1- A significant negative slope between distance and signal and noise correlations in the cue period. 2- A significant negative slope between distance and noise correlations in the outcome period. 3- No relationship between distance and signal correlations in the outcome period. e, To investigate the relationship between task performance and neural encoding, mice were divided into 2 groups based on their task performance. The relative contributions of the behavioral variables were averaged separately for neurons belonging to the mice in each group. Consistent with modulation by reward expectation, we found that cue-related activity was stronger and reward responses were weaker in the top performing mice. Interestingly, previous reward (which does not provide useful information for task performance) was more strongly represented in the bottom performing mice (2-sided Wilcoxon signed rank test, n1=129 neurons in the top performing mice, n2=104 in the bottom performing mice, with Holm-Bonferroni correction for the 6 tests). 
f, To investigate the relationship between instantaneous performance and neural encoding, for each session, all trials were grouped into blocks of 10 consecutive trials with no overlap; these blocks were split into two groups based on whether the average performance in the block was greater or less than the median performance across all blocks in that session. The panel shows the relative contributions of all behavioral variables calculated separately for the better- or worse- performance blocks. The results did not show a significant difference for any of the variables (2-sided Wilcoxon signed rank test, n1=n2=233 neurons, with Holm-Bonferroni correction for the 6 tests), suggesting that the instantaneous performance of each mouse does not have a large effect on the strength of representation of the different variables.
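The block-based split described in panel f can be sketched as below. The function name and the toy trial outcomes are hypothetical; dropping a trailing incomplete block and leaving blocks exactly at the median out of both groups are assumptions, since the caption does not say how these cases were handled.

```python
from statistics import mean, median


def split_blocks_by_performance(correct, block_len=10):
    """Group trials into consecutive non-overlapping blocks and split by median."""
    # Assumption: a trailing block with fewer than block_len trials is dropped.
    blocks = [correct[i:i + block_len]
              for i in range(0, len(correct) - block_len + 1, block_len)]
    performance = [mean(b) for b in blocks]
    med = median(performance)
    # Assumption: blocks whose performance equals the median join neither group.
    better = [i for i, p in enumerate(performance) if p > med]
    worse = [i for i, p in enumerate(performance) if p < med]
    return performance, better, worse


# Toy session of 40 trials (1 = correct, 0 = error).
session = ([1] * 9 + [0] * 1 + [1] * 5 + [0] * 5 +
           [1] * 7 + [0] * 3 + [1] * 6 + [0] * 4)
performance, better_blocks, worse_blocks = split_blocks_by_performance(session)
```

The relative contributions would then be estimated separately from the trials in the better and worse blocks.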
Extended Data Figure 7. Validation of the clustering procedure and encoding model.
a, Summary of average relative contributions of the different behavioral variables for neurons belonging to each cluster as calculated via the approach used in the paper (no-refitting; see Methods). Left: Average relative contributions of cue period behavioral variables to neural activity for each cluster. Right: average relative contribution of reward for each cluster. b, Same as a, but for the clustering analysis performed on the contributions calculated using the refitting approach (see Methods). c, Normalized confusion matrix for the cluster identities of each neuron, obtained by comparing the clustering of the relative contributions based on either the no-refitting or the refitting approach (see Methods for a description of the 2 approaches). The main diagonal represents neurons for which the cluster identities matched (97.8%). d, Average relative contributions of clusters obtained by separately analyzing two random halves of the trials for each neuron. Correlations between the average relative contributions in each cluster across the two sets are as follows (n=5 in all cases): Position: ρ = .99, p < 8×10⁻⁵. Cues: ρ = .99, p < 4×10⁻⁴. Kinematics: ρ = .99, p < 2×10⁻⁴. Accuracy: ρ = .99, p < 3×10⁻⁴. Previous Reward: ρ = .99, p < 0.001. Reward Response: ρ = .48, p < 0.42. e, Normalized confusion matrix for the cluster identities of each neuron, obtained by clustering the two random halves of the data. The main diagonal represents neurons for which the cluster identities matched (79.1%). Note that chance level of matching is 20%. The matrix was calculated for neurons for which a cluster was assigned in the procedures for both halves of the data (>75% probability to belong to a cluster, n=91). f, Average absolute value of the correlations for all pairs of predictors across all behavioral variables during the cue period (average across all predictor pairs and mice). 
g, Average relative contributions assessed separately using 3 different approaches: 1- No refitting (NR; used in the paper). 2- No refitting + LASSO regularization (NR+L). 3- Refitting (R). Correlations between the results of the different approaches are as follows: ρ(NR,NR+L) = 1, p < 7×10⁻⁹. ρ(NR,R) = .99, p < 1×10⁻⁴. ρ(NR+L,R) = .99, p < 8×10⁻⁵ (n=6 in all cases). When omitting the reward response contributions: ρ(NR,NR+L) = 1, p < 2×10⁻⁵. ρ(NR,R) = .91, p < 0.04. ρ(NR+L,R) = .92, p < 0.03 (n=5 in all cases). Lasso regularization was applied using the ‘lasso’ function in Matlab; the mean square error (MSE) of the model was estimated using 5-fold cross-validation, and we chose the lambda value that minimized the MSE. The results with lasso regularization were almost identical to the results without regularization, suggesting that there was no significant overfitting in our model. h, Average relative contributions assessed separately using two random halves of the data. For each neuron we randomly divided all the trials where the neuron was recorded into 2 separate subsets while matching the number of rewarded and previously rewarded trials between the subsets. Each subset of trials was then used to calculate the relative contributions of the behavioral variables. (ρ = .99, p < 3×10⁻⁴ for all behavioral variables (n=6), ρ = .8, p < 0.11 when omitting the reward response contributions (n=5)). i, We tested the robustness of the clustering results by performing an alternative clustering procedure based on the predicted neuronal traces. The panel depicts the analysis pipeline for this clustering approach: after learning the regression weights for all neurons, behavioral predictors from one session were used to generate predicted activity traces for all neurons. A similarity matrix was constructed by taking the absolute correlation between the predicted traces for each neuronal pair. 
The similarity matrix was clustered using information-based clustering (see Methods) and ordered by the obtained clusters (right panel; cluster identity for each neuron depicted by a colored stripe to the right of the panel). j, Normalized confusion matrix for the cluster identities of each neuron, comparing the cluster identity obtained by clustering the relative contributions (method used in the main text; Fig. 3) and the alternative method described here (clustering the similarity matrix obtained from the predicted neuronal traces). The two clustering methods involve conceptual differences which may result in different clustering organizations. For example, the method used in Fig. 3, which clusters the relative contributions of the behavioral variables, is independent of a particular tuning for these variables, while the method presented here should be affected by such tuning (e.g. upward vs downward position ramps). Nevertheless, we find a similar overall clustering structure between the two methods, with the following main differences: 1- original clusters 3 and 5 (associated with previous reward and accuracy) are joined in a single cluster (new cluster 5). 2- Original cluster 1 (associated with kinematics) is now split into 2 clusters (new clusters 1 and 3). Further investigation of the split of the kinematics cluster showed that the neurons that split from the main kinematics cluster have stronger modulation for the view angle component of kinematics (based on the regression coefficient values). Such a split could not occur in the formulation used in the main text which combined all the kinematics components (speed, acceleration and view angle). k, Further validation of the encoding model by simulating data with known relative contributions of the different behavioral variables. 
We replaced the activity of each neuron by a simulated trace that was computed using known relative contributions of the different behavioral variables as follows: first, the predictors corresponding to each behavioral variable were summed, resulting in one predictor per variable. Each of these predictors was z-scored and multiplied by a different relative contribution (taken from the values obtained for the real data). The scaled predictors were then summed, resulting in a single vector which forms the basis of the firing rate of the simulated neuron. To this vector we added a constant in order to obtain an average firing rate close to 5 Hz (which was observed in in-vivo electrophysiological recordings). After zeroing negative values of this firing rate vector we used it to generate a spike train using a Poisson process. Finally, the spike train was convolved with an approximate GCaMP kernel (see Methods). We proceeded to estimate the relative contributions for the simulated trace using the encoding model procedure. Each panel shows the relative contributions used to simulate the traces (x-axis) and the recovered contributions (y-axis) for a given behavioral variable; the correlation between the original and recovered relative contributions and its associated p-value are denoted in each panel (n=233 in all cases).
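The simulation steps in panel k can be sketched in pure Python as follows. This is a sketch under stated assumptions and not the authors' code: the frame interval `dt`, the single-exponential kernel with time constant `tau_s`, the use of the contributions directly as scale factors in Hz, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def zscore(x):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = sum(x) / len(x)
    sd = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))
    return [(v - mu) / sd for v in x]


def poisson_sample(lam, rng):
    """Draw one Poisson count with mean lam (Knuth's method, fine for small lam)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_trace(predictors, contributions, dt=1 / 30, base_rate_hz=5.0,
                   tau_s=0.5, seed=0):
    """Scaled z-scored predictors -> rectified rate -> Poisson spikes -> kernel."""
    rng = random.Random(seed)
    scaled = [[w * v for v in zscore(p)] for p, w in zip(predictors, contributions)]
    drive = [sum(column) for column in zip(*scaled)]
    # Add a constant baseline, then zero negative firing rates.
    rate = [max(0.0, d + base_rate_hz) for d in drive]
    spikes = [poisson_sample(r * dt, rng) for r in rate]
    # Assumed kernel: decaying exponential, truncated at five time constants.
    kernel = [math.exp(-i * dt / tau_s) for i in range(int(5 * tau_s / dt))]
    trace = [sum(spikes[t - j] * kernel[j] for j in range(min(t + 1, len(kernel))))
             for t in range(len(spikes))]
    return rate, spikes, trace


# Two toy predictors (one per behavioral variable) with 300 samples each.
predictors = [[math.sin(0.1 * t) for t in range(300)],
              [float(t % 50) for t in range(300)]]
rate, spikes, trace = simulate_trace(predictors, contributions=[2.0, 1.0])
```

Because the z-scored predictors have zero mean, the added constant sets the average rate before rectification; the simulated trace can then be passed through the same encoding-model procedure to check whether the known contributions are recovered.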
Extended Data Figure 8. Evolution of neural responses throughout learning.
a, Schematic of the shaping protocol. Training consisted of 9 mazes with increasing task difficulty. In the first 5 mazes, cues were permanent and were visible from the beginning of the trial (but still became progressively bigger as the mouse approached them). From maze 6 onward, cues only appeared when the mouse approached within 10 cm of their location. From maze 7 onward, cues could also appear on the unrewarded side. Cues were randomly distributed along the cue region. The number of cues on each side was sampled from a Poisson distribution with the mean indicated for each maze. b, Task performance, model fit, and relative contributions of the behavioral variables throughout learning. The total number of neurons, the number of neurons with good model fit during the cue period (R²>5%; these were used to calculate the relative contributions of the behavioral variables during the cue period), and the number of mice analyzed in each training stage are indicated at the top. Shaded colors are s.e.m. The results showed that task performance increased steadily across the permanent cue mazes, and then dropped in the first transient cue maze, most likely due to the working memory component that is added in the transient cue mazes. The overall R² of the behavioral model increased across learning, indicating that over training, neural activity could be better explained by the measured behavioral variables. Interestingly, the relative contribution of position increased monotonically during the permanent cue mazes, but then dropped during the transient cue mazes, similar to the animals’ performance across the mazes. This is consistent with the interpretation of positional ramps as reflecting a value signal, since the expected value at each position is closely related to reward expectation for that session, and reward expectation is determined by average task performance. 
The relative contributions for cues also increased during early learning, consistent with being a reflection of the strength of the cue-reward association. Note that this value is somewhat decreased in the last maze, in which (because of the increased task difficulty) each cue has a lower predictive power with respect to reward. The relative contribution of previous reward decreased across the permanent cue mazes, then transiently increased during the first transient cue session. Since relying on previous reward is the wrong strategy in this task, this decrease in the relative contribution of previous reward may relate to animals weighting previous reward more heavily during the major steps in training when they have not yet learned the correct strategy for solving the task. The relative contribution of kinematics declined over the training procedure. This may be due to the kinematic aspect of the behavior becoming less variable over training, as the animal’s motor skills improved for VR navigation. Interestingly, the relative contribution of trial accuracy was significantly higher during the transient cue mazes than the permanent cue mazes. This result potentially suggests that DA activity is correlated with task performance preferentially when there is a working memory component. The reward response declined during the permanent cue mazes, and remained relatively consistent during the transient cue mazes; this is consistent with an RPE signal, since RPE implies negative modulation of reward responses by reward expectation (and reward expectation is related to task performance). c, Proportion of neurons that were significantly modulated by the different behavioral variables throughout learning (see Methods). Shaded colors show the 1 STD confidence intervals for a binomial distribution calculated using Jeffreys method. d, Details of the shaping procedure. The table lists the parameters of the mazes progressively used during the shaping of the behavior. 
The “permanent cues” field indicates if the cues were presented at the beginning of the trial; otherwise, each cue was presented when the mouse was 10 cm away from its location. “High- (and low) -cue-probability side mean” indicates the means of the Poisson distribution from which the number of cues presented on each side were drawn (at least 1 cue was always drawn); “none” indicates that no cues were presented for the low-probability side on any trial in that maze. The mice were automatically advanced to the next maze if the following criteria were met: 1- their performance was above a predetermined threshold (“minimum performance for advancing” field) for a given number of trials (“number of trials to calculate performance” field). 2- They completed at least n sessions in the current maze, where n is given by the “minimum number of sessions for advancing” field.
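The two advancement criteria can be expressed as a small check. The function and parameter names are hypothetical, and evaluating performance over the most recent window of trials is an assumption about how "for a given number of trials" was applied.

```python
def should_advance(trial_outcomes, sessions_completed, min_performance,
                   n_trials_window, min_sessions):
    """Return True when both shaping criteria are met for the current maze."""
    # Criterion 2: at least min_sessions sessions completed in this maze.
    if sessions_completed < min_sessions:
        return False
    # Not enough trials yet to evaluate performance over the full window.
    if len(trial_outcomes) < n_trials_window:
        return False
    # Criterion 1: fraction correct over the most recent window (assumed)
    # is above the threshold.
    recent = trial_outcomes[-n_trials_window:]
    return sum(recent) / n_trials_window > min_performance


# Toy example: 1 = correct trial, 0 = error; thresholds are illustrative only.
outcomes = [0, 1, 1, 1, 1, 1, 0, 1, 1, 1]
ready = should_advance(outcomes, sessions_completed=2, min_performance=0.75,
                       n_trials_window=8, min_sessions=2)
```

In the toy example the last 8 trials contain 7 correct choices (87.5%), so both criteria are satisfied.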
Extended Data Figure 9. Neural responses related to position, cues, and accuracy throughout learning.
a, For each behavioral variable (position, cues and accuracy), each heatmap contains all significant neurons for that maze, with each row representing the average response of one neuron (each neuron’s activity is normalized by its peak). Statistical significance is assessed by comparing the F-statistic obtained from a nested model comparison with or without each behavioral variable to a distribution of the same F-statistic obtained from shuffled data (see Methods). In the case of position and accuracy, the averaging is over trials. In the case of cues, the averaging is across cue occurrences, and the average baseline activity (in the second preceding the cue occurrence) was subtracted. The number of significant and total neurons for that variable and maze are indicated at the top of each heatmap. The height of the heatmaps for each maze is proportional to the average fraction of significant neurons (across variables) for that maze. b, Changes in tuning across learning. Left: percentage of neurons with significant responses to position that exhibited a positive slope in their average response. Middle: percentage of neurons with significant responses to cues that exhibited higher response to contralateral cues (compared to ipsilateral cues). Right: percentage of neurons with significant responses to accuracy that exhibited higher response in error trials (compared to correct trials). Shaded colors show the 1 std. dev. confidence intervals for a binomial distribution calculated using Jeffreys method. The horizontal dotted lines indicate 50% in each panel. Early in training, position-selective neurons exhibited more downward ramps than upward ramps (left panel, mazes 2 & 3). Since upward and not downward ramps are consistent with a value signal, this result suggests an evolution in the specific tuning of this variable, and not only in the strength of its representation, that is consistent with a value signal. 
Throughout training, cue-selective neurons are mostly selective for either contralateral or ipsilateral cues, and the preferential representation of contralateral cues develops late in training. This is interesting, because selectivity for contralateral vs ipsilateral cues is not a prediction of the RPE framework. Accuracy-selective neurons exhibit a strong bias towards elevated activity for error trials versus correct trials which was evident by the last permanent cue maze.
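The significance test in panel a (an F-statistic from a nested model comparison, compared against the same statistic from shuffled data) can be sketched as below. The textbook nested-model F formula and the add-one permutation p-value are standard choices assumed here; the function names and numbers are hypothetical, and the paper's Methods may define the details differently.

```python
def nested_f_statistic(rss_reduced, rss_full, p_reduced, p_full, n_obs):
    """F-statistic comparing a full model to a nested model without one variable."""
    numerator = (rss_reduced - rss_full) / (p_full - p_reduced)
    denominator = rss_full / (n_obs - p_full)
    return numerator / denominator


def shuffle_p_value(observed_f, shuffled_f):
    """Fraction of shuffled F-statistics at least as large as the observed one."""
    n_at_least = sum(1 for f in shuffled_f if f >= observed_f)
    # Add-one correction (assumed) so the estimate is never exactly zero.
    return (n_at_least + 1) / (len(shuffled_f) + 1)


# Toy numbers: residual sums of squares with and without the variable's predictors.
observed = nested_f_statistic(rss_reduced=120.0, rss_full=100.0,
                              p_reduced=3, p_full=5, n_obs=105)
p_value = shuffle_p_value(observed, shuffled_f=[1.0, 2.0, 3.0, 11.0])
```

In practice the shuffled distribution would contain many more values than the four used in this toy example.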
Extended Data Figure 10. Specific expression of GCaMP6f in midbrain dopamine neurons in the Ai148×DAT::cre mouse line.
a, Example GCaMP6f expression (green) and TH antibody staining (red). Square indicates location of high-magnification view of GCaMP expression in TH+ neurons. Upper scale bar: 500 μm. Lower scale bar: 100 μm. b, Quantification of penetrance and specificity of the Ai148×DAT::cre line. Penetrance is the percentage of TH+ neurons also expressing GCaMP (mean: 95.2%; s.e.m.: 1.52%; n=11 sections (1082 cells, 2 mice)). Specificity is the percentage of GCaMP+ neurons that are also TH+ (mean: 96.7%; s.e.m.: 0.74%; n=11 sections (1075 cells, 2 mice)). c, Examples of lesions caused by GRIN lens implants (left). Insets are higher magnification images of the regions where TH+ neurons were counted underneath the lens and compared to counts contralateral to the lens. Scale bar: 50 μm. White overlay indicates location of the lesion. Cells were counted in 50 μm by 50 μm squares from 0 to 300 μm below the lens. d, Average number of TH+ neurons per 50 μm² by distance from the bottom of the lens. Orange: average count under the lens. Gray: average count from the contralateral hemisphere. Shaded colors are s.e.m. n = 11 mice.
Figure 1. 2-photon imaging of VTA DA neurons during navigation and decision-making in virtual reality.
a, Schematic of the experimental setup. b, Schematic of an example trial. In the central stem of the maze, the mouse is presented with transient visual cues to either side (“cue period”). Turning to the arm with more cues results in reward delivery, while turning to the other arm results in a tone and a 3 s timeout. c, Fraction of right choices based on the difference in right vs left cues in each trial. Gray: all individual sessions used in this paper; black: mean, s.e.m., and logistic fit to the mean across sessions. d, Schematic of the surgical strategy. e, Fields of view for 4 example mice. Scale bar: 20 μm. f, Left: Simultaneous imaging of GCaMP and mCherry in another example animal, with 4 neurons demarcated. Right: traces from those 4 neurons during 6 consecutive trials. Bars below the traces indicate within-trial epochs: cue period (grey), delay period (blue), outcome period (pink). Water drop: reward delivery.
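The psychometric quantity in panel c (fraction of right choices as a function of the right-minus-left cue difference) can be tabulated as in the sketch below. The trial tuple layout and the function name are hypothetical; the logistic fit shown in the figure is not reproduced here.

```python
from collections import defaultdict


def fraction_right_by_cue_difference(trials):
    """Map (right cues - left cues) to the fraction of right choices."""
    counts = defaultdict(lambda: [0, 0])  # difference -> [right choices, trials]
    for n_right, n_left, chose_right in trials:
        difference = n_right - n_left
        counts[difference][0] += int(chose_right)
        counts[difference][1] += 1
    return {d: right / total for d, (right, total) in sorted(counts.items())}


# Toy trials: (number of right cues, number of left cues, chose right?).
toy_trials = [
    (5, 1, True), (6, 2, True), (4, 0, False),   # difference +4
    (1, 5, False), (0, 4, False),                # difference -4
    (3, 2, True), (2, 1, False),                 # difference +1
]
psychometric = fraction_right_by_cue_difference(toy_trials)
```

Plotting these fractions against the cue difference gives one gray curve per session in the style of panel c.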
Figure 2. Quantifying VTA DA neuron responses to specific behavioral variables in the task.
a, Neural activity in relation to the following behavioral variables: position along the central stem of the maze, kinematics (speed, acceleration, view angle), cues (contralateral or ipsilateral to the recording side), accuracy (if the mouse made the correct choice at the end of the maze), previous trial reward (if the previous trial was rewarded), and reward (versus not). For each variable, the upper panel is the average ΔF/F of an example neuron while the lower panel contains all neurons significantly modulated by that variable, with each row representing the peak-normalized average response of each neuron (grey arrow indicates example neuron within heatmap). See Methods for statistics and averaging. b, Schematic of encoding model used to quantify the relationship between behavioral variables and activity of each neuron (see Methods). Inset: predicted and actual ΔF/F across 5 trials for one neuron; additional examples in Extended Data Fig. 1c. c, Relative contribution of each behavioral variable to explained variance of the neural activity, averaged across neurons. d, Same as c, but full distribution. All error bars are s.e.m.
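One common way to turn an encoding model into relative contributions is to measure how much explained variance is lost when a variable's predictors are left out, and then normalize across variables. The sketch below illustrates that idea only; it is an assumption about the general approach and not a statement of the exact procedure in the paper's Methods. The function name and R² values are hypothetical.

```python
def relative_contributions(r2_full, r2_without):
    """Normalize the drop in explained variance per variable to sum to 1."""
    # Drop in R² when each variable is left out; negative drops are clipped to 0.
    drops = {name: max(0.0, r2_full - r2) for name, r2 in r2_without.items()}
    total = sum(drops.values())
    if total == 0:
        return {name: 0.0 for name in drops}
    return {name: drop / total for name, drop in drops.items()}


# Toy values: full-model R² and R² with each variable's predictors excluded.
contributions = relative_contributions(
    r2_full=0.30,
    r2_without={"position": 0.20, "cues": 0.25, "kinematics": 0.15},
)
```

With these toy numbers, kinematics accounts for half of the summed drop, position for a third, and cues for a sixth.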
Figure 3. Functional and spatial organization of VTA DA neurons.
a, The clustering procedure. Left: Relative contribution of each behavioral variable to the explained variance of neural activity for each neuron, before clustering (all neurons and variables are shown). Right: Same data grouped by GMM clustering (ordered within each cluster by each neuron’s probability of belonging to the cluster). Colored vertical lines on the right denote cluster identity. Neurons with <75% probability of belonging to any cluster were not assigned to a cluster (<18% of neurons unassigned). Bottom middle: BIC scores used to select the optimal number of clusters. b, Histogram of the number of behavioral variables by which neurons were significantly modulated during the cue period, for all neurons (grey) and for the subset of neurons with a significant reward response (pink). c, Recovered locations within the VTA of each neuron along the A/P and M/L axes. Cluster identity is denoted by color. d, Relative concentration of neurons belonging to each cluster across the A/P (left) and M/L (right) axes. Dashed lines indicate the 95% confidence interval (see Methods).
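The two selection rules mentioned in panel a (choosing the number of clusters by BIC, and leaving neurons with less than 75% membership probability unassigned) can be sketched as follows. This is an illustration under stated assumptions: it uses the standard BIC definition, in which lower is better, and it takes posterior probabilities as given rather than fitting a GMM. Function names, posteriors and BIC values are hypothetical and are not the paper's results.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Standard Bayesian information criterion (lower is better)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood

def select_n_clusters(bic_by_k):
    """Return the candidate number of clusters with the lowest BIC."""
    return min(bic_by_k, key=bic_by_k.get)

def assign_clusters(posteriors, threshold=0.75):
    """Assign each neuron to its most probable cluster, or None if the
    highest posterior probability falls below the threshold."""
    labels = []
    for probs in posteriors:
        best = max(range(len(probs)), key=lambda k: probs[k])
        labels.append(best if probs[best] >= threshold else None)
    return labels

# Hypothetical posteriors for three neurons and three clusters.
labels = assign_clusters([[0.9, 0.05, 0.05], [0.5, 0.3, 0.2], [0.1, 0.1, 0.8]])
# Hypothetical BIC scores for candidate cluster counts.
best_k = select_n_clusters({3: 1250.0, 4: 1190.0, 5: 1175.0, 6: 1201.0})
```

Here the second neuron stays unassigned because no cluster reaches the 75% threshold.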
Figure 4. Spatial organization of signal and noise correlations in VTA DA neuron pairs.
a, Schematic of the expanded encoding model (behavioral + network model) which includes one additional predictor compared to that in Fig. 2b: the 1st principal component of the activity of all simultaneously recorded neurons other than the neuron being modeled. b, Comparison of the performance of the behavioral-only and the behavioral + network encoding models indicates high noise correlations. c, Signal and noise correlations for all simultaneously recorded pairs during the cue period (left) and the outcome period (right) as a function of the distance between the neurons (n=1492).
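A common convention, assumed here and not confirmed by this caption, treats the signal correlation as the correlation between two neurons' behaviorally predicted activity and the noise correlation as the correlation between their residuals (actual minus predicted). The sketch below follows that convention; the paper's Methods give the definitions actually used. All names and data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def signal_and_noise_correlation(actual_1, predicted_1, actual_2, predicted_2):
    """Return (signal correlation, noise correlation) for one neuron pair."""
    resid_1 = [a - p for a, p in zip(actual_1, predicted_1)]
    resid_2 = [a - p for a, p in zip(actual_2, predicted_2)]
    return pearson(predicted_1, predicted_2), pearson(resid_1, resid_2)

# Toy pair: opposite behavioral tuning, identical shared fluctuations.
pred_a = [0.0, 1.0, 2.0, 3.0]
pred_b = [3.0, 2.0, 1.0, 0.0]
shared = [0.5, -0.5, 0.5, -0.5]
act_a = [p + s for p, s in zip(pred_a, shared)]
act_b = [p + s for p, s in zip(pred_b, shared)]
signal_r, noise_r = signal_and_noise_correlation(act_a, pred_a, act_b, pred_b)
```

The toy pair shows that the two quantities are independent: the neurons have opposite tuning (negative signal correlation) yet share all of their unexplained variability (positive noise correlation).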
Figure 5. Two separable dimensions of reward expectation modulate reward responses in DA neurons during decision-making.
a, Schematic of the Pavlovian conditioning paradigm for the data in panels b-d. b, An example cell in which reward responses are modulated by expectation, consistent with RPE. d’ compares the unexpected and expected reward responses (see Methods). c, Same as b, but average population response. d, Histogram of d’ comparisons of unexpected and expected reward for all neurons. n=8 mice and n=65 neurons. e, In the VR T-maze, two dimensions of reward expectation were quantified: trial difficulty and previous trial outcome. f, An example DA neuron modulated by both RPE dimensions. g, Same as f, but average population response. h, d’ histograms for both RPE dimensions for all reward-responsive neurons (n=232). i, Across the population, there is a significant (but noisy) correlation between the 2 dimensions of RPE. j, Reward responses in most functionally defined clusters are significantly modulated by RPE across at least 1 dimension, as shown by the average responses (left) and the d’ histograms (right; see Methods for details on significance). In all cases, shaded areas are s.e.m.
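The d’ comparisons in this figure can be illustrated with the standard discriminability index, the difference in means divided by the root mean of the two variances. This is an assumption: the exact definition used in the paper is given in its Methods and may differ. The function name and the toy responses are hypothetical.

```python
import math
import statistics

def d_prime(responses_a, responses_b):
    """Discriminability index between two sets of single-trial responses."""
    mean_diff = statistics.mean(responses_a) - statistics.mean(responses_b)
    # Root mean of the two sample variances.
    pooled_sd = math.sqrt(
        (statistics.variance(responses_a) + statistics.variance(responses_b)) / 2.0
    )
    return mean_diff / pooled_sd

# Hypothetical single-trial reward responses, made up for illustration only.
unexpected = [1.0, 2.0, 3.0]
expected = [0.0, 1.0, 2.0]
value = d_prime(unexpected, expected)
```

A positive value indicates a larger response to unexpected than to expected reward, the direction consistent with RPE coding.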

References

    1. Cohen JY, Haesler S, Vong L, Lowell BB & Uchida N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88 (2012).
    2. Schultz W, Dayan P & Montague PR. A Neural Substrate of Prediction and Reward. Science 275, 1593–1599 (1997).
    3. Howe MW, Tierney PL, Sandberg SG, Phillips PEM & Graybiel AM. Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500, 575–579 (2013).
    4. Howe MW & Dombeck DA. Rapid signalling in distinct dopaminergic axons during locomotion and reward. Nature 535, 505–510 (2016).
    5. Barter JW et al. Beyond reward prediction errors: the role of dopamine in movement kinematics. Front. Integr. Neurosci. 9, (2015).
