Midbrain dopamine neurons are well known for their strong responses to rewards and their critical role in positive motivation. It has become increasingly clear, however, that dopamine neurons also transmit signals related to salient but non-rewarding experiences such as aversive and alerting events. Here we review recent advances in understanding the reward and non-reward functions of dopamine.

Based on this data, we propose that dopamine neurons come in multiple types that are connected with distinct brain networks and have distinct roles in motivational control. Some dopamine neurons encode motivational value, supporting brain networks for seeking, evaluation, and value learning. Others encode motivational salience, supporting brain networks for orienting, cognition, and general motivation.

Both types of dopamine neurons are augmented by an alerting signal involved in rapid detection of potentially important sensory cues.

We hypothesize that these dopaminergic pathways for value, salience, and alerting cooperate to support adaptive behavior.

The neurotransmitter dopamine (DA) has a crucial role in motivational control: in learning what things in the world are good and bad, and in choosing actions to gain the good things and avoid the bad things. The major sources of DA in the cerebral cortex and in most subcortical areas are the DA-releasing neurons of the ventral midbrain, located in the substantia nigra pars compacta (SNc) and ventral tegmental area (VTA) (Bjorklund and Dunnett). In their tonic mode, DA neurons maintain a steady, baseline level of DA in downstream neural structures that is vital for enabling the normal functions of neural circuits (Schultz). In their phasic mode, DA neurons sharply increase or decrease their firing rates for brief intervals measured in milliseconds, causing large changes in DA concentrations in downstream structures lasting for several seconds (Schultz). As a result, these phasic DA reward signals have taken on a prominent role in theories about the functions of cortical and subcortical circuits and have become the subject of intense neuroscience research.

In the first part of this review we will introduce the conventional theory of phasic DA reward signals and will review recent advances in understanding their nature and their control over neural processing and behavior.

In contrast to the accepted role of DA in reward processing, there has been considerable debate over the role of phasic DA activity in processing non-rewarding events. Some theories suggest that DA neuron phasic responses primarily encode reward-related events (Schultz; Ungless), while others suggest that DA neurons transmit additional non-reward signals related to surprising, novel, salient, and even aversive experiences (Redgrave et al.).

In the second part of this review we will discuss a series of studies that have put these theories to the test and have revealed much about the nature of non-reward signals in DA neurons. In particular, these studies provide evidence that DA neurons are more diverse than previously thought. Rather than encoding a single homogeneous motivational signal, DA neurons come in multiple types that encode reward and non-reward events in different manners.

This poses a problem for general theories which seek to identify dopamine with a single neural signal or motivational mechanism. To remedy this dilemma, in the final part of this review we propose a new hypothesis to explain the presence of multiple types of DA neurons, the nature of their neural signals, and their integration into distinct brain networks for motivational control.

Our basic proposal is as follows.

One type of DA neuron encodes motivational value: it is excited by rewarding events and inhibited by aversive events. These neurons support brain systems for seeking goals, evaluating outcomes, and value learning. A second type of DA neuron encodes motivational salience: it is excited by both rewarding and aversive events. These neurons support brain systems for orienting, cognitive processing, and motivational drive.

In addition to their value and salience activity, both types of DA neurons also transmit an alerting signal triggered by unexpected sensory cues of high potential importance.
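As a rough illustration of the proposal above, the two coding schemes can be contrasted in a toy model. This is only a sketch: the function names, the scalar `outcome` variable (positive for rewarding events, negative for aversive events), the additive `alerting` term, and the linear and absolute-value forms are all assumptions made for illustration, not quantities measured in the studies reviewed here.

```python
def value_coding(outcome: float, alerting: float = 0.0) -> float:
    """Hypothetical value-coding response: its sign follows outcome valence."""
    # Excited (positive) by rewards, inhibited (negative) by aversive events.
    return outcome + alerting


def salience_coding(outcome: float, alerting: float = 0.0) -> float:
    """Hypothetical salience-coding response: grows with outcome magnitude."""
    # Excited by both rewarding and aversive events, regardless of sign.
    return abs(outcome) + alerting


if __name__ == "__main__":
    for label, outcome in [("reward", 1.0), ("aversive", -1.0), ("neutral", 0.0)]:
        print(label, value_coding(outcome), salience_coding(outcome))
```

In this toy model the two neuron types respond identically to rewards and differ only in their response to aversive events, which is the distinction the proposal draws.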

Together, we hypothesize that these value, salience, and alerting signals cooperate to coordinate downstream brain structures and control motivated behavior.

Dopamine has long been known to be important for reinforcement and motivation of actions. Drugs that interfere with DA transmission interfere with reinforcement learning, while manipulations that enhance DA transmission, such as brain stimulation and addictive drugs, often act as reinforcers (Wise). DA transmission is crucial for creating a state of motivation to seek rewards (Berridge and Robinson; Salamone et al.).

One hypothesis about how DA supports reinforcement learning is that it adjusts the strength of synaptic connections between neurons.

This mechanism would allow an organism to learn the best choice of actions to gain rewards, given sufficient trial-and-error experience. Consistent with this hypothesis, dopamine has a potent influence on synaptic plasticity in numerous brain regions (Surmeier et al.).
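The idea that DA adjusts synaptic strength is often formalized as a "three-factor" rule, in which a connection changes only when presynaptic activity, postsynaptic activity, and a dopamine signal coincide. The sketch below is one minimal way to write such a rule; the function name, the learning rate `lr`, and the simple multiplicative form are illustrative assumptions rather than a model taken from the studies cited here.

```python
def dopamine_gated_hebbian(weight: float, pre: float, post: float,
                           dopamine: float, lr: float = 0.1) -> float:
    """Return the updated synaptic weight under a hypothetical three-factor rule.

    The weight changes only when presynaptic activity, postsynaptic activity,
    and the dopamine signal are all nonzero.
    """
    return weight + lr * pre * post * dopamine


if __name__ == "__main__":
    w = 0.5
    # Cells fire together and dopamine is released: the connection strengthens.
    print(dopamine_gated_hebbian(w, pre=1.0, post=1.0, dopamine=1.0))
    # Cells fire together but no dopamine arrives: the connection is unchanged.
    print(dopamine_gated_hebbian(w, pre=1.0, post=1.0, dopamine=0.0))
```

Because the dopamine term multiplies the Hebbian term, a negative dopamine value in this sketch would weaken the connection, which is one simple way to represent punishment of a previous action.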

In some cases dopamine enables synaptic plasticity along the lines of a Hebbian rule, in a manner that is correlated with reward-seeking behavior (Reynolds et al.). In addition to its effects on long-term synaptic plasticity, dopamine can also exert immediate control over neural circuits by modulating neural spiking activity and synaptic connections between neurons (Surmeier et al.).

In order to motivate actions that lead to rewards, dopamine should be released during rewarding experiences. However, the pioneering studies of Wolfram Schultz showed that DA neuron responses are not triggered by reward consumption per se. Instead, they resemble a reward prediction error: the difference between the reward that is received and the reward that was predicted.

Thus, if a reward is larger than predicted, DA neurons are strongly excited (positive prediction error, Figure 1E, red); if a reward is smaller than predicted or fails to occur at its appointed time, DA neurons are phasically inhibited (negative prediction error, Figure 1E, blue); and if a reward is cued in advance so that its size is fully predictable, DA neurons have little or no response (zero prediction error, Figure 1C, black).

The same principle holds for DA responses to sensory cues that provide new information about future rewards.

DA neurons are excited when a cue indicates an increase in future reward value (Figure 1C, red), inhibited when a cue indicates a decrease in future reward value (Figure 1C, blue), and generally have little response to cues that convey no new reward information (Figure 1E, black).
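The pattern of responses to rewards and cues described above can be sketched with a temporal-difference style prediction error. This is a minimal illustration under stated assumptions: the function name, the discount factor `gamma`, and the numeric reward values are hypothetical and chosen only to reproduce the three qualitative cases (positive, negative, and zero error).

```python
def td_error(reward: float, next_value: float, current_value: float,
             gamma: float = 1.0) -> float:
    """Prediction error: reward obtained plus newly expected future reward,
    minus the reward that had been expected."""
    return reward + gamma * next_value - current_value


if __name__ == "__main__":
    # Reward outcomes: nothing further is expected afterwards (next_value = 0).
    bigger = td_error(reward=1.0, next_value=0.0, current_value=0.5)
    smaller = td_error(reward=0.0, next_value=0.0, current_value=0.5)
    predicted = td_error(reward=1.0, next_value=0.0, current_value=1.0)
    print(bigger, smaller, predicted)  # positive, negative, zero

    # Cues: no reward is delivered yet, but the cue changes the expected value.
    good_cue = td_error(reward=0.0, next_value=1.0, current_value=0.5)
    bad_cue = td_error(reward=0.0, next_value=0.0, current_value=0.5)
    no_news = td_error(reward=0.0, next_value=0.5, current_value=0.5)
    print(good_cue, bad_cue, no_news)  # positive, negative, zero
```

The same function covers both outcomes and cues because a cue carries no reward itself and contributes only through the change in expected future reward.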

Computational models using a temporal-difference (TD)-like reinforcement signal can explain many aspects of reinforcement learning in humans, animals, and DA neurons themselves (Sutton and Barto; Waelti et al.).

Figure 1. (A) Conventional theories of DA reward signals. These signals could be used for learning, to reinforce or punish previous actions (backward arrows), or for immediate control of behavior, to promote or suppress reward-seeking actions (forward arrows). (B-E) An example DA neuron with conventional coding of reward prediction errors as well as coding of the subjective preference for predictive information. Data are from Bromberg-Martin and Hikosaka. (B) This DA neuron was excited by a cue indicating that an informative cue would appear to tell the size of a future reward (red). (C) DA excitation by a big reward cue (red), inhibition by a small reward cue (blue), and no response to predictable reward outcomes (black). (D) This DA neuron was inhibited by a cue indicating that an uninformative cue would appear which would leave the reward size unpredictable (blue). (E) DA lack of response to uninformative cues (black), excitation by an unexpectedly big reward (red), and inhibition by an unexpectedly small reward (blue).

An impressive array of experiments has shown that DA signals represent reward predictions in a manner that closely matches behavioral preferences, including the preference for large rewards over small ones (Tobler et al.). There is even evidence that DA neurons in humans encode the reward value of money (Zaghloul et al.).

Furthermore, DA signals emerge during learning with a similar timecourse to behavioral measures of reward prediction (Hollerman and Schultz; Satoh et al.). These findings have established DA neurons as one of the best understood and most replicated examples of reward coding in the brain. As a result, recent studies have subjected DA neurons to intense scrutiny to discover how they generate reward predictions and how their signals act on downstream structures to control behavior.

Recent advances in understanding DA reward signals come from considering three broad questions: How do DA neurons learn reward predictions? How accurate are their predictions? And just what do they treat as rewarding?

Classic theories suggest that reward predictions are learned through a gradual reinforcement process requiring repeated stimulus-reward pairings (Rescorla and Wagner; Montague et al.).
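The gradual learning process described by these classic theories can be sketched as a simple error-driven update in the spirit of the Rescorla-Wagner rule, in which the prediction moves a fraction of the way toward the received reward on each pairing. The function name, the learning rate `alpha`, and the reward sequence below are assumptions chosen for illustration.

```python
from typing import List, Tuple


def learn_prediction(rewards: List[float], alpha: float = 0.2,
                     initial: float = 0.0) -> Tuple[List[float], float]:
    """Update a reward prediction after each stimulus-reward pairing.

    Returns the prediction held before each trial and the final prediction.
    """
    prediction = initial
    history = []
    for reward in rewards:
        history.append(prediction)
        # Move the prediction toward the reward by a fraction of the error.
        prediction += alpha * (reward - prediction)
    return history, prediction


if __name__ == "__main__":
    history, final = learn_prediction([1.0] * 20)
    print([round(v, 2) for v in history[:5]], round(final, 3))
```

With a small `alpha`, many pairings are needed before the prediction approaches the true reward value, which is the sense in which learning under this kind of rule is gradual.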

