Schropp2019 - Target-Mediated Drug Disposition Model for Bispecific Antibodies.

November 2019, the model of the month by Johannes Meyer
Original model: BIOMD0000000635

Background

Reinforcement learning is a form of associative learning in which an organism, drawing on prior experience, updates the salience of an environmental stimulus by associating it with a reward. Dopamine is crucial for reinforcement learning, with transients in the concentration of this neuromodulator mediating essential aspects of reward-related signalling. One such aspect is reward prediction error (RPE), which represents the difference between the expected and the obtained reward. RPE is usually represented by the activity of midbrain dopaminergic neurons (DANs) projecting onto the striatum. Salient cues or unexpected rewards (a positive RPE) increase DAN firing, whereas the omission of an expected reward (a negative RPE) decreases DAN activity [1]. Conveyance of this signalling within the neuron is thought to be enabled by a brief pause in the firing of tonically active cholinergic interneurons, as well as by longer-term changes in the concentrations of neuromodulators such as adenosine.

Medium spiny neurons (MSNs), important in mediating plasticity, are replete with adenylyl cyclase (AC)-coupled GPCRs sensitive to the concentrations of these neuromodulators. There are two types of MSN: striatonigral MSNs expressing D1 receptors (D1R) coupled to Golf (D1+ MSNs), and those expressing D2 receptors (D2R) coupled to Gi/o (D2+ MSNs). Striatal D1R is usually co-expressed with muscarinic M4 acetylcholine receptors (M4R), which are coupled to Gi/o, and D2R is typically co-expressed with adenosine A2a receptors (A2aR), which are coupled to Golf [2].

Relatively little is known about how transient changes in the concentrations of these neuromodulators affect cytosolic cAMP/PKA signalling via Golf- and Gi/o-coupled GPCRs. The present model, published by Nair et al. [3], takes a quantitative kinetic approach to the relationship between these neuromodulators and the subsequent intracellular signalling.

Model

The authors presented two structurally similar but functionally distinct models, each intended to study a specific receptor pair (Fig. 1). The first model examines the interaction between D1R and M4R in D1+ MSNs, while the second studies the interaction between D2R and A2aR in D2+ MSNs. The D1+ MSN model takes two inputs: the concentrations of dopamine (DA) and acetylcholine (ACh). The D2+ MSN model takes DA and adenosine (Adn) as inputs. Different input patterns were used for each model. For the D1+ MSN model, DA was held at a low tonic basal level or given a transient increase (burst), and ACh was given a temporary dip. The D2+ MSN model used two patterns for each input: a transient increase or a dip for DA, and a tonic basal level or a burst for Adn. In both models, the inputs are transduced through GPCRs and converge on AC5 and the downstream cAMP/PKA cascade.

Figure 1. Structure of the model. Reproduced from [3]

Results

The D1+ MSN model used the binding of DA and ACh to their cognate receptors as inputs, with DA ultimately increasing and ACh ultimately decreasing the level of intracellular cAMP and the subsequent PKA activity. A transient peak in DA concentration in the presence of high tonic ACh (100 nM) was insufficient to cause significant PKA activation. However, when an ACh dip and a DA peak occurred simultaneously, high PKA activity was achieved (Fig. 2). Furthermore, the amplitudes of both the ACh dip and the DA peak contributed to the synergy with which PKA stimulation was achieved, with the ACh dip becoming more important at higher synergies. This suggests that the two signalling pathways act as an AND gate, filtering out brief DA pulses that are not accompanied by a concomitant decrease in ACh.
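
The AND-gate idea can be illustrated with a deliberately simplified toy calculation. The sketch below is not the kinetic model of Nair et al.: it assumes simple equilibrium receptor occupancy, treats AC5 drive as Golf-side occupancy multiplied by the fraction of Gi/o-side receptors left unoccupied, and uses hypothetical dissociation constants (`KD_D1`, `KD_M4`) chosen only for illustration. The ligand concentrations are those quoted for the D1+ MSN inputs.

```python
# Toy illustration of AND-gate behaviour; NOT the published kinetic model.

def occupancy(ligand_nM, kd_nM):
    """Equilibrium fractional receptor occupancy for simple 1:1 binding."""
    return ligand_nM / (ligand_nM + kd_nM)

# Hypothetical dissociation constants (nM), chosen for illustration only.
KD_D1 = 500.0
KD_M4 = 10.0

def ac5_drive(da_nM, ach_nM):
    """Assumed form: stimulation via D1R scaled by relief from M4R inhibition."""
    return occupancy(da_nM, KD_D1) * (1.0 - occupancy(ach_nM, KD_M4))

# (DA, ACh) concentrations in nM; 1 pM is written as 1e-3 nM.
conditions = {
    "basal": (10.0, 100.0),
    "DA peak only": (1500.0, 100.0),
    "ACh dip only": (10.0, 1e-3),
    "DA peak + ACh dip": (1500.0, 1e-3),
}

drive = {name: ac5_drive(da, ach) for name, (da, ach) in conditions.items()}

for name, value in drive.items():
    print(f"{name:>18}: {value:.4f}")
```

Under these assumptions the drive is large only when both conditions hold, because either a low DA occupancy or a high ACh occupancy pulls the product towards zero.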

Figure 2. Normalised active PKA levels in D1+ MSNs produced in response to different patterns of dopamine and acetylcholine input. The DA peak signal with a duration of 1 second increases from a basal 10 nM to a maximum of 1500 nM; the ACh dip signal with a duration of 0.4 seconds decreases from a basal 100 nM to a minimum of 1 pM. The y-axis is normalised to a steady-state level of active PKA produced in response to saturating D1R agonist in the presence of basal ACh.
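
The input signals described in the caption can be sketched as simple functions of time. The block below is a minimal sketch that assumes rectangular (square) pulses and a shared, hypothetical onset time `ONSET`; the caption gives only the basal levels, extreme values and durations, so the actual waveform shape used by the authors may differ.

```python
# Sketch of the D1+ MSN input signals, assuming rectangular pulses.

def rectangular_pulse(t, basal, extreme, onset, duration):
    """Concentration (nM) at time t (s): `extreme` during the pulse, else `basal`."""
    if onset <= t < onset + duration:
        return extreme
    return basal

ONSET = 1.0  # hypothetical onset time in seconds; not given in the caption

def dopamine(t):
    """DA peak: basal 10 nM rising to 1500 nM for 1 s."""
    return rectangular_pulse(t, basal=10.0, extreme=1500.0, onset=ONSET, duration=1.0)

def acetylcholine(t):
    """ACh dip: basal 100 nM falling to 1 pM (1e-3 nM) for 0.4 s."""
    return rectangular_pulse(t, basal=100.0, extreme=1e-3, onset=ONSET, duration=0.4)

for t in (0.0, 1.2, 1.6, 2.5):
    print(f"t = {t:.1f} s  DA = {dopamine(t):g} nM  ACh = {acetylcholine(t):g} nM")
```

Functions of this kind could be sampled on a time grid and supplied as forcing inputs to a simulator; how the inputs are wired into the published model files is not covered here.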

Like D1+ MSNs, D2+ MSNs are capable of detecting changes in extracellular DA, in this case via D2R. However, D2R is coupled to Gi/o, which inhibits AC5 and the subsequent cAMP production. The D2+ MSN model allowed the investigation of the effects of transient changes in extracellular DA on PKA activation in D2+ MSNs, a relationship about which comparatively little is known. Because a high-affinity fraction of D2R is already stimulated at the basal DA level, a transient increase in DA did not produce a significant decrease in PKA level. Conversely, a temporary dip in extracellular DA concentration yielded a marked increase in PKA activation, an effect that was more pronounced when accompanied by a simultaneous increase in extracellular Adn concentration (Fig. 3). Although this is evidence of synergism between concomitant DA and Adn signals, the relationship is less synergistic than in D1+ MSNs because a DA dip alone can cause significant PKA activation.
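
The asymmetry between a DA peak and a DA dip can be illustrated with a toy equilibrium calculation. This is a simplified sketch, not the kinetic model of Nair et al.: it assumes 1:1 receptor occupancy, takes AC5 drive as A2aR occupancy multiplied by the unoccupied fraction of high-affinity D2R, and uses hypothetical dissociation constants (`KD_D2_HIGH`, `KD_A2A`). The DA peak value of 1500 nM is likewise an assumption borrowed from the D1+ input description.

```python
# Toy illustration of the D2+ MSN response asymmetry; NOT the published model.

def occupancy(ligand_nM, kd_nM):
    """Equilibrium fractional receptor occupancy for simple 1:1 binding."""
    return ligand_nM / (ligand_nM + kd_nM)

# Hypothetical dissociation constants (nM), chosen for illustration only.
KD_D2_HIGH = 1.0
KD_A2A = 500.0

def ac5_drive(da_nM, adn_nM):
    """Assumed form: stimulation via A2aR scaled by relief from D2R inhibition."""
    return occupancy(adn_nM, KD_A2A) * (1.0 - occupancy(da_nM, KD_D2_HIGH))

# (DA, Adn) concentrations in nM; 1 pM is written as 1e-3 nM.
conditions = {
    "basal": (10.0, 150.0),
    "DA peak only": (1500.0, 150.0),  # peak value is an assumption
    "DA dip only": (1e-3, 150.0),
    "Adn peak only": (10.0, 1000.0),
    "DA dip + Adn peak": (1e-3, 1000.0),
}

drive = {name: ac5_drive(da, adn) for name, (da, adn) in conditions.items()}

for name, value in drive.items():
    print(f"{name:>18}: {value:.4f}")
```

In this toy setting the high-affinity D2R is already largely occupied at basal DA, so the drive sits near its floor and a DA peak has little room to lower it in absolute terms, whereas a DA dip alone releases most of the inhibition and an Adn peak adds to that effect.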

Figure 3. Normalised active PKA levels in D2+ MSNs produced in response to different patterns of dopamine and adenosine input. The DA dip signal with a duration of 1 second decreases from a basal 10 nM to a minimum of 1 pM; the Adn peak signal increases from a basal 150 nM to a maximum of 1000 nM. The y-axis is normalised to a steady-state level of active PKA produced by saturating A2aR agonist in the presence of basal DA.

Conclusion

Given that PKA is activated by a DA peak in D1+ MSNs but by a DA dip in D2+ MSNs, the evidence suggests that these two pathways mediate positive and negative RPE, respectively. While the G-protein system is similarly configured in the two MSN types (Gi/o with high basal activation, coupled with gating of Golf by a dip in a neuromodulator), the opposite functionality can be explained by the different intracellular G-protein mediators activated when DA binds to its cognate receptor.

Additionally, a dip in the neuromodulator normally mediating basal inhibition of AC5 acts as a highly dynamic gate via fast inactivation of Gi/o protein, emphasising the important role that GAP proteins may play in MSNs and reinforcement learning.

Beyond the implications for reinforcement learning, the basal inhibition exerted by Gi/o upon AC and the gating of Golf signalling by a transient dip in Gi/o could be viewed as a more general means whereby random, spurious Golf signal ‘noise’ could be filtered out, something which could well be further investigated and developed for cell types other than MSNs.

References

  1. Schultz W. 1998. Predictive reward signal of dopamine neurons. J Neurophysiol 80:1–27.
  2. Ince E, Ciliax BJ, Levey AI. 1997. Differential expression of D1 and D2 dopamine and m4 muscarinic acetylcholine receptor proteins in identified striatonigral neurons. Synapse 27:357–66.
  3. Nair AG, Gutierrez-Arenas O, Eriksson O, Vincent P, Hellgren Kotaleski J. 2015. Sensing Positive versus Negative Reward Signals through Adenylyl Cyclase-Coupled GPCRs in Direct and Indirect Pathway Striatal Medium Spiny Neurons. J Neurosci 35:14017–30.