Inferring synaptic plasticity rules from spike counts

In last week’s computational & theoretical neuroscience journal club I presented the following paper from Nicolas Brunel’s group:

Inferring learning rules from distributions of firing rates in cortical neurons.
Lim, McKee, Woloszyn, Amit, Freedman, Sheinberg, & Brunel.
Nature Neuroscience (2015).

The paper seeks to explain experience-dependent changes in IT cortical responses in terms of an underlying synaptic plasticity rule. A basic observation about IT is that novel stimuli tend to elicit a moderate distribution of spike counts, while familiar stimuli elicit more extreme responses: most stimuli elicit a weaker response but a few stimuli elicit a more vigorous response. (In excitatory neurons at least!) In other words, IT responses become more selective as stimuli become more familiar. Here’s a plot illustrating the phenomenon for an example IT neuron, showing marginal response distributions to a collection of novel and familiar stimuli. Note that the blue curve (familiar stimuli) is mostly shifted to the left of the red curve (novel stimuli), except for a small blip at the right edge, indicating a handful of familiar stimuli to which the cell responds a lot:


Why does this happen? Lim et al. seek to explain it in terms of changes in recurrent synaptic weights within IT. The basic setup is the beloved recurrent nonlinear rate model:

r_i = \Phi(h_i)                           (spike rate of i'th neuron)
h_i = I_{iX} + \sum_j W_{ij} r_j          (input to i'th neuron)

where we have:

\Phi(\cdot)  — transfer function, or “input-output nonlinearity”
I_{iX} —  feedforward input to neuron i from stimulus X.
W_{ij} — synaptic weight connecting neuron j to neuron i.
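
Here is a minimal sketch of what solving this model looks like numerically. The logistic form of \Phi, the three-neuron network, the weights, and the names phi, steady_state_rates, and I_X are all assumptions made for illustration; none of them come from the paper.

```python
import math

def phi(h):
    """Transfer function. A logistic curve is an assumption made for this sketch."""
    return 1.0 / (1.0 + math.exp(-h))

def steady_state_rates(I, W, n_iter=500, dt=0.1):
    """Relax rates toward the fixed point r_i = phi(I_i + sum_j W_ij r_j)."""
    n = len(I)
    r = [0.0] * n
    for _ in range(n_iter):
        # Total input to each neuron: feedforward drive plus recurrent drive.
        h = [I[i] + sum(W[i][j] * r[j] for j in range(n)) for i in range(n)]
        # Small step toward phi(h) so the iteration settles smoothly.
        r = [r[i] + dt * (phi(h[i]) - r[i]) for i in range(n)]
    return r

# Hypothetical feedforward inputs for one stimulus X and weak recurrent weights.
I_X = [0.5, -0.5, 1.0]
W = [[0.0, 0.2, -0.1],
     [0.2, 0.0, 0.1],
     [-0.1, 0.1, 0.0]]
rates = steady_state_rates(I_X, W)
```

With weak recurrent weights like these the iteration settles quickly; with strong recurrence a fixed point need not be unique or stable, so this should be read as a toy rather than a general solver.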

Lim et al. derive update rules for W_{ij} from the measured spike count distributions P_{novel}(R) and P_{familiar}(R) (i.e., the curves in the figure above). If you understand how to transform probability densities into each other, then you know all the math you need to understand how this works.

The basic fact we need from probability is the change-of-variables formula for transforming one probability density into another. If I hand you a density f_R(R) and a differentiable, monotonically increasing function \Phi(\cdot), then the density f_h of the input h for which the output R = \Phi(h) has the desired density f_R is given by:

f_h(h) = f_R(\Phi(h)) \Phi' (h)
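
A quick numerical sanity check of this formula, under assumptions chosen purely for convenience: take \Phi(h) = exp(h) and let h be standard normal, so that R is log-normal with a known density. Plugging that density into the formula should give back the standard normal. The function names below are mine.

```python
import math

def phi(h):
    """Assumed transfer function for this check: R = exp(h)."""
    return math.exp(h)

def dphi(h):
    """Derivative of the assumed transfer function."""
    return math.exp(h)

def f_R(r):
    """Density of R = exp(h) when h ~ N(0, 1), i.e. the standard log-normal."""
    return math.exp(-0.5 * math.log(r) ** 2) / (r * math.sqrt(2.0 * math.pi))

def f_h(h):
    """Change of variables: f_h(h) = f_R(phi(h)) * phi'(h)."""
    return f_R(phi(h)) * dphi(h)

def std_normal_pdf(h):
    return math.exp(-0.5 * h * h) / math.sqrt(2.0 * math.pi)

# Compare the two densities on a grid from -2 to 2.
grid = [-2.0 + 0.1 * k for k in range(41)]
max_err = max(abs(f_h(h) - std_normal_pdf(h)) for h in grid)
```

Here max_err comes out at the level of floating-point rounding, which is all the formula promises: the factor \Phi'(h) exactly compensates for the stretching of the axis.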

Lim et al. exploit this fact to infer the distribution of inputs for novel and familiar stimuli, P_{novel}(h) and P_{familiar}(h), given \Phi(\cdot) and measured P_{novel}(R) and P_{familiar}(R). Then, they take the difference of the inferred distributions, P_{novel}(h) - P_{familiar}(h), to obtain the change in the amount of input to the neuron for each h value, which can be interpreted as a change in input for each R value (because R = \Phi(h)). Plotting P_{novel}(h) - P_{familiar}(h) as a function of R gives them the following curve (for this same example neuron):


This can be interpreted as a post-synaptic plasticity rule f_{post}(R), which tells us how to change the weights as a function of the post-synaptic neuron’s firing rate. The derivation requires a little bit of math, and an assumption that the full plasticity rule is separable, i.e., can be written as a product of functions of the pre- and post-synaptic firing rates: \Delta W_{ij} = f_{post}(R_i) f_{pre}(R_j).
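
To make the separability assumption concrete, here is a small sketch of what such an update looks like as a matrix. The specific forms of f_post and f_pre, the learning rate eta, and the example rates are hypothetical choices of mine; only the product structure comes from the text.

```python
THETA = 30.0  # sp/s, crossover point of the example neuron's rule

def f_post(r):
    """Assumed post-synaptic factor: negative below THETA, positive above."""
    return (r - THETA) / THETA

def f_pre(r):
    """Assumed pre-synaptic factor: grows with the pre-synaptic rate."""
    return r / THETA

def delta_w(rates, eta=0.01):
    """Separable update: dW[i][j] = eta * f_post(R_i) * f_pre(R_j)."""
    return [[eta * f_post(ri) * f_pre(rj) for rj in rates] for ri in rates]

rates = [10.0, 25.0, 45.0]  # hypothetical responses to one stimulus, in sp/s
dW = delta_w(rates)
```

Because the update is an outer product, every row of dW has the same sign pattern: all synapses onto a neuron firing below THETA weaken, and all synapses onto a neuron firing above it strengthen, scaled by how active each input was.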

But the intuition should be easy enough to get even without the math: it’s simply that, for this neuron, if your distribution of firing rates for novel stimuli is the red curve (top figure), and your goal is to transform it into the blue one as this set of stimuli becomes familiar, do this: for any stimulus that gives you less than 30 sp/s, weaken the active synapses (by some small amount that depends on the presynaptic rate of the inputs); for any stimulus that gives you more than 30 sp/s, strengthen the active synapses. Thus, stimuli that initially drove you at >30 sp/s will end up driving you even more; stimuli that initially drove you at <30 sp/s will eventually drive you even less. This should eventually produce the blue curve from the red one, right?
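
That intuition can be played out in a toy simulation. This is a deliberately stripped-down sketch, not the paper's network: I collapse the pre-synaptic factor into a constant, so that the rule acts directly on each stimulus's total input h, and I assume a sigmoidal \Phi that saturates at 60 sp/s and crosses 30 sp/s at h = 0. All names and parameter values are made up for illustration.

```python
import math
import random

THETA = 30.0  # sp/s crossover, from the example neuron in the text
R_MAX = 60.0  # assumed saturation rate of the transfer function

def phi(h):
    """Assumed sigmoidal transfer function with phi(0) = THETA."""
    return R_MAX / (1.0 + math.exp(-h))

def f_post(r):
    """Assumed post-synaptic factor: depress below THETA, potentiate above."""
    return (r - THETA) / R_MAX

def familiarize(h_novel, eta=0.05, n_steps=20):
    """Repeatedly nudge each stimulus's input according to the current rate."""
    h = list(h_novel)
    for _ in range(n_steps):
        h = [x + eta * f_post(phi(x)) for x in h]
    return h

rng = random.Random(0)
h_novel = [rng.gauss(-0.5, 1.0) for _ in range(200)]  # hypothetical inputs
h_familiar = familiarize(h_novel)
r_novel = [phi(x) for x in h_novel]
r_familiar = [phi(x) for x in h_familiar]
```

In this toy, every stimulus that started below 30 sp/s ends up lower and every stimulus that started above ends up higher, and the rank order of stimuli is preserved. Nothing in it makes the process stop, though: raising n_steps just keeps pushing the two groups apart.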

(Hmm, I admit I don’t fully get why the response distribution should stabilize and stop changing once it gets to the blue curve. Wouldn’t the rule keep changing the response distribution to become even more extreme? I’m not entirely sure what prevents that. If the paper gave an explanation then I’m afraid I missed it. Or maybe it doesn’t stop! I guess it’s an empirical question what happens as stimuli go from familiar to super familiar.)

The paper estimates the plasticity rule for every neuron, and shows that the majority of excitatory neurons have plasticity rules that look like the one shown above, but that (putatively) inhibitory neurons have purely negative plasticity rules. (This accords with the fact that the firing rate distributions for inhibitory neurons shift entirely toward lower rates, with no increasing blip on the right tail.) They also show that you can simulate a population of neurons with these plasticity rules, and the simulated population will behave just like the real neurons (i.e., in terms of the effects on the response distribution for familiar vs. novel stimuli).

Overall, it’s a very nice paper that takes some very simple math (with a simple intuition behind it) and uses it to derive a connection between stuff we can measure extracellularly and a theory / prediction about underlying biophysical mechanisms.

A Few Technical Comments:

  • Technically, the paper didn’t do quite what I described above. Instead of assuming a fixed nonlinearity \Phi(\cdot) mapping input h to R, they assumed that the distribution of inputs from novel stimuli, P_{novel}(h), was Gaussian, and used that to estimate \Phi(\cdot) for each neuron. Then they used the estimated \Phi(\cdot) to compute P_{familiar}(h) from the measured P_{familiar}(R). But it’s the same basic idea: the three objects P(h), \Phi(\cdot), and P(R) are connected, and if I tell you any two of them you can infer the third. Personally it seems a bit strange to me that the authors decided to take this approach: it assumes something about the distribution of synaptic input currents for novel stimuli, and uses this to derive a unique nonlinearity for each neuron. I would have thought it more natural to assume that all neurons share a common nonlinearity (after all, we have some theory about what these nonlinearities should look like for various spike models, e.g., the “escape rate approximation” from Plesser & Gerstner 2000 or, more recently, Aviel & Gerstner 2006), and to use that to infer the input distributions for both sets of stimuli, rather than assuming we know one of them a priori. I don’t regard this as crucial, though: the supplement shows that the results don’t change that much qualitatively if you assume different distributions for P_{novel}(h).
  • If we want to find something else to pick on, the main thing I’d single out is the assumption that all the plasticity lives here in IT cortex. That is, under this model, the change in response distributions to novel vs. familiar stimuli is entirely due to changes in recurrent weights in IT. It assumes that the inputs to IT don’t change at all when stimuli go from novel to familiar. But I’d be surprised if this were true. (In fact someone must know this: do V4 response distributions differ for novel vs. familiar stimuli?) So it would seem more plausible that there’s some change in the input to IT in addition to changes in the recurrent weights in IT, and that the effects we observe result from a combination of these two changes. Of course, I think it’s still a very interesting exercise to do what this paper did, which is to show what the learning rule should be if we attribute all the changes to the recurrent weights in IT. And it opens the door to an extension of the theory in order to integrate measurements from V4 (or elsewhere) with the simple framework described here.
  • Some other assumptions we could worry about: are the synaptic plasticity rules really well described as separable? More importantly, does the rank-ordering of stimuli really not change as they go from novel to familiar? (That is, the paper assumes that, for the cell shown above, all stimuli that initially elicit a response of more than 30 sp/s will have that response strengthened, while all stimuli below 30 sp/s will have it weakened.) This is an empirical claim that could be checked experimentally with longer recordings. If there’s significant jumbling in the rank ordering of responses over the course of exposure, then we can infer something more complicated is going on. So, lots of interesting follow-up questions to ask!
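
The "any two give you the third" point from the first comment can be sketched with quantile matching: if the novel inputs are assumed to be standard Gaussian, then pairing sorted novel rates with Gaussian quantiles traces out \Phi, and inverting that curve on the familiar rates gives the familiar inputs. This is my reading of the idea rather than the paper's exact estimator, and the data below are synthetic placeholders (the familiar rates are just rescaled novel rates so the example runs).

```python
import bisect
import random
from statistics import NormalDist

def estimate_phi(novel_rates):
    """Pair standard-normal quantiles (h) with sorted novel rates (R)."""
    rs = sorted(novel_rates)
    n = len(rs)
    std_normal = NormalDist(0.0, 1.0)
    hs = [std_normal.inv_cdf((k + 0.5) / n) for k in range(n)]
    return hs, rs

def phi_inverse(r, hs, rs):
    """Map a rate back to an input by linear interpolation between knots."""
    if r <= rs[0]:
        return hs[0]
    if r >= rs[-1]:
        return hs[-1]
    k = bisect.bisect_right(rs, r)  # guarantees rs[k - 1] <= r < rs[k]
    t = (r - rs[k - 1]) / (rs[k] - rs[k - 1])
    return hs[k - 1] + t * (hs[k] - hs[k - 1])

rng = random.Random(1)
novel_rates = [rng.gammavariate(2.0, 5.0) for _ in range(500)]  # synthetic
familiar_rates = [0.8 * r for r in novel_rates]                 # placeholder

hs, rs = estimate_phi(novel_rates)
h_novel = [phi_inverse(r, hs, rs) for r in novel_rates]
h_familiar = [phi_inverse(r, hs, rs) for r in familiar_rates]
```

Note that rates outside the range of the novel data are clamped to the end knots, which is a crude choice; real data with a heavier familiar tail would need some extrapolation of \Phi.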
