Inferring synaptic plasticity rules from spike counts

In last week’s computational & theoretical neuroscience journal club I presented the following paper from Nicolas Brunel’s group:

Inferring learning rules from distributions of firing rates in cortical neurons.
Lim, McKee, Woloszyn, Amit, Freedman, Sheinberg, & Brunel.
Nature Neuroscience (2015).

The paper seeks to explain experience-dependent changes in IT cortical responses in terms of an underlying synaptic plasticity rule. A basic observation about IT is that novel stimuli tend to elicit a moderate distribution of spike counts, while familiar stimuli elicit more extreme responses: most stimuli evoke a weaker response, but a few evoke a more vigorous one (in excitatory neurons, at least). In other words, IT responses become more selective as stimuli become more familiar. Here’s a plot illustrating the phenomenon for an example IT neuron, showing marginal response distributions for a collection of novel and familiar stimuli. Note that the blue curve (familiar stimuli) is mostly shifted to the left of the red curve (novel stimuli), except for a small blip at the right edge, indicating a handful of familiar stimuli to which the cell responds strongly:

Why does this happen? Lim et al seek to explain it in terms of changes in recurrent synaptic weights within IT. The basic setup is the beloved recurrent nonlinear rate model:

$r_i = \Phi(h_i)$                           (spike rate of $i$-th neuron)
$h_i = I_{iX} + \sum_j W_{ij} r_j$          (input to $i$-th neuron)

where we have:

$\Phi(\cdot)$  — transfer function, or “input-output nonlinearity”
$I_{iX}$ —  feedforward input to neuron i from stimulus $X$.
$W_{ij}$ — synaptic weight connecting neuron j to neuron i.
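To make the setup concrete, here’s a minimal sketch of this rate model in Python. All the particulars — the softplus transfer function, the network size, the weight scale — are my own hypothetical choices, not taken from the paper. With weak recurrent weights, the fixed point $r = \Phi(I_X + Wr)$ can be found by simple iteration:

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(h):
    # Hypothetical transfer function: a softplus "input-output nonlinearity".
    return np.logaddexp(0.0, h)

n = 50
# Weak random recurrent weights, so fixed-point iteration is a contraction.
W = rng.normal(scale=0.1 / np.sqrt(n), size=(n, n))
I_X = rng.normal(size=n)   # feedforward input for one stimulus X

# Solve r = phi(I_X + W r) by fixed-point iteration.
r = np.zeros(n)
for _ in range(200):
    r = phi(I_X + W @ r)

# How close are we to a self-consistent solution?
residual = np.max(np.abs(r - phi(I_X + W @ r)))
```

With stronger recurrent weights the iteration can fail to converge, which is why the weight scale is kept small here.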

Lim et al derive update rules for $W_{ij}$ from the measured spike count distributions $P_{novel}(R)$ and $P_{familiar}(R)$ (i.e., the curves in the figure above). If you understand how to transform probability densities into each other then you know all the math you need to understand how this works.

The basic fact we need from probability is the change-of-variables formula for transforming one probability density into another. Suppose I hand you a density $f_R(R)$ and a monotonically increasing function $\Phi(\cdot)$. Then the density governing $h$, the input to $\Phi(\cdot)$, such that the output $R = \Phi(h)$ has the desired density $f_R$, is given by:

$f_h(h) = f_R(\Phi(h)) \Phi' (h)$
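As a quick numerical sanity check on this formula (my own sketch, not from the paper): take $\Phi(h) = e^h$, so that if $R$ has a standard lognormal density, the recovered $f_h$ should be a standard Gaussian.

```python
import numpy as np

def f_R(R):
    # Density of R: standard lognormal (stands in for a measured rate density).
    return np.exp(-np.log(R) ** 2 / 2) / (R * np.sqrt(2 * np.pi))

# With Phi(h) = exp(h), the change-of-variables formula
#   f_h(h) = f_R(Phi(h)) * Phi'(h)
# should recover a standard Gaussian density for h.
h = np.linspace(-3.0, 3.0, 201)
f_h = f_R(np.exp(h)) * np.exp(h)

gaussian = np.exp(-h ** 2 / 2) / np.sqrt(2 * np.pi)
max_err = np.max(np.abs(f_h - gaussian))
```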

Lim et al exploit this fact to infer the distributions of inputs for novel and familiar stimuli, $P_{novel}(h)$ and $P_{familiar}(h)$, given $\Phi(\cdot)$ and the measured $P_{novel}(R)$ and $P_{familiar}(R)$. Then they take the difference of the inferred distributions, $P_{novel}(h) - P_{familiar}(h)$, to obtain the change in the amount of input to the neuron at each $h$ value, which can be interpreted as a change in input at each $R$ value (because $R = \Phi(h)$). Plotting $P_{novel}(h) - P_{familiar}(h)$ as a function of $R$ gives them the following curve (for this same example neuron):

This can be interpreted as a post-synaptic plasticity rule $f_{post}(R)$ which tells us how to change the weights as a function of the post-synaptic neuron’s firing rate. The derivation requires a little bit of math (and an assumption that the full plasticity rule is separable, i.e., can be written as a product of functions of the pre- and post-synaptic spike rates: $\Delta W_{ij} = f_{post}(R_i) f_{pre}(R_j)$).
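Here’s a sketch of what “separable” buys you computationally. The $f_{pre}$ and $f_{post}$ below are made up; only the qualitative shape of $f_{post}$ — depression below ~30 sp/s, potentiation above — follows the paper. The full weight update is just an outer product of the two factors:

```python
import numpy as np

def f_post(r, theta=30.0):
    # Made-up post-synaptic factor: negative (depression) below theta,
    # positive (potentiation) above it, like the shape of the inferred rule.
    return 0.01 * (r - theta)

def f_pre(r):
    # Made-up pre-synaptic factor: proportional to the presynaptic rate.
    return r / 100.0

rng = np.random.default_rng(1)
rates = rng.uniform(0.0, 60.0, size=5)   # rates R_1..R_5 during one stimulus

# Separable rule: Delta W_ij = f_post(R_i) * f_pre(R_j) is an outer product.
dW = np.outer(f_post(rates), f_pre(rates))
```

Each row of `dW` shares the sign of its post-synaptic factor: synapses onto a neuron firing above the crossover all potentiate, those onto a neuron firing below it all depress.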

But the intuition should be easy enough to get even without the math. For this neuron, if your distribution of firing rates for novel stimuli is the red curve (top figure), and your goal is to transform it into the blue one as this set of stimuli becomes familiar, do the following: for any stimulus that drives you at less than 30 sp/s, weaken the active synapses (by some small amount that depends on the presynaptic rates of the inputs); for any stimulus that drives you at more than 30 sp/s, strengthen them. Thus, stimuli that initially drove you at >30 sp/s will end up driving you even more, and stimuli that initially drove you at <30 sp/s will eventually drive you even less. This should eventually produce the blue curve from the red one, right?
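Here’s a toy simulation of that intuition. This is entirely my own caricature: it nudges each stimulus’s rate directly rather than updating synapses, and the 30 sp/s crossover and distribution parameters are made up. Responses below the crossover get weaker and responses above it get stronger, so a moderate “novel” rate distribution spreads out into a more selective one:

```python
import numpy as np

rng = np.random.default_rng(2)
theta, eta = 30.0, 0.05   # made-up crossover rate and learning rate

# "Novel" responses to 200 stimuli: a moderate, unimodal rate distribution.
r_novel = rng.gamma(shape=4.0, scale=5.0, size=200)   # mean ~20 sp/s

# Caricature of the rule: each exposure pushes the response away from the
# crossover point (rates clipped at zero).
r = r_novel.copy()
for _ in range(20):
    r = np.clip(r + eta * (r - theta), 0.0, None)

# Most responses end up weaker, a few end up stronger: more selective.
frac_weaker = np.mean(r < r_novel)
```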

(Hmm, I admit I don’t fully get why the response distribution should stabilize and stop changing once it reaches the blue curve — wouldn’t the rule keep pushing the response distribution to become even more extreme? I’m not entirely sure what prevents that. If the paper gave an explanation then I’m afraid I missed it. Or maybe it doesn’t stop! I guess it’s an empirical question what happens as stimuli go from familiar to super familiar.)

The paper estimates the plasticity rule for every neuron, and shows that the majority of excitatory neurons have plasticity rules that look like the one shown above, but that (putatively) inhibitory neurons have purely negative plasticity rules. (This accords with the fact that the firing rate distributions for inhibitory neurons shift entirely toward lower rates, with no blip of increased responses in the right tail.) They also show that you can simulate a population of neurons with these plasticity rules, and the simulated population will behave just like the real neurons (i.e., in terms of the effects on the response distribution for familiar vs. novel stimuli).

Overall, it’s a very nice paper that takes some very simple math (with a simple intuition behind it) and uses it to derive a connection between stuff we can measure extracellularly and a theory / prediction about underlying biophysical mechanisms.

• Technically, the paper didn’t do quite what I described above. Instead of assuming a fixed nonlinearity $\Phi(\cdot)$ mapping input $h$ to $R$, they assumed that the distribution of inputs from novel stimuli $P_{novel}(h)$ was Gaussian, and used that to estimate $\Phi(\cdot)$ for each neuron. Then they used the estimated $\Phi(\cdot)$ to compute $P_{familiar}(h)$ from the measured $P_{familiar}(R)$. But it’s the same basic idea. Basically, the three objects $P(h)$, $\Phi(\cdot)$, and $P(R)$ are connected: if I tell you any two of them, you can infer the third. Personally, it seems a bit strange to me that the authors took this approach: it assumes something about the distribution of synaptic input currents for novel stimuli, and uses this to derive a unique nonlinearity for each neuron. I would have thought it more natural to assume that all neurons share a common nonlinearity (after all, we have some theory about what these nonlinearities should look like for various spiking models, e.g., the “escape rate approximation” from Plesser & Gerstner 2000, or more recently, Aviel & Gerstner 2006), and use that to infer the input distributions for both sets of stimuli, rather than assuming we know one of them a priori. I don’t regard this as crucial — the supplement shows that the results don’t change much qualitatively if you assume different distributions for $P_{novel}(h)$.
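To illustrate the approach they actually took, here’s a sketch of estimating $\Phi(\cdot)$ by quantile matching under the Gaussian-input assumption. The specific nonlinearity, scaling, and sample size below are hypothetical, used only to check that the recipe recovers a known $\Phi$: if $P_{novel}(h)$ is standard normal, then $\Phi(h)$ is the rate quantile at the same cumulative probability level as $h$.

```python
import numpy as np
from math import erf

rng = np.random.default_rng(3)

def phi_true(h):
    # Hypothetical ground-truth nonlinearity (scaled softplus), used only
    # to generate fake "measured" rates and check the recipe against it.
    return 10.0 * np.logaddexp(0.0, h)

h_novel = rng.normal(size=100_000)   # assumed-Gaussian inputs, novel stimuli
R_novel = phi_true(h_novel)          # the rates the experimenter would measure

def norm_cdf(h):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(h / np.sqrt(2.0)))

def phi_hat(h):
    # Quantile matching: Phi(h) is the empirical rate quantile at the same
    # cumulative probability level as h under the assumed Gaussian input.
    return np.quantile(R_novel, norm_cdf(h))

err = abs(phi_hat(1.0) - phi_true(1.0))
```

Given this estimated $\Phi$, one can then invert it on the measured familiar-rate distribution to get $P_{familiar}(h)$, as described above.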