Exact solutions to the nonlinear dynamics of learning in deep linear neural networks

This week in lab meeting we discussed:

Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
Andrew M. Saxe, James L. McClelland, and Surya Ganguli. arXiv (2013).

This work takes a first step toward analyzing a gnawing question in machine learning: how do deep neural networks actually work?

Continue reading

Deep Exponential Families

In this week’s lab meeting, I presented:

Deep Exponential Families
Rajesh Ranganath, Linpeng Tang, Laurent Charlin and David Blei.
http://arxiv.org/abs/1411.2581

This paper describes Deep Exponential Families (DEFs), a class of latent variable models inspired by deep neural networks and hierarchical generative models. DEFs stack multiple layers of exponential-family variables and connect them with link functions to capture a hierarchy of dependencies.
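
To make the layered structure concrete, here is a minimal sketch of ancestral sampling from a two-layer DEF-style model, assuming gamma latent layers linked through their means and a Poisson observation layer (one of the configurations the paper considers). The layer sizes, shape parameters, and weight priors below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer sizes: top latents, bottom latents, observed dimension.
K2, K1, V = 5, 20, 100

# Nonnegative weights connecting layers (so the gamma means stay valid).
W1 = rng.gamma(1.0, 1.0, size=(K2, K1))   # links layer 2 -> layer 1
W0 = rng.gamma(1.0, 1.0, size=(K1, V))    # links layer 1 -> observations

# Top layer: drawn from its prior.
z2 = rng.gamma(shape=0.1, scale=1.0, size=K2)

# Next layer: gamma variables whose mean is a link function of the layer
# above -- here mean = z2 @ W1, with a fixed shape parameter alpha.
alpha = 0.1
z1 = rng.gamma(shape=alpha, scale=(z2 @ W1) / alpha)

# Observation layer: Poisson counts with rate given by the bottom latents.
x = rng.poisson(z1 @ W0)
print(x.shape, x[:10])
```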

Continue reading

Every Neuron is Special

A couple of weeks ago I presented

A category-free neural population supports evolving demands during decision-making

by David Raposo, Matthew Kaufman and Anne Churchland. By “categories” they mean populations of cells whose responses during an experiment appear to be dominated by one or two of the experimental variables. The authors refer to these as functional categories.
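
As a toy illustration of what “category-free” means (not the authors’ actual analysis), the sketch below simulates a population in which each cell mixes two task variables with random weights, then regresses each cell’s rate on those variables; the coefficients spread out continuously rather than clustering into cells dominated by a single variable. All names and numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy population: each cell's rate mixes two task variables
# (e.g., choice and modality) with random weights -- no discrete categories.
n_cells, n_trials = 200, 500
choice   = rng.choice([-1, 1], size=n_trials)
modality = rng.choice([-1, 1], size=n_trials)
w = rng.normal(size=(n_cells, 2))            # random mixed selectivity
rates = w[:, 0:1] * choice + w[:, 1:2] * modality \
        + rng.normal(0, 1, (n_cells, n_trials))

# Regress each cell on the two variables; in a "category-free" population the
# joint distribution of coefficients shows no clustering along the axes.
X = np.stack([choice, modality], axis=1)               # (n_trials, 2)
coef, *_ = np.linalg.lstsq(X, rates.T, rcond=None)     # (2, n_cells)
angles = np.arctan2(coef[1], coef[0])
print("coefficient angles, binned:", np.histogram(angles, bins=8)[0])
```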

Continue reading

Fast Kronecker trick for Gaussian Process regression with “expressive” kernels

On May 11th, I presented the following paper in lab meeting:

Fast Kernel Learning for Multidimensional Pattern Extrapolation
Andrew Gordon Wilson, Elad Gilboa, Arye Nehorai and John P. Cunningham

This paper presents a method for scaling up structured spectral mixture (SM) kernels (from Wilson et al., 2013) for Gaussian process regression to multi-dimensional settings in which many (but not all) of the input points live on a grid (e.g., a 2D image in which some pixels are missing). The SM kernel is highly expressive and can learn rich covariance structure from data, but the difficulty lies in applying it at large scale.
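
The computational core is the Kronecker trick: when the inputs form a Cartesian grid and the kernel is a product over dimensions, the full covariance is a Kronecker product of small per-dimension kernel matrices, so eigendecompositions and solves scale with the grid’s side lengths rather than the total number of points. Below is a minimal sketch of that trick for a complete 2D grid, using a plain RBF kernel instead of the spectral mixture kernel and ignoring the incomplete-grid extension that is the paper’s main contribution; all sizes and hyperparameters are illustrative.

```python
import numpy as np

def rbf(x, lengthscale=1.0):
    """Squared-exponential kernel matrix for 1D inputs x."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def kron_mvm(A, B, x):
    """Compute (A kron B) @ x without forming the Kronecker product."""
    X = x.reshape(A.shape[1], B.shape[1])
    return (A @ X @ B.T).ravel()

# Grid of inputs: n1 x n2 points, so the full kernel is K1 kron K2.
n1, n2, noise = 30, 40, 0.1
x1, x2 = np.linspace(0, 1, n1), np.linspace(0, 1, n2)
K1, K2 = rbf(x1, 0.2), rbf(x2, 0.3)

# Eigendecompose each small factor: O(n1^3 + n2^3) instead of O((n1*n2)^3).
e1, Q1 = np.linalg.eigh(K1)
e2, Q2 = np.linalg.eigh(K2)

# Solve (K1 kron K2 + noise*I) alpha = y using only Kronecker matvecs.
y = np.random.default_rng(2).normal(size=n1 * n2)
eigs = np.kron(e1, e2) + noise                       # eigenvalues of the big matrix
alpha = kron_mvm(Q1, Q2, kron_mvm(Q1.T, Q2.T, y) / eigs)

# Check against the naive dense solve (feasible only because this example is tiny).
K = np.kron(K1, K2) + noise * np.eye(n1 * n2)
print(np.allclose(K @ alpha, y))
```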

Continue reading

Integration dynamics and choice probabilities

Recently in lab meeting, I presented

Sensory integration dynamics in a hierarchical network explains choice probabilities in cortical area MT
Klaus Wimmer, Albert Compte, Alex Roxin, Diogo Peixoto, Alfonso Renart & Jaime de la Rocha. Nature Communications, 2015

Wimmer et al. reanalyze and reinterpret a classic dataset of neural recordings from MT while monkeys perform a motion-discrimination task. The classic result is that the firing rates of MT neurons are correlated with the monkey’s choice even when the stimulus is the same. This covariation of neural activity and choice, termed choice probability, could indicate that sensory variability causes behavioral variability, or it could result from top-down signals that reflect the monkey’s choice.
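
As a reminder of the underlying quantity, here is a minimal sketch of how choice probability is commonly computed: an ROC analysis asking how well a neuron’s spike count on repeats of the same stimulus predicts the upcoming choice. The spike-count distributions below are fabricated for illustration; 0.5 means no choice-related information, and MT choice probabilities are typically only modestly above chance (around 0.55).

```python
import numpy as np

def choice_probability(counts_pref, counts_null):
    """ROC-based choice probability: probability that a spike count from a
    preferred-choice trial exceeds one from a null-choice trial (ties count half)."""
    a, b = np.asarray(counts_pref), np.asarray(counts_null)
    greater = (a[:, None] > b[None, :]).mean()
    ties = (a[:, None] == b[None, :]).mean()
    return greater + 0.5 * ties

# Toy example: identical stimulus on every trial, but spike counts covary
# weakly with the upcoming choice.
rng = np.random.default_rng(3)
counts_pref = rng.poisson(21, size=300)   # trials where the monkey chose the cell's preferred direction
counts_null = rng.poisson(20, size=300)   # trials with the opposite choice
print(choice_probability(counts_pref, counts_null))   # roughly 0.55, i.e. modestly above chance
```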

Continue reading

Bayesian time-frequency representations

This week in lab meeting I presented

Robust spectrotemporal decomposition by iteratively reweighted least squares
Demba Ba, Behtash Babadi, Patrick L. Purdon, and Emery N. Brown. PNAS, 2014

[Figure 1 from Ba et al. (2014)]

In this paper, the authors proposed an algorithm for fast, robust estimation of a time-frequency representation. The robustness properties were achieved by applying a group-sparsity prior across frequencies and a “fused LASSO” prior over time. However, the real innovation they were able to leverage came from an earlier paper, in which the authors proved that the MAP estimation problem could be solved by iteratively reweighted least squares (IRLS), which turns out to be a version of the EM algorithm.
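
To give a feel for the IRLS idea on a simpler problem than the paper’s, the sketch below solves an l1-penalized least-squares (LASSO-type) objective by repeatedly replacing the l1 penalty with a quadratic upper bound and solving the resulting weighted ridge regression. This is the generic reweighting trick, not Ba et al.’s specific group-sparse / fused-LASSO formulation, and the problem sizes and penalty values are illustrative.

```python
import numpy as np

def irls_lasso(X, y, lam=1.0, n_iter=50, eps=1e-8):
    """Minimize ||y - X b||^2 + lam * ||b||_1 by iteratively reweighted least squares.

    Each iteration upper-bounds the l1 penalty by a quadratic,
    lam * b_i^2 / (2 |b_i_old|), and solves the resulting ridge-like problem."""
    XtX, Xty = X.T @ X, X.T @ y
    b = np.linalg.lstsq(X, y, rcond=None)[0]       # start from the least-squares solution
    for _ in range(n_iter):
        w = lam / (2.0 * (np.abs(b) + eps))        # per-coefficient ridge weights
        b = np.linalg.solve(XtX + np.diag(w), Xty)
    return b

# Toy sparse-regression problem with two nonzero coefficients.
rng = np.random.default_rng(4)
X = rng.normal(size=(100, 20))
b_true = np.zeros(20)
b_true[[2, 7]] = [3.0, -2.0]
y = X @ b_true + rng.normal(0, 0.5, size=100)
print(np.round(irls_lasso(X, y, lam=2.0), 2))      # large coefficients survive, the rest shrink to ~0
```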

Continue reading