Will multivariate decoding spell the end of simulation theory?

Decoding techniques such as multivariate pattern analysis (MVPA) are hot stuff in cognitive neuroscience, largely because they offer a tentative promise of actually reading out the underlying computations in a region rather than merely describing data features (e.g. mean activation profiles). While I am quite new to MVPA and similar machine learning techniques (so please excuse any errors in what follows), the basic process has been explained to me as a reversal of the X and Y variables in a typical general linear model (GLM). Instead of specifying a design matrix of explanatory (X) variables and testing how well those predict a single dependent (Y) variable (e.g. the BOLD timeseries in each voxel), you try to estimate an explanatory variable (essentially decoding the ‘design matrix’ that produced the observed data) from many Y variables, for example one Y variable per voxel (hence the multivariate part). The decoded explanatory variable then describes (BOLD) responses in a way that can vary in space, rather than reflecting an overall data feature across a set of voxels such as mean or slope. Typically decoding analyses proceed in two steps: one in which you train the classifier on the patterns of activity in some set of voxels, and another where you see how well that trained model can classify patterns of activity in another scan or task. It is precisely this ability to detect patterns in subtle spatial variations that makes MVPA an attractive technique; the GLM, fit one voxel at a time, simply doesn’t account for such variation.
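To make the X/Y reversal concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the simulated voxels, the `tuning` values and the simple difference-of-centroids decoder are invented, and none of it is the method of any paper cited here. The GLM direction estimates one beta per voxel from the design (X), while the decoding direction is trained on one simulated run and then asked to recover the design of a second run from the voxel data (Y).

```python
import random
from statistics import fmean

random.seed(0)

N_VOXELS = 20
TRIALS_PER_RUN = 80

# Hypothetical voxel tuning: how strongly each voxel follows the design.
tuning = [0.5 if v % 2 == 0 else -0.5 for v in range(N_VOXELS)]

def simulate_run(n_trials):
    """Return (X, Y): X[t] is the design (+1 or -1), Y[t] the voxel pattern."""
    X = [1 if t % 2 == 0 else -1 for t in range(n_trials)]
    Y = [[x * w + random.gauss(0.0, 1.0) for w in tuning] for x in X]
    return X, Y

def glm_beta(x, y):
    """GLM direction: OLS slope of one voxel's data y on the design x."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def train_decoder(X, Y):
    """Decoding direction: learn weights over all voxels at once."""
    pos = [row for x, row in zip(X, Y) if x > 0]
    neg = [row for x, row in zip(X, Y) if x < 0]
    mu_pos = [fmean(col) for col in zip(*pos)]
    mu_neg = [fmean(col) for col in zip(*neg)]
    weights = [p - n for p, n in zip(mu_pos, mu_neg)]
    # Place the decision boundary halfway between the two class centroids.
    bias = -sum(w * (p + n) / 2 for w, p, n in zip(weights, mu_pos, mu_neg))
    return weights, bias

def decode(weights, bias, row):
    """Predict the design value (+1 or -1) from a single voxel pattern."""
    score = sum(w * v for w, v in zip(weights, row)) + bias
    return 1 if score > 0 else -1

X_train, Y_train = simulate_run(TRIALS_PER_RUN)  # step 1: training run
X_test, Y_test = simulate_run(TRIALS_PER_RUN)    # step 2: held-out run

# X -> Y: one separate estimate per voxel.
betas = [glm_beta(X_train, voxel) for voxel in zip(*Y_train)]

# Y -> X: one model over all voxels, scored on data it has not seen.
weights, bias = train_decoder(X_train, Y_train)
predicted = [decode(weights, bias, row) for row in Y_test]
accuracy = sum(p == x for p, x in zip(predicted, X_test)) / len(X_test)

print(f"{len(betas)} voxel-wise betas; held-out decoding accuracy = {accuracy:.2f}")
```

Real analyses would normally use an established classifier with proper cross-validation; the only point of the sketch is which variable is being predicted from which, and that the decoder is scored on data it was not trained on.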

The implicit assumption here is that by modeling subtle spatial variations across a set of voxels, you can actually pick up the neural correlates of the underlying computation or representation (Weil and Rees, 2010; Poldrack, 2011). To illustrate the difference between an MVPA and GLM analysis, imagine a classical fMRI experiment where we have some set of voxels defining a region with a significant mean response to our experimental manipulation. All the GLM can tell us is that in each voxel the mean response is significantly different from zero. Each voxel within the significant region is likely to vary slightly in its actual response (you might imagine all sorts of subtle intensity variations within a significant region), but the GLM essentially ignores this variation. The exciting assumption driving interest in decoding is that this variability might actually reflect the activity of sub-populations of neurons and, by extension, actual neural representations. MVPA and similar techniques are designed to pick out when these variations form a coherent pattern; once identified, this pattern can be used to “predict” when the subject was seeing one or another particular stimulus. While it isn’t entirely straightforward to interpret the patterns MVPA picks out as actual ‘neural representations’, there is some evidence that the decoded models reflect a finer granularity of neural sub-populations than is represented in overall mean activation profiles (Todd et al., 2013; Thompson et al., 2011).
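A toy simulation may help show what "ignoring this variation" costs. In this sketch (plain Python, with every number invented for illustration) two hypothetical stimuli, A and B, drive a region to the same mean level, but A favours the even-numbered voxels and B the odd-numbered ones. A classifier that only sees the regional mean should sit near chance, while a nearest-centroid classifier over the full voxel pattern should do much better. Nearest-centroid is my stand-in here for "some pattern classifier", not a claim about what any particular MVPA toolbox does.

```python
import random

random.seed(1)

N_VOXELS = 20
N_TRIALS = 100  # per stimulus, per run

def make_pattern(boosted_parity):
    """Mean response per voxel: 1.0 on average, with a fine-grained tilt."""
    return [1.0 + (0.4 if v % 2 == boosted_parity else -0.4)
            for v in range(N_VOXELS)]

# Hypothetical stimuli: identical regional mean, different spatial pattern.
PATTERNS = {"A": make_pattern(0), "B": make_pattern(1)}

def simulate(stimulus):
    return [[m + random.gauss(0.0, 1.0) for m in PATTERNS[stimulus]]
            for _ in range(N_TRIALS)]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

train = {s: simulate(s) for s in "AB"}
test = {s: simulate(s) for s in "AB"}

# Univariate summary: one number per stimulus (mean over voxels and trials).
region_mean = {s: mean(mean(trial) for trial in train[s]) for s in "AB"}

# Multivariate summary: one mean value per voxel for each stimulus.
centroids = {s: [mean(col) for col in zip(*train[s])] for s in "AB"}

def classify_by_mean(trial):
    m = mean(trial)
    return min("AB", key=lambda s: abs(m - region_mean[s]))

def classify_by_pattern(trial):
    return min("AB", key=lambda s: sq_dist(trial, centroids[s]))

def accuracy(classify):
    results = [classify(trial) == s for s in "AB" for trial in test[s]]
    return sum(results) / len(results)

mean_accuracy = accuracy(classify_by_mean)
pattern_accuracy = accuracy(classify_by_pattern)

print("regional means:", {s: round(m, 2) for s, m in region_mean.items()})
print(f"accuracy from regional mean: {mean_accuracy:.2f}")
print(f"accuracy from voxel pattern: {pattern_accuracy:.2f}")
```

Both stimuli "activate" the region equally in the univariate sense, so the information that separates them lives entirely in the spatial pattern.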

Professor Xavier applies his innate talent for MVPA.

As you might imagine this is terribly exciting, as it presents the possibility to actually ‘read-out’ the online function of some brain area rather than merely describing its overall activity. Since the inception of brain scanning this has been exactly the (largely failed) promise of imaging: reverse inference from neural data to actual cognitive/perceptual contents. It is understandable then that decoding papers are the ones most likely to appear in high impact journals; just recently we’ve seen MVPA applied to dream states, reconstruction of visual experience, and pain experience, all in top journals (Kay et al., 2008; Horikawa et al., 2013; Wager et al., 2013). I’d like to focus on that last one for the remainder of this post, as I think we might draw some wide-reaching conclusions for theoretical neuroscience as a whole from Wager et al’s findings.

Francesca and I were discussing the paper this morning; she’s working on a commentary for a theoretical paper concerning the role of the “pain matrix” in empathy-for-pain research. For those of you not familiar with this area, the idea is a basic simulation-theory argument-from-isomorphism. Simulation theory (ST) is just the (in)famous idea that we use our own motor system (e.g. mirror neurons) to understand the gestures of others. In a now famous experiment Rizzolatti et al showed that motor neurons in the macaque monkey responded equally to the monkey’s own gestures and to the gestures of an observed other (Rizzolatti and Craighero, 2004). They argued that this structural isomorphism might represent a general neural mechanism, such that social-cognitive functions can be accomplished by simply applying our own neural apparatus to work out what is going on for the external entity. With respect to phenomena such as empathy for pain and ‘social pain’ (e.g. viewing a picture of someone you broke up with recently), this idea has been extended to suggest that, since a network of regions known as “the pain matrix” activates similarly when we are in physical pain or experience ‘social pain’, we “really feel” pain during these states (Kross et al., 2011) [1].

In her upcoming commentary, Francesca points out an interesting finding in the paper by Wager and colleagues that I had overlooked. Wager et al apply a decoding technique in subjects undergoing painful and non-painful stimulation. Quite impressively they are then able to show that the decoded model predicts pain intensity across different scanners and under various experimental manipulations. However they note that the model does not accurately predict subjects’ ‘social pain’ intensity, even though the subjects did activate a similar network of regions in both the physical and social pain tasks (see image below). One conclusion from these findings is that it is surely premature to conclude that because a group of subjects may activate the same regions during two related tasks, those isomorphic activations actually represent identical neural computations [2]. In other words, arguments from structural isomorphism like ST don’t provide any actual evidence for the mechanisms they presuppose.

Figure from Wager et al demonstrating specificity of classifier for pain vs warmth and pain vs rejection. Note the poor receiver operating characteristic (ROC) curve for ‘social pain’ (rejecter vs friend), although that contrast picks out similar regions of the ‘pain matrix’.
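The logic of that result can be caricatured in a few lines of Python. To be clear about the assumptions: this is a deliberately extreme toy simulation with made-up effect sizes, it is not Wager et al’s data or their decoding method, and the simple difference-of-centroids classifier is only a stand-in. The hypothetical ‘pain’ and ‘rejecter’ conditions each raise the regional mean by the same amount over their control condition, but through different voxels, so a classifier trained on pain vs warmth has nothing to hold on to when it is applied to rejecter vs friend.

```python
import random

random.seed(2)

N_VOXELS = 20
N_TRIALS = 100  # per condition

# Hypothetical voxel-wise effects (all values invented for illustration).
# 'pain' and 'rejecter' both add 0.5 to the regional mean, via different voxels.
EFFECTS = {
    "warmth": [0.0] * N_VOXELS,
    "pain": [1.0 if v % 2 == 0 else 0.0 for v in range(N_VOXELS)],
    "friend": [0.0] * N_VOXELS,
    "rejecter": [1.0 if v % 2 == 1 else 0.0 for v in range(N_VOXELS)],
}

def simulate(condition):
    return [[e + random.gauss(0.0, 1.0) for e in EFFECTS[condition]]
            for _ in range(N_TRIALS)]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def centroid(trials):
    return [mean(col) for col in zip(*trials)]

def train(target_trials, control_trials):
    """Weights = difference of class centroids; threshold = their midpoint."""
    mu_t, mu_c = centroid(target_trials), centroid(control_trials)
    weights = [t - c for t, c in zip(mu_t, mu_c)]
    threshold = sum(w * (t + c) / 2 for w, t, c in zip(weights, mu_t, mu_c))
    return weights, threshold

def accuracy(model, target_trials, control_trials):
    weights, threshold = model

    def is_target(trial):
        return sum(w * x for w, x in zip(weights, trial)) > threshold

    hits = sum(is_target(t) for t in target_trials)
    correct_rejections = sum(not is_target(t) for t in control_trials)
    return (hits + correct_rejections) / (len(target_trials) + len(control_trials))

def regional_effect(target_trials, control_trials):
    """Univariate contrast: difference in mean signal over the whole region."""
    return (mean(mean(t) for t in target_trials)
            - mean(mean(t) for t in control_trials))

# Train on the physical task only, then test on fresh data from both tasks.
model = train(simulate("pain"), simulate("warmth"))
pain_test, warmth_test = simulate("pain"), simulate("warmth")
rejecter_test, friend_test = simulate("rejecter"), simulate("friend")

within_task = accuracy(model, pain_test, warmth_test)
cross_task = accuracy(model, rejecter_test, friend_test)
physical_effect = regional_effect(pain_test, warmth_test)
social_effect = regional_effect(rejecter_test, friend_test)

print(f"regional effect: physical = {physical_effect:.2f}, social = {social_effect:.2f}")
print(f"decoding accuracy: within task = {within_task:.2f}, cross task = {cross_task:.2f}")
```

In this caricature both contrasts "light up" the region to the same degree, yet the pattern learned in one task does not transfer to the other. Real data will of course be far messier, with partially overlapping patterns rather than perfectly disjoint ones.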

To me this is exactly the right conclusion to take from Wager et al and similar decoding papers. To the extent that the assumption that MVPA identifies patterns corresponding to actual neural representations holds, we are rapidly coming to realize that a mere mean activation profile tells us relatively little about the underlying neural computations [3]. It certainly does not tell us enough to conclude much of anything on the basis that a group of subjects activate “the same brain region” for two different tasks. It is possible and even likely that, even though I activate my motor cortex when viewing you move, I’m doing something quite different with those neurons than when I actually move about. And perhaps this was always the problem with simulation theory: it tries to make the leap from description (“similar brain regions activate for X and Y”) to mechanism, without actually describing a mechanism at all. I guess you could argue that this is really just a much fancier argument against reverse inference and that we don’t need MVPA to do away with simulation theory. I’m not so sure, however; ST remains a strong force in a variety of domains. If decoding can actually do away with ST and arguments from isomorphism or, better still, provide a reasonable mechanism for simulation, it’ll be a great day in neuroscience. One thing is clear: model-based approaches will continue to improve cognitive neuroscience as we go beyond describing what brain regions activate during a task to actually explaining how those regions work together to produce behavior.

I’ve curated some enlightening responses to this post in a follow-up – worth checking for important clarifications and extensions! See also the comments on this post for a detailed explanation of MVPA techniques. 

References

Horikawa T, Tamaki M, Miyawaki Y, Kamitani Y (2013) Neural Decoding of Visual Imagery During Sleep. Science.

Kay KN, Naselaris T, Prenger RJ, Gallant JL (2008) Identifying natural images from human brain activity. Nature 452:352-355.

Kross E, Berman MG, Mischel W, Smith EE, Wager TD (2011) Social rejection shares somatosensory representations with physical pain. Proceedings of the National Academy of Sciences 108:6270-6275.

Poldrack RA (2011) Inferring mental states from neuroimaging data: from reverse inference to large-scale decoding. Neuron 72:692-697.

Rizzolatti G, Craighero L (2004) The mirror-neuron system. Annu Rev Neurosci 27:169-192.

Thompson R, Correia M, Cusack R (2011) Vascular contributions to pattern analysis: Comparing gradient and spin echo fMRI at 3T. Neuroimage 56:643-650.

Todd MT, Nystrom LE, Cohen JD (2013) Confounds in Multivariate Pattern Analysis: Theory and Rule Representation Case Study. NeuroImage.

Wager TD, Atlas LY, Lindquist MA, Roy M, Woo C-W, Kross E (2013) An fMRI-Based Neurologic Signature of Physical Pain. New England Journal of Medicine 368:1388-1397.

Weil RS, Rees G (2010) Decoding the neural correlates of consciousness. Current Opinion in Neurology 23:649-655.


[1] Interestingly this paper comes from the same group (Wager et al) showing that pain matrix activations do NOT predict ‘social’ pain. It will be interesting to see how they integrate this difference.

[2] Never mind the fact that the ‘pain matrix’ is not specific for pain.

[3] With all appropriate caveats regarding the ability of decoding techniques to resolve actual representations rather than confounding individual differences (Todd et al., 2013) or complex neurovascular couplings (Thompson et al., 2011).