A Needle in the Connectome: Neural ‘Fingerprint’ Identifies Individuals with ~93% Accuracy

Much like we picture ourselves, we tend to assume that each individual brain is a bit of a unique snowflake. When running a brain imaging experiment it is common for participants or students to excitedly ask what can be revealed specifically about them given their data. Usually, we have to give a disappointing answer – not all that much, as neuroscientists typically throw this information away to get at average activation profiles set in ‘standard’ space. Now a new study published today in Nature Neuroscience suggests that our brains do indeed contain a kind of person-specific fingerprint, hidden within the functional connectome. Perhaps even more interesting, the study suggests that particular neural networks (e.g. frontoparietal and default mode) contribute the greatest amount of unique information to your ‘neuro-profile’ and also predict individual differences in fluid intelligence.

To test this, lead author Emily Finn and colleagues at Yale University analysed repeated sets of functional magnetic resonance imaging (fMRI) data from 126 subjects over 6 different sessions (2 rest, 4 task), derived from the Human Connectome Project. After dividing each participant’s brain data into 268 nodes (a technique known as “parcellation”), Finn and colleagues constructed matrices of the pairwise correlations between all nodes (268 × 267 / 2 = 35,778 unique edges). These correlation matrices (below, figure 1b), which encode the connectome or connectivity map for each participant, were then used in a permutation-based decoding procedure to determine how accurately a participant’s connectivity pattern could be distinguished from everyone else’s. This involved taking a vector of edge values (connection strengths) from a participant in the training set and correlating it with a similar vector sampled randomly with replacement from the test set (i.e. testing whether one participant’s data correlated with another’s). Pairs with the highest correlation were then labelled “1” to indicate that the algorithm assigned a matching identity to a particular train-test pair. The results of this process were then compared to a similar procedure in which both pairs and subject identity were randomly permuted.
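To make the matching logic concrete, here is a minimal sketch of identification by correlation in Python. The names (`identify`, `database`, `targets`) are mine, and this is only the generic correlate-and-take-the-best scheme rather than Finn et al’s exact procedure (see edit2 below for the lead author’s clarification of the method):

```python
import numpy as np

def identify(database, targets):
    """Assign each target connectome the database subject it correlates
    with most strongly. Rows are subjects; columns are the vectorized
    upper triangle of each 268x268 connectivity matrix (35,778 edges)."""
    predicted = []
    for target in targets:
        # Pearson correlation of this target's edge vector with every
        # candidate subject's edge vector in the database
        r = [np.corrcoef(target, candidate)[0, 1] for candidate in database]
        predicted.append(int(np.argmax(r)))
    return np.array(predicted)

# Identification accuracy: how often a subject's Rest2 data is matched
# to their own Rest1 data (hypothetical variable names):
# acc = np.mean(identify(rest1_edges, rest2_edges) == np.arange(len(rest2_edges)))
```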

Finn et al’s method for identifying subjects from their connectomes.

At first glance, the results are impressive:

Identification was performed using the whole-brain connectivity matrix (268 nodes; 35,778 edges), with no a priori network definitions. The success rate was 117/126 (92.9%) and 119/126 (94.4%) based on a target-database of Rest1-Rest2 and the reverse Rest2-Rest1, respectively. The success rate ranged from 68/126 (54.0%) to 110/126 (87.3%) with other database and target pairs, including rest-to-task and task-to-task comparisons.

This is a striking result – not only could identity be decoded from one resting state scan to another, but the identification also worked when going from rest to a variety of tasks and vice versa. Although classification accuracy dropped when moving between different tasks, these results were still highly significant when compared to the random shuffle, which only achieved a 5% success rate. Overall this suggests that inter-individual patterns in connectivity are highly reproducible regardless of the context from which they are obtained.
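For intuition about how far above chance these rates sit, here is a quick back-of-the-envelope check (my own, not an analysis from the paper), treating even the shuffle’s 5% rate as the null success probability:

```python
from scipy.stats import binomtest

# Probability of scoring 117/126 or better if each identification
# succeeded with only the 5% rate achieved by the random shuffle
# (pure chance over 126 candidate identities would be nearer 1/126)
result = binomtest(k=117, n=126, p=0.05, alternative='greater')
print(result.pvalue)  # vanishingly small
```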

The authors then go on to perform a variety of crucial control analyses. For example, one immediate worry is that the high identification accuracy might be driven by head motion, which strongly influences functional connectivity and is likely to show strong within-subject correlation. Another concern might be that the accuracy is driven primarily by anatomical rather than functional features. The authors test both of these alternative hypotheses, first by applying the same decoding approach to an expanded set of root-mean-square motion parameters, and second by testing whether classification accuracy decreased as the data were increasingly smoothed (which should eliminate or reduce the contribution of anatomical features). Here the results were also encouraging: motion was totally unable to predict identity, resulting in less than 5% accuracy, and classification accuracy remained essentially the same across smoothing kernels. The authors further compared their parcellation scheme to the more common, coarse-grained Yeo 8-network solution. This revealed that the coarser network division decreased accuracy, particularly for the fronto-parietal network, a decrease that was seemingly driven by increased reliability of the diagonal elements of the inter-subject matrix (which encode the intra-subject correlation). The authors suggest this may reflect the need for higher spatial precision to delineate individual patterns of fronto-parietal connectivity. Although this interpretation seems sensible, I do have to wonder whether it conflicts with their smoothing-based control analysis. The authors also looked at how well they could identify an individual based on the variability of the BOLD signal in each region and found that although this was also significant, it showed a systematic decrease in accuracy compared to the connectomic approach. This suggests that although at least some of what makes an individual unique can be found in activity alone, connectivity data are needed for a more complete fingerprint. In a final control analysis (figure 2c below), training simultaneously on multiple data sets (for example a resting state and a task, to control for inherent differences in signal length) further increased accuracy, to as high as 100% in some cases.

Finn et al; networks showing most and least individuality and contributing factors. Interesting to note that sensory areas are highly common across subjects whereas fronto-parietal and mid-line show the greatest individuality!

Having established the robustness of their connectome fingerprints, Finn and colleagues then examined how much each individual cortical node contributed to the identification accuracy. This analysis revealed a particularly interesting result: fronto-parietal and midline (‘default mode’) networks showed the highest contribution (above, figure 2a), whereas sensory areas appeared to not contribute at all. This complements their finding that the coarser-grained Yeo parcellation greatly reduced the contribution of these networks to classification accuracy. Further still, Finn and colleagues linked the contributions of these networks to behaviour, examining how strongly each network fingerprint predicted an overall index of fluid intelligence (g-factor). Again they found that fronto-parietal and default mode nodes were the most predictive of inter-individual differences in behaviour (in opposite directions, although I’d hesitate to interpret the sign of this finding given the global signal regression).

So what does this all mean? For starters this is a powerful demonstration of the rich individual information that can be gleaned from combining connectome analyses with high-volume data collection. The authors not only showed that resting state networks are highly stable and individual within subjects, but that these signatures can be used to delineate the way the brain responds to tasks and even behaviour. Not only is the study well powered, but the authors clearly worked hard to generalize their results across a variety of datasets while controlling for quite a few important confounds. While previous studies have reported similar findings in structural and functional data, I’m not aware of any this generalisable or specific. The task-rest signature alone confirms that both measures reflect a common neural architecture, an important finding. I could be a little concerned about vascular or breath-related confounds; the authors do remove such nuisance variables, so this may not be a serious concern (though I am not convinced their use of global signal regression to control these variables is adequate). These minor concerns notwithstanding, I found the network-specific results particularly interesting; although previous studies indicate that functional and structural heterogeneity greatly increases along the fronto-parietal axis, this study is the first demonstration to my knowledge of the extremely high predictive power embedded within those differences. It is interesting to wonder how much of this stability is important for the higher-order functions supported by these networks; indeed it seems intuitive that self-awareness, social cognition, and cognitive control depend upon acquired experiences that are highly individual. The authors conclude by suggesting that future studies may evaluate classification accuracy within an individual over many time points, raising the interesting question: can you identify who I am tomorrow by how my brain connects today? Or am I “here today, gone tomorrow”?

Only time (and connectomics) may tell…


 

edit:

Thanks to Kate Mills for pointing out this interesting PLOS ONE paper from a year ago (cited by Finn et al), which used similar methods and also found high classification accuracy, albeit with a smaller sample and fewer controls:

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0111048

 

edit2:

It seems there was a slight mistake in my understanding of the methods – see this useful comment by lead author Emily Finn for clarification:

http://neuroconscience.com/2015/10/12/a-needle-in-the-connectome-neural-fingerprint-identifies-individuals-with-93-accuracy/#comment-36506


corrections? comments? want to yell at me for being dumb? Let me know in the comments or on twitter @neuroconscience!

UPDATED WITH ANSWERS – summary of the major questions [and answers] asked at #LSEbrain about the Bayesian Brain Hypothesis

OK, here are the answers! Meant to release them last night but was a bit delayed by sleep 🙂

OK, it is about 10pm here and I’ve got an HBM abstract to submit, but given that the LSE wasn’t able to share the podcast, I’m just going to quickly summarize some of the major questions brought up either by the speakers or the audience during the event.

For those that don’t know, the LSE hosted a brief event tonight exploring the question: “is the brain a predictive machine”, with panelists Paul Fletcher, Karl Friston, Demis Hassabis, Richard Holton and chaired by Benedetto De Martino. I enjoyed the event as it was about the right length and the discussion was lively. For those familiar with Bayesian brain/predictive coding/FEP there wasn’t much new information, but it was cool to see an outside audience react.

These were the principal questions that came up in the course of the event. Keep in mind these are just reproduced from my (fallible) memory:

  • What does it mean if someone acts, thinks, or otherwise behaves irrationally/non-optimally? Can their brain still be Bayesian at a sub-personal level?
    • There were a variety of answers to this question, the most basic being that optimal behavior depends on one’s priors, so someone with a mental disorder or poor behavior may be acting optimally with respect to their own priors. Karl pointed out that this means optimal behavior really is different for every organism and person, rendering the notion of optimality trivial.
  • Instead of changing the model, is it possible for the brain to change the world so it fits with our model of it?
    • Yes, Karl calls this active inference, and it is a central part of his formulation of the Bayesian brain. Active inference allows you to either re-sample or adjust the world such that it fits with your model, and brings a kind of strong embodiment to the Bayesian brain. This is because the kinds of actions (and perceptions) one can engage in are shaped by the body and internal states.
  • Where do the priors come from?
    • Again the answer from Karl – evolution. According to the FEP, organisms that survive do so by virtue of their ability to minimize free energy (prediction error). This means that for Karl evolution ‘just is the refinement and inheritance of our models of the world’; our brains reflect the structure of the world, which is then passed on through natural selection and epigenetic mechanisms.
  • Is the theory falsifiable and if so, what kind of data would disprove it?
    • From Karl – ‘No. The theory is not falsifiable in the same sense that Natural Selection is not falsifiable’. At this there were some roars from the crowd and philosopher Richard Holton was asked how he felt about this statement. Richard said he would be very hesitant to endorse a theory that claimed to be non-falsifiable.
  • Is it possible for the brain to ‘over-fit’ the world/sensory data?
    • Yes, from Paul we heard that this is a good description of what happens in psychotic and other mental disorders, where an overly precise belief might resist any attempts to dislodge it or evidence to the contrary. This led back into more discussion of what it means for an organism to behave in a way that is not ‘objectively optimal’.
  • If we could make a Bayesian deep learning machine would it be conscious, and if so what rights should we give it?
    • I didn’t quite catch Demis’ response to this as it was quite quick, and there was a general laugh about these types of questions coming up.
  • How exactly is the brain Bayesian? Does it follow a predictive coding, approximate, or variational Bayesian implementation?
    • Here there was some interesting discussion from all sides, with Karl saying it may actually be a combination of these methods or via approximations we don’t yet understand. There was a lot of discussion about why DeepMind doesn’t implement a Bayesian scheme in their networks, and it was revealed that this is because hierarchical Bayesian inference is currently too computationally demanding for such applications. Karl picked up on this point to say that the same is true of the human brain; the FEP outlines some general principles but we are still far from understanding how the brain actually approximates Bayesian inference.
  • Can conscious beliefs, or decisions in the way we typically think of them, be thought of in a probabilistic way?
    • Karl: ‘Yes’
    • Holton: Less sure
    • Panel: this may call for multiple models, binary vs discrete, etc
    • Karl redux: isn’t it interesting that we are increasingly reshaping the world to better fit our predictions, e.g. using external tools in place of memory, navigation, planning, etc. (extended cognition)

There were other small bits of discussion, particularly concerning what it means for an agent to be optimal or not, and the relation of explicit/conscious states to a subpersonal Bayesian brain, but I’m afraid I can’t recall them in enough detail to accurately report them. Overall the discussion was interesting and lively, and I presume there will be some strong opinions about some of these. There was also a nice moment where Karl repeatedly said that the future of neuroscience is extended and enactive cognition. Some of the discussion between the panelists was quite interesting, particularly Paul’s views on mental disorders and Demis talking about why the brain might engage in long-term predictions and imagination (because collecting real data is expensive/dangerous).

Please write in the comments if I missed anything. I’d love to hear what everyone thinks about these. I’ve got my opinions particularly about the falsification question, but I’ll let others discuss before stating them.

What’s the causal link dissociating insula responses to salience and bodily arousal?

Just reading this new paper by Lucina Uddin and felt like a quick post. It is a nice review of one of my favorite brain networks, the ever-present insular cortex and ‘salience network’ (thalamus, AIC, MCC). As we all know, AIC activation is one of the most ubiquitous findings in our field and generally shows up in everything. Uddin advances the well-supported idea that in addition to being sensitive to visceral, autonomic, bodily states (and also having a causal influence on them), the network responds generally to salient stimuli (like oddballs) across all sensory modalities. We already knew this, but a thought leaped to my mind: what is the order of causation here? If the AIC both responds to and causes arousal spikes, are oddball responses driven by the novelty of the stimuli or by a first-order evoked response in the body? Your brainstem, spinal cord, and PNS are fully capable of creating visceral responses to unexpected stimuli. How can we dissociate ‘dry’ oddball responses from evoked physiological responses? It seems likely that arousal spikes accompany anything unexpected, and that salience itself doesn’t really dissociate AIC responses from a more general role in bodily awareness. Recent studies show that oddballs evoke pupil dilation, which is itself related to arousal.

Check out this figure:

[Figure from Uddin: schematic of AIC/ACC physiological inputs and outputs]

Clearly AIC and ACC not only receive physiological input but can also directly cause physiological outputs. I’m immediately reminded of an excellent review by Markus Ullsperger and colleagues, where they run into a similar issue trying to work out how arousal cues contribute to conscious error awareness. Ultimately Ullsperger et al conclude that we can’t really dissociate whether arousal cues cause error awareness or error awareness causes arousal spikes. This seems to also be true for a general salience account.

[Figure from Ullsperger et al: arousal cues and conscious error awareness]

How can we tease these apart? It seems like we’d need to somehow both knock out and induce physiological responses during the presence and absence of salient stimuli. I’m not sure how we could do this – maybe deafferented patients could get us part of the way there. But a larger problem looms: the majority of findings cited by Uddin (and to a lesser extent Ullsperger) come from fMRI. Indeed, the original Seeley et al “salience network” paper (one of the top 10 most cited papers in neuroscience) and the original Critchley insula-interoception papers (also top-ten papers) are based on fMRI. Given that these areas are also heavily contaminated by pulse and respiration artifacts, how can we work out the causal loop between salience/perception and arousal? If a salient cue causes a pulse spike, then it might also cause a corresponding BOLD artifact. There may well be a genuinely non-artefactual relationship between salient events and arousal, but currently we can’t work out the direction of causation. Worse, it is possible that the processes driving the artifacts are themselves crucial for ‘salience’ computation, which would mean physio-correction would obscure these important relationships! A tough cookie indeed. Lastly, we’ll need to go beyond the somewhat psychological label of ‘salience’ if we really want to work out these issues. For my money, I think an account based on expected precision fits nicely with the pattern of results we see in these areas, providing a computational mechanism for ‘salience’.

In the end I suspect this is going be one for the direct recording people to solve. If you’ve got access to insula implantees, let me know! 😀

Note: folks on twitter said they’d like to see more of these off-the-cuff posts – here you go! This post was written in a flurry of thought in about 30 minutes, so please excuse any snarfs!

oh BOLD where art thou? Evidence for a “mm-scale” match between intracortical and fMRI measures.

A frequently discussed problem with functional magnetic resonance imaging is that we don’t really understand how the hemodynamic ‘activations’ measured by the technique relate to actual neuronal phenomena. This is because fMRI measures the Blood-Oxygenation-Level Dependent (BOLD) signal, a complex vascular response to neuronal activity. As such, neuroscientists can easily get worried about all sorts of non-neural contributions to the BOLD signal, such as subjects gasping for air, pulse-related motion artefacts, and other generally uninteresting effects. We can even start to worry that the BOLD signal may not actually measure any particular aspect of neuronal activity, but rather some overly diluted, spatially unconstrained reflection of it that simply lacks the key information for understanding brain processes.

Given that we generally choose fMRI over neurophysiological methods (e.g. M/EEG) when we want to say something about the precise spatial generators of a cognitive process, addressing these ambiguities is of utmost importance. Accordingly, a variety of recent papers have utilized multi-modal techniques, for example combining optogenetics, direct recordings, and fMRI, to assess precisely which kinds of neural events contribute to alterations in the BOLD signal and its spatial (mis)localization. Now a paper published today in NeuroImage addresses this question by combining high-resolution 7-tesla fMRI with electrocorticography (ECoG) to determine the spatial overlap of finger-specific somatomotor representations captured by the two measures. Starting from the title’s claim that “BOLD matches neuronal activity at the mm-scale”, we can already be sure this paper will generate a great deal of interest.

From Siero et al (In Press)

As shown above, the authors managed to record high-resolution (1.5mm) fMRI in 2 subjects implanted with 23 x 11mm intracranial electrode arrays during a simple finger-tapping task. Motor responses from each finger were recorded and used to generate somatotopic maps of brain responses specific to each finger. This analysis was repeated in both ECoG and fMRI, which were then spatially co-registered to one another so the authors could directly compare the spatial overlap between the two methods. What they found appears, at first glance, to be quite impressive:
From Siero et al (In Press)

Here you can see the color-coded t-maps for the BOLD activations to each finger (top panel, A), the differential contrast contour maps for the ECoG (middle panel, B), and the maximum activation foci for both measures with respect to the electrode grid (bottom panel, C), in two individual subjects. Comparing the spatial maps for both the index finger and thumb suggests a rather strong consistency, both in terms of the topology of each effect and the location of their foci. Interestingly, the little-finger measurements seem somewhat more displaced, although similar topographic features can be seen in both. Siero and colleagues further compute the spatial correlation (Spearman’s rho) across measures for each individual finger, finding an average correlation of .54, with a range of .31-.81, a moderately high degree of overlap between the measures. Finally, the optimal amount of shift needed to minimize the spatial difference between the measures was computed and found to be between 1 and 3.1 millimetres, suggesting a slight systematic bias between ECoG and fMRI foci.
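For readers wanting the gist of the overlap metric: a spatial Spearman correlation simply rank-correlates two co-registered maps over the same locations. A minimal sketch (variable names are mine; Siero et al’s exact masking and registration steps are not reproduced here):

```python
import numpy as np
from scipy.stats import spearmanr

def spatial_overlap(fmri_map, ecog_map, mask):
    """Spearman rank correlation between two co-registered spatial maps,
    restricted to the locations under the electrode grid."""
    rho, _ = spearmanr(fmri_map[mask], ecog_map[mask])
    return rho

# e.g. one finger's BOLD t-map against its ECoG contrast map:
# rho = spatial_overlap(bold_t_thumb, ecog_contrast_thumb, grid_mask)
```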

Are ‘We the BOLD’ ready to break out the champagne and get back to scanning in comfort, spatial anxieties at ease? While this is certainly a promising result, suggesting that the BOLD signal indeed captures functionally relevant neuronal parameters with reasonable spatial accuracy, it should be noted that the result is based on a very-best-case scenario, and that a considerable degree of unique spatial variance remains between the two methods. The data presented by Siero and colleagues have undergone a number of crucial pre-processing steps that are likely to influence their results: the high degree of spatial resolution, the manual removal of draining veins, the restriction of the analysis to grey-matter voxels only, and the lack of spatial smoothing all make generalizing from these results to the standard 3-tesla whole-brain pipeline difficult. Indeed, even under these best-case criteria, the results still indicate up to 3mm of systematic bias in the fMRI results. Though we can be glad the bias was systematic and not random, 3mm is still quite a lot in the brain. On this point, the authors note that the stability of the bias may point towards a systematic mis-registration of the ECoG and fMRI data and/or possible rigid-body deformations introduced by the implantation of the electrodes, issues that could be addressed in future studies. Ultimately it remains to be seen whether similar reliability can be obtained for less robust paradigms than finger wagging, obtained in standard, sub-optimal imaging scenarios. But for now I’m happy to let fMRI have its day in the sun, give or take a few millimeters.

Siero, J. C. W., Hermes, D., Hoogduin, H., Luijten, P. R., Ramsey, N. F., & Petridou, N. (2014). BOLD matches neuronal activity at the mm scale: A combined 7T fMRI and ECoG study in human sensorimotor cortex. NeuroImage. doi:10.1016/j.neuroimage.2014.07.002

 

Is the resting BOLD signal physiological noise? What about resting EEG?

Over the past 5 years, resting-state fMRI (rsfMRI) has exploded in popularity. Literally dozens of papers are published each day examining slow (< 0.1 Hz) or “low frequency” fluctuations in the BOLD signal. When I first moved to Europe I was caught up in the somewhat North American frenzy of resting state networks. I couldn’t understand why my Danish colleagues, who specialize in modelling physiological noise in fMRI, simply did not take the literature seriously. The problem is essentially that the low frequencies examined in these studies are the same as those that dominate physiological rhythms. Respiration and cardiac pulsation can make up a massive amount of variability in the BOLD signal. Before resting state fMRI came along, nearly every fMRI study discarded any frequencies lower than one oscillation every 120 seconds (i.e. 1/120 Hz high-pass filtering). Simple things like breath holding and pulsatile motion in vasculature can cause huge effects in BOLD data, and it just so happens that these artifacts (which are non-neural in origin) tend to pool around some of our favorite “default” areas: medial prefrontal cortex, insula, and other large gyri near draining veins.
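As a concrete illustration of that old convention, here is roughly what a 1/120 Hz high-pass step looks like on a single time series (a sketch with assumed variable names; note that SPM actually implements its high-pass filter with a discrete cosine basis set rather than a Butterworth filter, but the effect on the lowest frequencies is similar):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def highpass_bold(ts, tr, cutoff_s=120.0):
    """Remove fluctuations slower than one cycle per cutoff_s seconds
    from a voxel/ROI time series sampled every tr seconds."""
    fs = 1.0 / tr                       # sampling rate in Hz
    wn = (1.0 / cutoff_s) / (fs / 2.0)  # cutoff as a fraction of Nyquist
    b, a = butter(2, wn, btype='highpass')
    return filtfilt(b, a, ts)

# filtered = highpass_bold(voxel_ts, tr=2.0)  # discards the < 1/120 Hz band
```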

Naturally this leads us to ask if the “resting state networks” (RSNs) observed in such studies are actually neural in origin, or if they are simply the result of variations in breath pattern or the like. Obviously we can’t answer this question with fMRI alone. We can apply something like independent component analysis (ICA) and hope that it removes most of the noise- but we’ll never really be 100% sure we’ve gotten it all that way. We can measure the noise directly (e.g. “nuisance covariance regression”) and include it in our GLM- but much of the noise is likely to be highly correlated with the signal we want to observe. What we need are cross-modality validations that low-frequency oscillations do exist, that they drive observed BOLD fluctuations, and that these relationships hold even when controlling for non-neural signals. Some of this is already established- for example direct intracranial recordings do find slow oscillations in animal models. In MEG and EEG, it is well established that slow fluctuations exist and have a functional role.
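For concreteness, the “nuisance covariance regression” mentioned above amounts to removing the best linear fit of the measured noise from each voxel’s time series. A minimal sketch (names are mine), which also makes the caveat obvious: whatever neural variance is shared with the nuisance regressors gets subtracted too:

```python
import numpy as np

def residualize(bold, nuisance):
    """Regress nuisance time series (e.g. respiration, pulse, motion)
    out of a BOLD time series and return the residuals."""
    X = np.column_stack([nuisance, np.ones(len(bold))])  # add intercept
    beta, *_ = np.linalg.lstsq(X, bold, rcond=None)
    return bold - X @ beta

# cleaned = residualize(voxel_ts, np.column_stack([rvt, pulse, motion6]))
```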

So far so good. But what about in fMRI? Can we measure meaningful signal while controlling for these factors? This is currently a topic of intense research interest. Marcus Raichle, the ‘father’ of the default mode network, highlights fascinating multi-modal work from a Finnish group showing that slow fluctuations in behavior and EEG signal coincide (Raichle and Snyder 2007; Monto, Palva et al. 2008). However, we should still be cautious- I recently spoke to a post-doc from the Helsinki group about the original paper, and he stressed that slow EEG is just as contaminated by physiological artifacts as fMRI. Except that the problem is even worse, because in EEG the artifacts may be several orders of magnitude larger than the signal of interest[i].

Understandably I was interested to see a paper entitled “Correlated slow fluctuations in respiration, EEG, and BOLD fMRI” appear in NeuroImage today (Yuan, Zotev et al. 2013). The authors simultaneously collected EEG, respiration, pulse, and resting fMRI data in 9 subjects, and then performed cross-correlation and GLM analyses on the relationships between these variables, during both eyes-closed and eyes-open rest. They calculate respiration volume per time (RVT), a measure developed by Rasmus Birn, to assign a respiratory volume value to each TR (Birn, Diamond et al. 2006). One key finding is that global variations in EEG power are strongly predicted by RVT during eyes-closed rest, with a maximum peak correlation coefficient of .40. Here are the two time series:

[Figure from Yuan et al: time series of global alpha power (GFP) and respiration volume per time (RVT)]
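For readers unfamiliar with RVT: per breath, it is roughly the peak-to-trough amplitude of the belt trace divided by the breath-to-breath period, interpolated to the acquisition times. A rough sketch of the idea (my own simplification of Birn et al., 2006; real belt data need careful artifact checks, as the clipping warning in the respiration post further down illustrates):

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.interpolate import interp1d

def rvt(resp, fs, tr_times):
    """Respiration volume per time: breath amplitude / breath period,
    sampled at each volume acquisition time."""
    peaks, _ = find_peaks(resp, distance=fs)     # inhalation maxima, >= 1 s apart
    troughs, _ = find_peaks(-resp, distance=fs)  # exhalation minima
    t_peaks = peaks / fs
    period = np.gradient(t_peaks)                # local breath-to-breath interval
    trough_amp = interp1d(troughs / fs, resp[troughs],
                          bounds_error=False, fill_value='extrapolate')
    amplitude = resp[peaks] - trough_amp(t_peaks)
    per_breath = amplitude / period
    return interp1d(t_peaks, per_breath, bounds_error=False,
                    fill_value='extrapolate')(tr_times)
```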

You can clearly see that there is a strong relationship between global alpha (GFP) and respiration (RVT). The authors state that “GFP appears to lead RVT” though I am not so sure. Regardless, there is a clear relationship between eyes closed ‘alpha’ and respiration. Interestingly they find that correlations between RVT and GFP with eyes open were not significantly different from chance, and that pulse did not correlate with GFP. They then conduct GLM analyses with RVT and GFP as BOLD regressors. Here is what their example subject looked like during eyes-closed rest:

[Figure from Yuan et al: example-subject RVT-BOLD and GFP-BOLD correlation maps, eyes-closed rest]

Notice any familiar “RSNs” in the RVT map? I see anti-correlated executive deactivation and default mode activation! Very canonical. Too bad they are breath-related. This is why noise-regression experts tend to dislike rsfMRI, particularly when you don’t measure the noise. We also shouldn’t be too surprised that the GFP-BOLD and RVT-BOLD maps look similar, considering that GFP and RVT are highly correlated. After looking at these correlations separately, Yuan et al perform RETROICOR physiological noise correction and then reexamine the contrasts. Here are the group maps:

[Figure from Yuan et al: group-level RVT-BOLD and GFP-BOLD maps, before and after physiological noise correction]

Things look a bit less default-mode-like in the group RVT map, but the RVT and GFP maps are still clearly quite similar. In panel D you can see that physiological noise correction has a large global impact on GFP-BOLD correlations, suggesting that quite a bit of this co-variance is driven by physiological noise. Put simply, respiration is explaining a large degree of alpha-BOLD correlation; any experiment not modelling this covariance is likely to produce strongly contaminated results. Yuan et al go on to examine eyes-open rest and show that, similar to their RVT-GFP cross-correlation analysis, not nearly as much seems to be happening in eyes open compared to closed:

[Figure from Yuan et al: eyes-open rest correlation maps]

The authors conclude that “In particular, this correlation between alpha EEG and respiration is much stronger in eyes-closed resting than in eyes-open resting” and that “[the] results also suggest that eyes-open resting may be a more favorable condition to conduct brain resting state fMRI and for functional connectivity analysis because of the suppressed correlation between low-frequency respiratory fluctuation and global alpha EEG power, therefore the low-frequency physiological noise predominantly of non-neuronal origin can be more safely removed.” Fair enough- one conclusion is certainly that eyes closed rest seems much more correlated with respiration than eyes open. This is a decent and useful result of the study. But then they go on to make this really strange statement, which appears in the abstract, introduction, and discussion:

“In addition, similar spatial patterns were observed between the correlation maps of BOLD with global alpha EEG power and respiration. Removal of respiration related physiological noise in the BOLD signal reduces the correlation between alpha EEG power and spontaneous BOLD signals measured at eyes-closed resting. These results suggest a mutual link of neuronal origin between the alpha EEG power, respiration, and BOLD signals” (emphasis added)

That’s one way to put it! The logic here is that since alpha = neural activity, and respiration correlates with alpha, then alpha must be the neural correlate of respiration. I’m sorry guys, you did a decent experiment, but I’m afraid you’ve gotten this one wrong. There is absolutely nothing that implies alpha power cannot also be contaminated by respiration-related physiological noise. In fact it is exactly the opposite- in the low frequencies observed by Yuan et al the EEG data is particularly likely to be contaminated by physiological artifacts! And that is precisely what the paper shows – in the author’s own words: “impressively strong correlations between global alpha and respiration”. This is further corroborated by the strong similarity between the RVT-BOLD and alpha-BOLD maps, and the fact that removing respiratory and pulse variance drastically alters the alpha-BOLD correlations!

So what should we take away from this study? It is of course inconclusive: there are several aspects of the methodology that are puzzling to me, and sadly the study is rather under-powered at n = 9. I found it quite curious that in each of the BOLD-alpha maps there seemed to be a significant artifact in the lateral and posterior ventricles, even after physiological noise correction (check out figure 2b, an almost perfect ventricle map). If their global alpha signal is specific to a neural origin, why does this artifact remain even after physiological noise correction? I can’t quite put my finger on it, but it seems likely to me that some source of noise remained even after correction; perhaps a reader with more experience in EEG-fMRI methods can comment. For one thing their EEG motion correction seems a bit suspect, as they simply drop outlier timepoints. One way or another, I believe we should take one clear message away from this study: low frequency signals are not easily untangled from physiological noise, even in electrophysiology. This isn’t a damnation of all resting state research; rather it is a clear sign that we need to be measuring these signals to retain a degree of control over our data, particularly in resting designs, where we have the least control of all.

References:

Birn, R. M., J. B. Diamond, et al. (2006). “Separating respiratory-variation-related fluctuations from neuronal-activity-related fluctuations in fMRI.” Neuroimage 31(4): 1536-1548.

Monto, S., S. Palva, et al. (2008). “Very slow EEG fluctuations predict the dynamics of stimulus detection and oscillation amplitudes in humans.” The Journal of Neuroscience 28(33): 8268-8272.

Raichle, M. E. and A. Z. Snyder (2007). “A default mode of brain function: a brief history of an evolving idea.” Neuroimage 37(4): 1083-1090.

Yuan, H., V. Zotev, et al. (2013). “Correlated Slow Fluctuations in Respiration, EEG, and BOLD fMRI.” NeuroImage.

 


[i] Note that this is not meant to be in any way a comprehensive review. A quick literature search suggests that there are quite a few recent papers on resting BOLD-EEG. I recall a well done paper by a group at the Max Planck Institute that did include noise regressors, and found unique slow BOLD-EEG relations. I cannot seem to find it at the moment however!

 

Will multivariate decoding spell the end of simulation theory?

Decoding techniques such as multivariate pattern analysis (MVPA) are hot stuff in cognitive neuroscience, largely because they offer a tentative promise of actually reading out the underlying computations in a region rather than merely describing data features (e.g. mean activation profiles). While I am quite new to MVPA and similar machine learning techniques (so please excuse any errors in what follows), the basic process has been explained to me as a reversal of the X and Y variables in a typical general linear model. Instead of specifying a design matrix of explanatory (X) variables and testing how well those predict a single dependent (Y) variable (e.g. the BOLD timeseries in each voxel), you try to estimate an explanatory variable (essentially decoding the ‘design matrix’ that produced the observed data) from many Y variables, for example one Y variable per voxel (hence the multivariate part). The decoded explanatory variable then describes (BOLD) responses in a way that can vary in space, rather than reflecting an overall data feature across a set of voxels such as mean or slope. Typically decoding analyses proceed in two steps: one in which you train the classifier on some set of voxels, and another where you see how well that trained model can classify patterns of activity in another scan or task. It is precisely this ability to detect patterns in subtle spatial variations that makes MVPA an attractive technique- the GLM simply doesn’t account for such variation.
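In code, the train-on-one-scan, test-on-another scheme looks roughly like this (a sketch with hypothetical variable names, using a linear support vector classifier as one common choice among many):

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import GroupKFold, cross_val_score

# X: (n_trials, n_voxels) BOLD patterns from an ROI
# y: condition label of each trial (the 'design matrix' to be decoded)
# runs: which scan run each trial came from
def decoding_accuracy(X, y, runs):
    """Train on some runs, test on held-out runs, report mean accuracy."""
    cv = GroupKFold(n_splits=len(np.unique(runs)))
    return cross_val_score(LinearSVC(), X, y, groups=runs, cv=cv).mean()
```

Chance performance would be 1/(number of classes); accuracy reliably above that is taken as evidence that the voxel pattern carries information about the decoded variable.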

The implicit assumption here is that by modeling subtle spatial variations across a set of voxels, you can actually pick up the neural correlates of the underlying computation or representation (Weil and Rees, 2010; Poldrack, 2011). To illustrate the difference between an MVPA and a GLM analysis, imagine a classical fMRI experiment where we have some set of voxels defining a region with a significant mean response to your experimental manipulation. All the GLM can tell us is that in each voxel the mean response is significantly different from zero. Each voxel within the significant region is likely to vary slightly in its actual response- you might imagine all sorts of subtle intensity variations within a significant region- but the GLM essentially ignores this variation. The exciting assumption driving interest in decoding is that this variability might actually reflect the activity of sub-populations of neurons and, by extension, actual neural representations. MVPA and similar techniques are designed to pick out when these variations form a coherent pattern; once identified, this pattern can be used to “predict” which particular stimulus the subject was seeing. While it isn’t entirely straightforward to interpret the patterns MVPA picks out as actual ‘neural representations’, there is some evidence that the decoded models reflect a finer granularity of neural sub-populations than is represented in overall mean activation profiles (Todd et al., 2013; Thompson et al., 2011).

Professor Xavier applies his innate talent for MVPA.

As you might imagine this is terribly exciting, as it presents the possibility to actually ‘read out’ the online function of some brain area rather than merely describing its overall activity. Since the inception of brain scanning this has been exactly the (largely failed) promise of imaging- reverse inference from neural data to actual cognitive/perceptual contents. It is understandable then that decoding papers are the ones most likely to appear in high impact journals- just recently we’ve seen MVPA applied to dream states, reconstruction of visual experience, and pain experience, all in top journals (Kay et al., 2008; Horikawa et al., 2013; Wager et al., 2013). I’d like to focus on that last one for the remainder of this post, as I think we might draw some wide-reaching conclusions for theoretical neuroscience as a whole from Wager et al’s findings.

Francesca and I were discussing the paper this morning- she’s working on a commentary for a theoretical paper concerning the role of the “pain matrix” in empathy-for-pain research. For those of you not familiar with this area, the idea is a basic simulation-theory argument-from-isomorphism. Simulation theory (ST) is just the (in)famous idea that we use our own motor system (e.g. mirror neurons) to understand the gestures of others. In a now infamous experiment, Rizzolatti et al showed that motor neurons in the macaque monkey responded equally to the monkey’s own gestures and to the gestures of an observed other (Rizzolatti and Craighero, 2004). They argued that this structural isomorphism might represent a general neural mechanism, such that social-cognitive functions can be accomplished by simply applying our own neural apparatus to work out what is going on for the external entity. With respect to phenomena such as empathy for pain and ‘social pain’ (e.g. viewing a picture of someone you broke up with recently), this idea has been extended to suggest that, since a network of regions known as “the pain matrix” activates similarly when we are in pain or experience ‘social pain’, we “really feel” pain during these states (Kross et al., 2011) [1].

In her upcoming commentary, Francesca points out an interesting finding in the paper by Wager and colleagues that I had overlooked. Wager et al apply a decoding technique to subjects undergoing painful and non-painful stimulation. Quite impressively, they are then able to show that the decoded model predicts pain intensity across different scanners and various experimental manipulations. However, they note that the model does not accurately predict subjects’ ‘social pain’ intensity, even though the subjects did activate a similar network of regions in both the physical and social pain tasks (see image below). One conclusion from these findings is that it is surely premature to conclude that, because a group of subjects may activate the same regions during two related tasks, those isomorphic activations actually represent identical neural computations [2]. In other words, arguments from structural isomorphism like ST don’t provide any actual evidence for the mechanisms they presuppose.

Figure from Wager et al demonstrating specificity of classifier for pain vs warmth and pain vs rejection. Note poor receiver operating curve (ROC) for ‘social pain’ (rejecter vs friend), although that contrast picks out similar regions of the ‘pain matrix’.

To me this is exactly the right conclusion to take from Wager et al and similar decoding papers. To the extent that the assumption that MVPA identifies patterns corresponding to actual neural representations holds, we are rapidly coming to realize that a mere mean activation profile tells us relatively little about the underlying neural computations [3]. It certainly does not tell us enough to conclude much of anything on the basis that a group of subjects activate “the same brain region” for two different tasks. It is possible and even likely that just because I activate my motor cortex when viewing you move, I’m doing something quite different with those neurons than when I actually move about. And perhaps this was always the problem with simulation theory- it tries to make the leap from description (“similar brain regions activate for X and Y”) to mechanism, without actually describing a mechanism at all. I guess you could argue that this is really just a much fancier argument against reverse inference and that we don’t need MVPA to do away with simulation theory. I’m not so sure however- ST remains a strong force in a variety of domains. If decoding can actually do away with ST and arguments from isomorphism or better still, provide a reasonable mechanism for simulation, it’ll be a great day in neuroscience. One thing is clear- model based approaches will continue to improve cognitive neuroscience as we go beyond describing what brain regions activate during a task to actually explaining how those regions work together to produce behavior.

I’ve curated some enlightening responses to this post in a follow-up – worth checking for important clarifications and extensions! See also the comments on this post for a detailed explanation of MVPA techniques. 

References

Horikawa T, Tamaki M, Miyawaki Y, Kamitani Y (2013) Neural Decoding of Visual Imagery During Sleep. Science.

Kay KN, Naselaris T, Prenger RJ, Gallant JL (2008) Identifying natural images from human brain activity. Nature 452:352-355.

Kross E, Berman MG, Mischel W, Smith EE, Wager TD (2011) Social rejection shares somatosensory representations with physical pain. Proceedings of the National Academy of Sciences 108:6270-6275.

Poldrack RA (2011) Inferring mental states from neuroimaging data: from reverse inference to large-scale decoding. Neuron 72:692-697.

Rizzolatti G, Craighero L (2004) The mirror-neuron system. Annu Rev Neurosci 27:169-192.

Thompson R, Correia M, Cusack R (2011) Vascular contributions to pattern analysis: Comparing gradient and spin echo fMRI at 3T. Neuroimage 56:643-650.

Todd MT, Nystrom LE, Cohen JD (2013) Confounds in Multivariate Pattern Analysis: Theory and Rule Representation Case Study. NeuroImage.

Wager TD, Atlas LY, Lindquist MA, Roy M, Woo C-W, Kross E (2013) An fMRI-Based Neurologic Signature of Physical Pain. New England Journal of Medicine 368:1388-1397.

Weil RS, Rees G (2010) Decoding the neural correlates of consciousness. Current opinion in neurology 23:649-655.


[1] Interestingly this paper comes from the same group (Wager et al) showing that pain matrix activations do NOT predict ‘social’ pain. It will be interesting to see how they integrate this difference.

[2] Never mind the fact that the ‘pain matrix’ is not specific to pain.

[3] With all appropriate caveats regarding the ability of decoding techniques to resolve actual representations rather than confounding individual differences (Todd et al., 2013) or complex neurovascular couplings (Thompson et al., 2011).

Active-controlled, brief body-scan meditation improves somatic signal discrimination.

Here in the science blog-o-sphere we often like to run to the presses whenever a laughably bad study comes along, pointing out all the incredible feats of ignorance and sloth. However, this can lead to science-sucks cynicism syndrome (a common ailment amongst graduate students), where one begins to feel like all the literature is rubbish and it just isn’t worth your time to try and do something truly proper and interesting. If you are lucky, it is at this moment that a truly excellent paper will come along at just the right time to pick up your spirits and re-invigorate your work. Today I found myself at one such low point, struggling to figure out why my data suck, when just such a beauty of a paper appeared in my RSS reader.

The paper, “Brief body-scan meditation practice improves somatosensory perceptual decision making”, appeared in this month’s issue of Consciousness and Cognition. Laura Mirams et al set out to answer a very simple question regarding the impact of meditation training (MT) on a somatic signal detection task (SSDT). The study is well designed; after randomization, both groups received audio CDs with 15 minutes of daily body-scan meditation or excerpts from The Lord of The Rings. For the SSD task, participants simply report when they feel a vibration stimulus on the finger, where the baseline vibration intensity is first individually calibrated to a 50% detection rate. The authors then apply a signal-detection framework to estimate the sensitivity (d’) and decision criterion (c).
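For those new to signal detection theory, both quantities fall out of the z-transformed hit and false-alarm rates. A minimal sketch (standard formulas, with a common correction for extreme proportions; variable names are mine):

```python
from scipy.stats import norm

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Sensitivity (d') and criterion (c) from trial counts, using a
    log-linear correction so 0% or 100% rates don't give infinite z-scores."""
    hr = (hits + 0.5) / (hits + misses + 1.0)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    d_prime = norm.ppf(hr) - norm.ppf(far)
    criterion = -0.5 * (norm.ppf(hr) + norm.ppf(far))
    return d_prime, criterion

# fewer false alarms at the same hit rate -> larger d'
```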

Mirams et al found that, even when controlling for a host of baseline factors including trait mindfulness and baseline somatic attention, MT led to a greater increase in d’ driven by significantly reduced false-alarms. Although many theorists and practitioners of MT suggest a key role for interoceptive & somatic attention in related alterations of health, brain, and behavior, there exists almost no data addressing this prediction, making these findings extremely interesting. The idea that MT should impact interoception and somatosensation is very sensible- in most (novice) meditation practices it is common to focus attention to bodily sensations of, for example, the breath entering the nostril. Further, MT involves a particular kind of open, non-judgemental awareness of bodily sensations, and in general is often described to novice students as strengthening the relationship between the mind and sensations of the body. However, most existing studies on MT investigate traditional exteroceptive, top-down elements of attention such as conflict resolution and the ability to maintain attention fixation for long periods of time.

While MT certainly does involve these features, it is arguable that the interoceptive elements are more specific to the precise mechanisms of interest (they are what you actually train), whereas the attentional benefits may be more of a kind of side effect, reflecting an early emphasis in MT on establishing attention. Thus in a traditional meditation class, you might first learn some techniques to fixate your attention, and then later learn to deploy your attention to specific bodily targets (i.e. the breath) in a particular way (non-judgmentally). The goal is not necessarily to develop a super-human ability to filter distractions, but rather to change the way in which interoceptive responses to the world (i.e. emotional reactions) are perceived and responded to. This hypothesis is well reflected in the elegant study by Mirams et al; they postulate specifically that MT will lead to greater sensitivity (d’), driven by reduced false alarms rather than an increased hit-rate, reflecting a greater ability to discriminate the nature of an interoceptive signal from noise (note: see comments for clarification on this point by Steve Fleming – there is some ambiguity in interpreting the informational role of HR and FA in d’). This hypothesis not only reflects the theoretically specific contribution of MT (beyond attention training, which might be better trained by video games for example), but also postulates a mechanistically specific hypothesis to test this idea, namely that MT leads to a shift specifically in the quality of interoceptive signal processing, rather than raw attentional control.

At this point, you might ask: if everyone is so sure that MT involves training interoception, why is there so little data on the topic? The authors do a great job reviewing findings (even including currently in-press papers) on interoception and MT. Currently there is one major null finding using the canonical heartbeat detection task, where advanced practitioners self-reported improved heartbeat detection but in reality performed at chance. Those authors speculated that the heartbeat task might not accurately reflect the modality of interoception engaged by practitioners. In addition, a recent study investigated somatic discrimination thresholds in a cross-section of advanced practitioners and found that the ability to make meta-cognitive assessments of one’s threshold sensitivity correlated with years of practice. A third recent study showed greater tactile sensation acuity in practitioners of Tai Chi. One longitudinal study [PDF], a wait-list controlled fMRI investigation by Farb et al, found that a mindfulness-based stress reduction course altered BOLD responses during an attention-to-breath paradigm. Collectively these studies do suggest a role of MT in training interoception. However, as I have complained endlessly, cross-sections cannot tell us anything about the underlying causality of the observed effects, and longitudinal studies must be active-controlled (not waitlisted) to discern mechanisms of action. Thus active-controlled longitudinal designs are desperately needed, both to determine the causality of a treatment on some observed effect, and to rule out confounds associated with motivation, demand characteristics, and expectation. Without such a design, it is very difficult to conclude anything about the mechanisms of interest in an MT intervention.

In this regard, Mirams went above and beyond the call of duty as defined by the average paper. The choice of delivering the intervention via CD is excellent, as we can rule out instructor enthusiasm/ability confounds. Further, the intervention chosen is extremely simple and well described; it is just a basic body-scan meditation without additional fluff or fanfare, lending mechanistic specificity. Both groups were even instructed to close their eyes and sit when listening, balancing these often overlooked structural factors. In this sense, Mirams et al have controlled for instruction, motivation, intervention context, baseline trait mindfulness, and even isolated the variable of interest- only the MT group worked with interoception, though both exerted a prolonged period of sustained attention. Armed with these controls we can actually say that MT led to an alteration in interoceptive d’, through a mechanism dependent upon the specific kind of interoceptive awareness trained in the intervention.

It is here that I have one minor nit-pick of the paper. Although the use of Lord of the Rings audiotapes is with precedent, and likely a great control for attention and motivation, you could be slightly worried that reading about Elves and Orcs is not an ideal control for listening to hours of tapes instructing you to focus on your bodily sensations, if the measure of interest involves fixating on the body. A purer active control might have been a book describing anatomy or body parts; then we could more conclusively attribute the findings not just to interoception in general but to the particular form of interoceptive attention deployed in meditation training. As it is, a conservative person might speculate that the observed differences reflect demand characteristics- MT participants deploy more attention to the body due to a kind of priming mechanism in the teaching. However this is an extreme nitpick and does not detract from the fact that Mirams and co-authors have made an extremely useful contribution to the literature. In the future it would be interesting to repeat the paradigm with a more body-oriented control, and perhaps also in advanced practitioners before and after an intensive retreat, to see if the effect holds at later stages of training. Of course, given my interest in applying signal-detection theory to interoceptive meta-cognition, I also cannot help but wonder what the authors might have found if they’d applied a Fleming-style meta-d’ analysis to this study.

All in all, a clear study with tight methods, addressing a desperately under-developed research question, in an elegant fashion. The perfect motivation to return to my own mangled data ☺

Correcting your naughty insula: modelling respiration, pulse, and motion artifacts in fMRI

important update: Thanks to commenter “DS”, I discovered that my respiration-related data was strongly contaminated due to mechanical error. The belt we used is very susceptible to becoming uncalibrated, for example if the subject moves or breathes very deeply. When looking at the raw timecourse of respiration I could see that many subjects, including the one displayed here, show a great deal of “clipping” in the timeseries. For the final analysis I will not use the respiration regressors, but rather just the pulse and motion. Thanks DS!

As I’m working my way through my latest fMRI analysis, I thought it might be fun to share a little bit of that here. Right now I’m coding up a batch pipeline for data from my Varela-award project, in which we compared “adept” meditation practitioners with motivation, IQ, age, and gender-matched controls on a response-inhibition and error-monitoring task. One thing that came up in the project proposal meeting was a worry that, since meditation practitioners spend so much time working with the breath, they might breathe differently either at rest or during the task. As I’ve written about before, respiration and other related physiological variables, such as cardiac-pulsation induced motion, can seriously impact your fMRI results (when your heart beats, the vessels in your brain pulsate, creating slight but consistent and troublesome MR artifacts). As you might expect, these artifacts tend to be worse around the main draining veins of the brain, several of which cluster around the frontoinsular and medial-prefrontal/anterior cingulate cortices. As these regions are important for response-inhibition and are frequently reported in the meditation literature (without physiological controls), we wanted to control for these variables in our study.

disclaimer: i’m still learning about noise modelling, so apologies if I mess up the theory/explanation of the techniques used! I’ve left things a bit vague for that reason. See bottom of article for references for further reading. To encourage myself to post more of these “open-lab notes” posts, I’ve kept the style here very informal, so apologies for typos or snafus. 😀

To measure these signals, we used the respiration belt and pulse monitor that come standard with most modern MRI machines. The belt is just a little elastic hose that you strap around the chest wall of the subject, where it records expansions and contractions of the chest to give a time series corresponding to respiration; the pulse monitor is a standard finger clip. Although I am not an expert on physiological noise modelling, I will do my best to explain the basic effects you want to model out of your data. These “non-white” noise signals include pulsation- and respiration-induced motion (when you breathe, you tend to nod your head just slightly along the z-axis), typical motion artifacts, and variability of pulsation and respiration. To do this I fed my physiological parameters into an in-house function written by Torben Lund, which incorporates a RETROICOR transformation of the pulsation and respiration timeseries (a sketch of the basic idea follows the figure below). We don’t just use the raw timeseries, due to signal aliasing- the physio data need to be shifted so that each physiological event corresponds to a TR. The function also calculates the respiration volume per time (RVT), a measure developed by Rasmus Birn, to model the variability in physiological parameters [1]. Variability in respiration and pulse volume (if one group of subjects tends to inhale sharply for some conditions but not others, for example) is more likely to drive BOLD artifacts than absolute respiratory volume or frequency. Finally, as is standard, I included the realignment parameters to model subject motion-related artifacts. Here is a shot of my monster design matrix for one subject:

[Figure: the full design matrix for one subject]

You can see that the first 7 columns model my conditions (correct stops, unaware errors, aware errors, false alarms, and some self-report ratings), the next 20 model the RETROICOR-transformed pulse and respiration timeseries, 41 columns model RVT, 6 the realignment parameters, and finally my session offsets and constant. It’s a big design matrix, but since we have over 1000 degrees of freedom, I’m not too worried about the extra regressors in terms of loss of power. What would be worrisome is if, for example, stop activity correlated strongly with any of the nuisance variables – we can see from the orthogonality plot that, in this subject at least, that is not the case.
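If you want to check this numerically rather than squinting at SPM’s orthogonality plot, a rough equivalent is just the pairwise correlation between design columns. A quick, hypothetical numpy helper (not SPM code; the 0.3 threshold is arbitrary):

```python
import numpy as np

def flag_collinear_columns(X, names, r_thresh=0.3):
    """Print design-matrix column pairs sharing suspicious variance.

    X: (n_scans, n_regressors) design matrix; names: column labels.
    """
    Xc = X - X.mean(axis=0)                  # mean-center each column
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = 1.0                  # guard constant columns
    R = (Xc / norms).T @ (Xc / norms)        # correlation matrix
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(R[i, j]) > r_thresh:
                print(f"{names[i]} vs {names[j]}: r = {R[i, j]:.2f}")
    return R
```

Now let’s see if we actually have anything interesting left over after we remove all that noise: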

[Figure: SPM map of stop-related activity]

We can see that the stop-related activity seems pretty reasonable, clustering around the motor and premotor cortex, bilateral insula, and DLPFC, all canonical motor-inhibition regions (FWE cluster-corrected at p = 0.05). This is a good sign! Now what about all those physiological regressors? Are they doing anything of value, or just sucking up our power?
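Conceptually, an F-contrast over a block of regressors asks whether those columns jointly explain variance. Here is a self-contained sketch of the classical extra-sum-of-squares F-test for a single voxel’s timeseries- illustrative only, since SPM computes this voxelwise from the fitted GLM, and the function name is mine:

```python
import numpy as np
from scipy import stats

def nuisance_f_test(y, X_full, cols):
    """F-test: do the regressors in `cols` explain variance in y?

    Compares the full GLM to a reduced model with those columns removed.
    y: (n_scans,) voxel timeseries; X_full: (n_scans, n_regressors).
    """
    n, p = X_full.shape
    X_red = np.delete(X_full, cols, axis=1)

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return resid @ resid

    q = len(cols)
    F = (rss(X_red) - rss(X_full)) / q / (rss(X_full) / (n - p))
    return F, stats.f.sf(F, q, n - p)   # F statistic and p-value
```

Here is the F-contrast over the pulse regressors: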

[Figure: F-contrast over the pulse regressors]

Here we can see that the peak signal is wrapped right around the pons/upper brainstem. This makes a lot of sense- the area is full of the primary vasculature that ferries blood into and out of the brain. If I were particularly interested in getting signal from the brainstem in this project, I could use a respiration x pulse interaction regressor to better model this [2]. Penny et al. find similar results to our cardiac F-test when comparing AR(1) with higher-order AR models [6]. But since we’re really only interested in higher cortical areas, the pulse regressor should be sufficient. We can also see quite a bit of variance explained around the bilateral insula and rostral anterior cingulate. Interestingly, our stop-related activity still contained plenty of significant insula response, so we can feel better that some but not all of the signal from that region is actually functionally relevant. What about respiration?

[Figure: F-contrast over the respiration regressors]

Here we see a ton of variance explained around the occipital lobe. This makes good sense- we tend to just slightly nod our heads back and forth along the z-axis as we breathe. What we are seeing is the motion-induced artifact of that rotation, which is most severe along the back of the head and periphery of the brain. We see a similar result for the overall motion regressors, but flipped to the front:

Ignore the above- the respiration regressor is not viable due to “clipping”; see the note at the top of the post. Glad I warned everyone that this post was “in progress” 🙂 A clean respiration effect should be a bit more global, concentrated around the ventricles and blood vessels.

[Figure: F-contrast over the motion regressors]

Wow, look at all the significant activity! Someone call up Nature and let them know, motion lights up the whole brain! As we would expect, the motion regressor explains a ton of uninteresting variance, particularly around the prefrontal cortex and periphery.

I still have a ways to go on this project- obviously this is just a single subject, and the results could vary wildly. But I do think even at this point we can start to see that it is quite easy and desirable to model these effects in your data (note: we had some technical failure due to the respiration belt being a POS…). I should note that in SPM, these sources of “non-white” noise are typically handled with an autoregressive AR(1) model, which is enabled in the default settings (we’ve turned it off here). However, as there is evidence that this model performs poorly at faster TRs (which are the norm now), and that a noise-modelling approach can greatly improve SNR while removing artifacts, we are likely to get better performance out of a nuisance-regression technique as demonstrated here [4]. The next step will be to take these regressors to a second-level analysis, to examine whether the meditation group has significantly more BOLD variance explained by physiological noise than do controls. Afterwards, I will re-run the analysis without any physio parameters, to compare the results of both.

References:


1. Birn RM, Diamond JB, Smith MA, Bandettini PA. Separating respiratory-variation-related fluctuations from neuronal-activity-related fluctuations in fMRI. Neuroimage. 2006;31(4):1536-48.

2. Brooks JCW, Beckmann CF, Miller KL, Wise RG, Porro CA, Tracey I, Jenkinson M. Physiological noise modelling for spinal functional magnetic resonance imaging studies. NeuroImage, in press. doi:10.1016/j.neuroimage.2007.09.018

3. Glover GH, Li TQ, Ress D. Image-based method for retrospective correction of physiological motion effects in fMRI: RETROICOR. Magn Reson Med. 2000;44(1):162-7.

4. Lund TE, Madsen KH, Sidaros K, Luo WL, Nichols TE. Non-white noise in fMRI: does modelling have an impact? Neuroimage. 2006;29(1):54-66.

5. Wise RG, Ide K, Poulin MJ, Tracey I. Resting fluctuations in arterial carbon dioxide induce significant low frequency variations in BOLD signal. Neuroimage. 2004;21(4):1652-64.

6. Penny W, Kiebel S, Friston K. Variational Bayesian inference for fMRI time series. NeuroImage. 2003;19(3):727-41. doi:10.1016/S1053-8119(03)00071-5

PubPeer – A universal comment and review layer for scholarly papers?

Lately I’ve had a plethora of discussions with colleagues concerning the possible benefits of a reddit-like “democratic review layer”, which would index all scholarly papers and let authenticated users post reviews subject to karma. We’ve navel-gazed about various implementations, ranging from a full-out reddit clone to a wiki to a full-blown torrent tracker with rated comments and mass piracy. So you can imagine I was pleasantly surprised to see someone actually went ahead and put together a simple app to do exactly that.

[Image: PubPeer screenshot]

PubPeer states that its mission is to “create an online community that uses the publication of scientific results as an opening for fruitful discussion.” Users create accounts using an academic email address and must have at least one first-author publication to join. Once registered, any user can leave anonymous comments on any article, which are themselves subject to up/down votes and replies.

My first action was of course to search for my own name:

[Image: search results for my name]

Hmm, no comments. Let’s fix that:

[Image: my test comment on my own paper]

Hah! Peer review is easy! Just kidding- I deleted this comment after testing to see if it was possible. Ostensibly commenting on your own paper is allowed so that authors can reply to comments, but it does raise the concern that one can leave whatever ratings one likes on one’s own papers. In theory, with enough users, good comments will be quickly distinguished from bad, regardless of who makes them. In theory…

This is what an article looks like in PubPeer with a few comments:

[Image: an article page with a few comments]

Pretty simple- any paper can be found in the database and users then leave comments associated with those papers. On the one hand I really like the simplicity and usability of PubPeer. I think any endeavor along these lines must very much follow the twitter design mentality of doing one (and only one) thing really well. I also like the use of threaded comments and upvotes/downvotes, though I would like to see child comments also be subject to votes. I’m not sure whether I favor the anonymous approach the developers went for, but I can see costs and benefits to both public and anonymous comments, so I don’t have any real suggestions there.

What I found really interesting was just to see this idea in practice. While I’ve discussed it endlessly, a few previously unforeseen worries leaped out right away. After browsing a few articles it seems (somewhat unsurprisingly) that most of the comments are pretty negative and nit-picky. Considering that most early adopters of such a system are likely to be graduate students, this isn’t too surprising. There is no such thing as a perfect paper, and graduate students are often fans of the kind of boilerplate nit-picks that form the ticks and fleas of any paper. If comments add mostly doubt and negativity to papers, the whole commenting process becomes a lot of extra work for little author pay-off, since no matter what, your article is going to end up looking bad.

In a traditional review, a paper’s flaws and merits are assessed privately, and the final (if accepted) paper is generally put forth as a polished piece of research that stands on its own merits. If a system like PubPeer became popular, being highly commented would almost certainly mean having tons of nit-picky and highly negative comments attached to a manuscript. This could skew reader perceptions- highly commented PubPeer articles might well receive fewer citations regardless of their actual quality.

So that bit seems very counter-productive to me, and I am not sure of the solution. It might be something like light top-down comment moderation and a sort of “reddiquette” or user code of conduct that emphasizes fair and balanced comments (no sniping). Or perhaps my “worry” isn’t actually troubling at all. Maybe such a system would be substantially self-policing and refreshing, shifting us from an obsession with ‘perfect papers’ to an understanding that no paper (or review) should be judged on anything but its own merits. Given the popularity of pun threads on reddit, I’m not convinced the wholly democratic solution will work. Whatever the result, as with most solutions to scholarly publishing, it seems clear that if PubPeer is to add substantial value to peer review then a critical mass of active users is the crucial missing ingredient.

What do you think? I’d love to hear your thoughts in the comments.

Mindfulness and neuroplasticity – summary of my recent paper.

First, let me apologize for an overlong hiatus from blogging. I submitted my PhD thesis October 1st, and it turns out that writing two papers and a thesis in the space of about three months can seriously burn out the old muse. I’ve coaxed her back through gentle offerings of chocolate, caffeine, and a bit of videogame binging. As long as I promise not to bring her within a mile of a dissertation, I believe we’re good for at least a few posts per month.

With that taken care of, I am very happy to report the successful publication of my first fMRI paper, published last month in the Journal of Neuroscience. The paper was truly a labor of love, taking nearly 3 years and countless hours of head-scratching work to complete. In the end I am quite happy with the finished product, and I do believe my colleagues and I managed to produce a useful result for the field of mindfulness training and neuroplasticity.

note: this post ended up being quite long. if you are already familiar with mindfulness research, you may want to skip ahead!

Why mindfulness?

First, depending on what brought you here, you may already be wondering why mindfulness is an interesting subject, particularly for a cognitive neuroscientist. In light of the large gaps in our understanding of the neurobiological foundations of neuroimaging, is it really the right time to apply these complex tools to meditation? Can we really learn anything about something as potentially ambiguous as “mindfulness”? Although we have a long way to go, and these are certainly fair questions, I do believe that the study of meditation has a lot to contribute to our understanding of cognition and plasticity.

Generally speaking, when you want to investigate some cognitive phenomenon, a firm understanding of your target is essential to successful neuroimaging. Areas with years of behavioral research and concrete theoretical models make for excellent imaging subjects, as in these cases a researcher can hope to fall back on a sort of ‘ground truth’ to guide them through the neural data, which are notoriously ambiguous and difficult to interpret. Of course well-travelled roads also have their disadvantages, sometimes providing a misleading sense of security, or at least being a bit dry. While mindfulness research still has a ways to go, our understanding of these practices is rapidly evolving.

At this point it helps to stop and ask: what is meditation (and by extension, mindfulness)? The first thing to clarify is that there is no single thing called “meditation”- rather, the word describes a family resemblance among highly varied practices, spanning both the spiritual and the secular. Meditation or “contemplative” practices have existed for more than a thousand years and are found in nearly every spiritual tradition. More recently, here in the west our unending fascination with the esoteric has led to a popular rise in Yoga, Tai Chi, and other physically oriented contemplative practices, all of which incorporate an element of meditation.

At the simplest level of description, [mindfulness] meditation is just a process of becoming aware, whether through actual sitting meditation, exercise, or daily rituals. Meditation (as a practice) was first popularized in the west during the rise of transcendental meditation (TM). As you can see in the figure below, interest in TM led to an early boom in research articles. This boom was not to last, as it was gradually realized that much of this initially promising research was actually the product of zealous insiders, conducted with poor controls and in some cases outright data fabrication. As TM became known as a cult, meditation research underwent a dark age in which publishing on the topic could seriously damage a research career. We can also see that around the 1990s this trend started to reverse, as a new generation of researchers began investigating “mindfulness” meditation.

[Figure: PubMed publications on meditation, by year]
Sidenote: research everywhere is expanding. Shouldn’t we start controlling these highly popular “pubs over time” figures for total publishing volume? =)
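For what it’s worth, the normalization I have in mind is trivial- divide the topic’s yearly count by PubMed’s total yearly output. A toy sketch, with all counts invented purely for illustration:

```python
# Toy normalization of a "pubs over time" curve by total publishing volume.
# All counts below are made up for illustration.
topic_counts = {1975: 50, 1990: 20, 2005: 150, 2012: 400}
total_counts = {1975: 250_000, 1990: 400_000, 2005: 700_000, 2012: 1_100_000}

for year in sorted(topic_counts):
    rate = 100_000 * topic_counts[year] / total_counts[year]
    print(f"{year}: {rate:.1f} meditation papers per 100k publications")
```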

It’s easy to see from the above why, when Jon Kabat-Zinn re-introduced meditation to the West, he relied heavily on the medical community to develop a totally secularized, intervention-oriented version of meditation strategically called “mindfulness-based stress reduction” (MBSR). The arrival of MBSR was closely related to the development of mindfulness-based cognitive therapy (MBCT), a revision of cognitive-behavioral therapy utilizing mindful practices and instruction for a variety of clinical applications. Mindfulness practice is typically described as involving at least two components: focused attention (FA) and open monitoring (OM). FA can be described as simply noticing when attention wanders from a target (the breath, the body, or a flower, for example) and gently redirecting it back to that target. OM is typically (but not always) trained at a later stage, building on the attentional skills developed in FA practice to gradually develop a sense of “non-judgmental open awareness”. While a great deal of work remains to be done, initial cognitive-behavioral and clinical research on mindfulness training (MT) has shown that these practices can improve the allocation of attentional resources, reduce physiological stress, and improve emotional well-being. In the clinic, MT appears to improve symptoms in a variety of pathologies including anxiety and depression, at least as well as standard CBT or pharmacological treatments.

Has the quality of research on meditation improved since the dark days of TM? When answering this question it is important to note two things about the state of current mindfulness research. First, while it is true that many who research MT are also practitioners, the primary scholars are researchers who started in classical areas (emotion, clinical psychiatry, cognitive neuroscience) and gradually became involved in MT research. Further, most funding today for MT research comes not from shady religious institutions, but from well-established funding bodies such as the National Institute of Health and European Research Council. It is of course important to be aware of the impact prior beliefs can have on conducting impartial research, but with respect to today’s meditation and mindfulness researchers, I believe that most if not all of the work being done is honest, quality research.

However, it is true that much of the early MT research is flawed on several levels. Indeed, several meta-analyses have concluded that, generally speaking, studies of MT have often utilized poor design – in one major review only 8/22 studies met criteria for meta-analysis. The reason for this is quite simple- in the absence of pilot data, investigators had to begin somewhere. It typically doesn’t bode well to jump into unexplored territory with an expensive, large-sample, fully randomized design. There just isn’t enough to go off of- how would you know which kind of process to even measure? Accordingly, the large majority of mindfulness research to date has utilized small-scale, often sub-optimal experimental designs, sacrificing experimental control in order to build a basic idea of the cognitive landscape. While this exploratory research provides a needed foundation for generating likely hypotheses, it is also difficult to make any strong conclusions so long as methodological issues remain.

Indeed, most of what we know about mindfulness and neuroplasticity comes from studies of either advanced practitioners (compared to controls) or “wait-list” control studies in which controls receive no intervention. On the basis of the findings from these studies, we had some idea how to target our investigation, but there remained a nagging feeling of uncertainty. Just how much of the literature would actually replicate? Does mindfulness alter attention through mere expectation and motivation biases (i.e. placebo-like confounds), or can MT actually drive functionally relevant attentional and emotional neuroplasticity, even when controlling for these confounds?

The name of the game is active-control

Research to date links mindfulness practices to alterations in health and physiology, cognitive control, emotional regulation, responsiveness to pain, and a large array of positive clinical outcomes. However, the explicit nature of mindfulness training makes for some particularly difficult methodological issues. Group cross-sectional studies, where advanced practitioners are compared to age-matched controls, cannot provide causal evidence. Indeed, it is always possible that having a big fancy brain makes you more likely to spend many years meditating, and not that meditating gives you a big fancy brain. So training studies are essential to verifying the claim that mindfulness actually leads to interesting kinds of plasticity. However, unlike with a new drug study or computerized intervention, you cannot simply provide a sugar pill to the control group. Double-blind design is impossible; by definition subjects will know they are receiving mindfulness. To actually assess the impact of MT on neural activity and behavior, we need to compare to groups doing relatively equivalent things in similar experimental contexts. We need an active control.

There is already a well-established link between measurement outcome and experimental demands. What is perhaps less appreciated is that cognitive measures, particularly reaction time, are easily biased by phenomena like the Hawthorne effect, where the amount of attention participants receive directly contributes to experimental outcome. Wait-lists simply cannot overcome these difficulties. We know, for example, that simply paying controls a moderate performance-based financial reward can erase attentional reaction-time differences. If you are repeatedly told you’re training attention, then come experiment time you are likely to expect this to be true and try harder than someone who has received no such instruction. The same is true of emotional tasks; subjects told frequently that they are training compassion are likely to spend more time fixating on emotional stimuli, leading to inflated self-reports and responses.

I’m sure you can quickly see how important it is to control for these factors if we are to isolate and understand the mechanisms important for mindfulness training. One key solution is active control- that is, providing both groups (MT and control) with a “treatment” that is at least nominally as efficacious as the thing you are interested in. Active control allows you to exclude numerous factors from your outcome, potentially including the role of social support, expectation, and experimental demands. This is exactly what we set out to do in our study, where we recruited 60 meditation-naïve subjects, scanned them on an fMRI task, randomized them to either six weeks of MT or active control, and then measured everything again. Further, to exclude confounds relating to social interaction, we came up with a rather unique control activity- reading Emma together.

Jane Austen as Active Control – theory of mind vs interoception

To overcome these confounds, we constructed a specialized control intervention. As it was crucial that both groups believed in their training, we needed an instructor who could match the high level of enthusiasm and experience found in our meditation instructors. We were lucky to have the help of local scholar Mette Stineberg, who suggested a customized “shared reading” group to fit our purposes. Reading groups are a fun, attention demanding exercise, with purported benefits for stress and well-being. While these claims have not been explicitly tested, what mattered most was that Mette clearly believed in their efficacy- making for a perfect control instructor. Mette holds a PhD in literature, and we knew that her 10 years of experience participating in and leading these groups would help us to exclude instructor variables from our results.

With her help, we constructed a special condition in which participants completed group readings of Jane Austen’s Emma. A sensible question to ask at this point is: “why Emma?” An essential element of active control is variable isolation, or balancing your groups in such a way that, with the exception of your hypothesized “active ingredient”, the two interventions are extremely similar. As MT is thought to depend on a particular kind of non-judgmental, interoceptive attention, Chris and Uta Frith suggested during an early meeting that Emma might be a perfect contrast. For those of you who haven’t read the novel, the plot is brimming over with judgment-heavy, theory-of-mind-type exposition. Mette further helped to ensure a contrast with MT by emphasizing discussion sessions focused on character motives. In this way we were able to ensure that both groups met for the same amount of time each week, with equivalently talented and passionate instructors, and felt that they were working towards something worthwhile. Finally, we made sure to let every participant know at recruitment that they would receive one of two treatments intended to improve attention and well-being, and that any benefits would depend upon their commitment to the practice. To help them practice at home, we created 20-minute CDs for both groups, one with a guided meditation and the other with a chapter from Emma.

Unlike previous active-controlled studies, which typically rely on relaxation training, reading groups depend upon a high level of social interaction. Reading together allowed us to exclude not only treatment context and expectation from our results, but also the more difficult effects of social support (the “making new friends” variable). To measure this, we built a small website for participants to make daily reports of their motivation and minutes practiced that day. As you can see in the figure below, when we averaged these reports we found not only that the reading group practiced significantly more than those in MT, but that they expressed equivalent levels of motivation to practice. Anecdotally, reading-group members expressed a high level of satisfaction with their class, with a sub-group of about 8 even continuing their meetings after our study concluded. The meditation group, by comparison, did not appear to form any lasting social relationships and did not continue meeting after the study. We were very happy with these results, which suggest that it is very unlikely our findings could be explained by unbalanced motivation or expectation.

Impact of MT on attention and emotion

After we established that the active control was successful, the first thing to look at was some of our outside-the-scanner behavioral results. As we were interested in the effect of meditation on both attention and meta-cognition, we used an “error-awareness task” (EAT) to examine improvement in these areas. The EAT (shown below) is a typical go/no-go task in which subjects spend most of their time pressing a button. The difficult part comes whenever a “stop trial” occurs and the subject must quickly halt their response. In the case where the subject fails to stop, they then have the opportunity to “fix” the error by pressing a second button on the trial following the error. If you’ve ever taken this kind of task, you know that it can be frustratingly difficult to stop your finger in time – the response becomes quite habitual. Using the EAT we examined the impact of MT both on controlling responses (a variable called “stop accuracy”) and on meta-cognitive self-monitoring (percent “error-awareness”).

The error-awareness task
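For the curious, the two measures reduce to simple proportions over the trial log. A toy sketch (the trial format here is hypothetical, not our actual task code):

```python
# Hypothetical per-trial EAT records: was it a stop trial, did the subject
# respond anyway (an error), and did they press the "fix" button afterwards?
trials = [
    {"stop": True,  "responded": False, "fixed": False},   # successful stop
    {"stop": True,  "responded": True,  "fixed": True},    # aware error
    {"stop": True,  "responded": True,  "fixed": False},   # unaware error
    {"stop": False, "responded": True,  "fixed": False},   # ordinary go trial
]

stops = [t for t in trials if t["stop"]]
errors = [t for t in stops if t["responded"]]

stop_accuracy = 1 - len(errors) / len(stops)                  # stops withheld
error_awareness = sum(t["fixed"] for t in errors) / len(errors)
print(f"stop accuracy: {stop_accuracy:.2f}, error awareness: {error_awareness:.2f}")
```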

We started by looking for significant group-by-time interactions on stop accuracy and error-awareness, which would indicate that the change in a measure was statistically greater in the treatment (MT) group than in the control group. In a repeated-measures design, this type of interaction is your first indication that the treatment may have had a greater effect than the control. When we looked at the data, it was immediately clear that while both groups improved over time (a ‘main effect’ of time), there was no interaction to be found:

Group x time analysis of SA and EA.

While it is likely that much of the increase over time can be explained by test-retest effects (i.e. simply taking the test twice), we wanted to see if any of this variance might be explained by something specific to meditation. To do this we entered stop accuracy and error-awareness into a linear model comparing, between groups, the slope relating each subject’s practice to the EAT measures. Here we saw that practice predicted stop accuracy improvement only in the meditation group, and that this relationship was statistically greater than in the reading group:

Practice vs Stop accuracy (MT only shown). We did of course test our interaction, see paper for GLM goodness =)
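In regression terms this is just a practice-by-group interaction on the change scores. A minimal statsmodels sketch with placeholder data (column names and values are invented; see the paper for the actual GLM):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical summary data: one row per subject, with group, total minutes
# of home practice, and pre-to-post change in stop accuracy.
df = pd.DataFrame({
    "group":        ["MT"] * 3 + ["reading"] * 3,
    "practice_min": [300, 450, 600, 350, 500, 650],
    "sa_change":    [0.05, 0.09, 0.12, 0.02, 0.01, 0.03],
})

# The practice_min:group term tests whether the practice -> improvement
# slope differs between meditators and readers.
fit = smf.ols("sa_change ~ practice_min * group", data=df).fit()
print(fit.summary())
```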

These results led us to conclude that while we did not observe a treatment effect of MT on the error-awareness task, the strong time effects and the MT-only correlation with practice suggested that the improvement in the meditation group may relate to the “active ingredients” of MT, whereas the improvement in the reading group may reflect motivation-driven artifacts. Sadly we cannot conclude this firmly- we’d have needed a third, passive control group for comparison. Thankfully this was pointed out to us by a kind reviewer, who noted that this argument is kind of like having one’s cake and eating it too, so we’ll restrict ourselves to arguing that the EAT finding serves as a nice validation of the active control- both groups improved on something- and as a potential indicator of a stop-related treatment mechanism.

While the EAT served as a behavioral measure of basic cognitive processes, we also wanted to examine the neural correlates of attention and emotion, to see how they might respond to mindfulness training in our intervention. For this we partnered with Karina Blair at the National Institute of Mental Health to bring the Affective Stroop task (shown below) to Denmark.

Affective Stroop Trial Scheme

The Affective Stroop Task (AST) relies on a basic “number-counting Stroop” to investigate the neural correlates of attention, emotion, and their interaction. The instruction is simply: count the number of numbers in the first display, count the number of numbers in the second display, and decide which display had more numbers. As you can see in the trial example above, conflict in the task (trial type “C”) is driven by incongruence between the Arabic numeral and the numerosity of the display (e.g. a display of five “4”s). Meanwhile, each trial includes negative or neutral emotional stimuli selected from the International Affective Picture System. Using the AST, we were able to examine the neural correlates of executive attention by contrasting task (B + C > A) and emotion (negative > neutral) trials.

Since we were especially interested in changes over time, we expanded on these contrasts to examine increased or decreased neural response between the first and last scans of the study. To do this we relied on two levels of analysis (standard in imaging): at the “first” or subject level we examined differences between the two time points for each condition (task and emotion) within each subject. We then compared these time-related effects (contrast images) between groups using a two-sample t-test with total minutes of practice as a covariate. To assess the impact of meditation on performing the AST, we examined reaction times in a model with factors group, time, task, and emotion. In this way we were able to examine the impact of MT on neural activity and behavior while controlling for the kinds of artifacts discussed in the previous section.
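The second-level logic boils down to an ordinary regression on the per-subject contrast changes. A synthetic sketch of the model structure (all data and effect sizes below are invented, purely for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 30
group = np.repeat([0, 1], n // 2)            # 0 = reading, 1 = MT
practice = rng.normal(500, 100, n)           # minutes of home practice
# per-subject "time2 - time1" task contrast value (e.g. a DLPFC beta change)
delta_beta = 0.2 * group + 0.0004 * practice + rng.normal(0, 0.5, n)

# two-sample comparison with practice as covariate
X = sm.add_constant(np.column_stack([group, practice]))
fit = sm.OLS(delta_beta, X).fit()
print(fit.t_test([0, 1, 0]))                 # group effect, controlling practice
```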

Our analysis revealed three primary findings. First, the reaction-time analysis revealed a significant effect of MT on Stroop conflict, the difference between reaction times to incongruent versus congruent trials. Further, we did not observe any effect on emotion-related RTs- although both groups sped up significantly on negative trials versus neutral (a time effect), this increase was equivalent in both groups. Below you can see the Stroop-conflict-related RTs:

Stroop conflict result

This became particularly interesting when we examined the neural response to these conditions, where we again observed a pattern of overall BOLD-signal increases in the dorsolateral prefrontal cortex to task performance (below):

DLPFC increase to task

Interestingly, we did not observe significant overall increases to emotional stimuli- just being in the MT group didn’t seem to be enough to change emotional processing. However, when we examined whole-brain correlations between amount of practice and increased BOLD response to negative emotion, we found a striking pattern of fronto-insular BOLD increases to negative images, similar to patterns seen in previous studies of compassion and mindfulness practice:

Greater association of prefrontal-insular response to negative emotion and practice.

When we put all this together, a pattern began to emerge. Overall it seemed like MT had a relatively clear impact on attention and cognitive control. Practice-correlated increases on EAT stop accuracy, reduced Affective Stroop conflict, and increases in dorsolateral prefrontal cortex responses to task all point towards plasticity at the level of executive function. In contrast our emotion-related findings suggest that alterations in affective processing occurred only in MT participants with the most practice. Given how little we know about the training trajectories of cognitive vs affective skills, we felt that this was a very interesting result.

Conclusion: the more you do, the more you get?

For us, the first conclusion from all this was that when you control for motivation and a host of other confounds, brief MT appears primarily to train attention-related processes. Secondly, alterations in affective processing seemed to require more practice to emerge. This is interesting both for understanding the neuroscience of training and for the effective application of MT in clinical settings. While a great deal of future research is needed, it is possible that the affective system is generally more resilient to intervention than attention. It may be that altering affective processing depends upon, and builds from, increasing control over executive function. Previous research suggests that attention is largely flexible, amenable to a variety of training regimens of which MT is only one beneficial intervention. However, we are also becoming increasingly aware that training attention alone does not seem to transfer directly into even closely related benefits.

As we begin to realize that many societal and health problems cannot be solved through medication or attention-training alone, it becomes clear that techniques to increase emotional function and well-being are crucial for future development. I am reminded of a quote overheard at the Mind & Life Summer Research Institute and attributed to the Dalai Lama. Supposedly, when asked about the goal of developing meditation programs in the west, HHDL replied that what was truly needed in the West was not “cognitive training, as (those in the west) are already too clever. What is needed rather is emotion training, to cultivate a sense of responsibility and compassion”. When we consider falling rates of empathy in medical practitioners and their link to health outcomes, I think we do need to explore the role of emotional and embodied skills in supporting a wide array of functions in cognition and well-being. While emotional development is likely to depend upon executive function, given all the recent failures to show transfer from trained domains to even closely related ones, I suspect we need to begin including affective processes in our understanding of optimal learning. If these differences hold, it may be important to reassess our interventions (mindful and otherwise), developing training programs that are customized in terms of the intensity, duration, and content appropriate for any given context.

Of course, rather than end on such an inspiring note, I should point out that like any study, ours is not without flaws (you’ll have to read the paper to find out how many 😉 ) and is really just an initial step. We made significant progress in replicating common neural and behavioral effects of MT while controlling for important confounds, but in retrospect the study could have been strengthened by including measures that would better distinguish the precise mechanisms, for example a measure of body awareness or empathy. Another element that struck me was how much I wish we’d had a passive control group, which could have helped flesh out how much of our time effect was instrument reliability versus motivation. As far as I am concerned, the study was a success and I am happy to have done my part to push mindfulness research towards methodological clarity and rigor. In the future I know others will continue this trend and investigate exactly what sorts of practice are needed to alter brain and behavior, and just how these benefits are accomplished.

In the near-future, I plan to give mindfulness research a rest. Not that I don’t find it fascinating or worthwhile, but rather because during the course of my PhD I’ve become a bit obsessed with interoception and meta-cognition. At present, it looks like I’ll be spending my first post-doc applying predictive coding and dynamic causal modeling to these processes. With a little luck, I might be able to build a theoretical model that could one day provide novel targets for future intervention!

Link to paper:

Cognitive-Affective Neural Plasticity following Active-Controlled Mindfulness Intervention

Thanks to all the collaborators and colleagues who made this study possible.

Special thanks to Kate Mills (@le_feufollet) for proofing this post 🙂