
A Needle in the Connectome: Neural ‘Fingerprint’ Identifies Individuals with ~93% accuracy

Much like we picture ourselves, we tend to assume that each individual brain is a bit of a unique snowflake. When running a brain imaging experiment it is common for participants or students to excitedly ask what can be revealed specifically about them given their data. Usually, we have to give a disappointing answer – not all that much, as neuroscientists typically throw this information away to get at average activation profiles set in ‘standard’ space. Now a new study published today in Nature Neuroscience suggests that our brains do indeed contain a kind of person-specific fingerprint, hidden within the functional connectome. Perhaps even more interesting, the study suggests that particular neural networks (e.g. frontoparietal and default mode) contribute the greatest amount of unique information to your ‘neuro-profile’ and also predict individual differences in fluid intelligence.

To do so, lead author Emily Finn and colleagues at Yale University analysed repeated sets of functional magnetic resonance imaging (fMRI) data from 126 subjects over 6 different sessions (2 rest, 4 task), derived from the Human Connectome Project. After dividing each participant’s brain data into 268 nodes (a technique known as “parcellation”), Emily and colleagues constructed matrices of the pairwise correlations between all nodes. These correlation matrices (below, figure 1b), which encode the connectome or connectivity map for each participant, were then used in a permutation-based decoding procedure to determine how accurately a participant’s connectivity pattern could be identified from the rest. This involved taking a vector of edge values (connection strengths) from a participant in the training set and correlating it with a similar vector sampled randomly with replacement from the test set (i.e. testing whether one participant’s data correlated with another’s). Pairs with the highest correlation were then labelled “1” to indicate that the algorithm assigned a matching identity to a particular train-test pair. The results of this process were then compared to a similar one in which both pairs and subject identity were randomly permuted.
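For intuition, the core matching step can be sketched in a few lines of Python. Everything here is invented toy data at toy sizes (the study used 126 subjects and 35,778 edges), and this is a simplified sketch of the idea rather than the authors' actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_edges = 6, 120  # toy sizes for illustration only

# Simulate stable individual connectomes: each subject's session-2 edge vector
# is a noisy copy of their session-1 vector (hypothetical data).
base = rng.standard_normal((n_subjects, n_edges))
session1 = base + 0.3 * rng.standard_normal((n_subjects, n_edges))
session2 = base + 0.3 * rng.standard_normal((n_subjects, n_edges))

def identify(database, targets):
    """For each target connectome, predict the database subject whose
    edge vector correlates most strongly with it."""
    predictions = []
    for target in targets:
        r = [np.corrcoef(target, db)[0, 1] for db in database]
        predictions.append(int(np.argmax(r)))
    return predictions

predicted = identify(session1, session2)
accuracy = np.mean([p == i for i, p in enumerate(predicted)])
```

Because the within-subject correlation dwarfs the between-subject correlation, the argmax recovers identity almost every time, which is the intuition behind the fingerprinting result.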

Finn et al’s method for identifying subjects from their connectomes.

At first glance, the results are impressive:

Identification was performed using the whole-brain connectivity matrix (268 nodes; 35,778 edges), with no a priori network definitions. The success rate was 117/126 (92.9%) and 119/126 (94.4%) based on a target-database of Rest1-Rest2 and the reverse Rest2-Rest1, respectively. The success rate ranged from 68/126 (54.0%) to 110/126 (87.3%) with other database and target pairs, including rest-to-task and task-to-task comparisons.

This is a striking result – not only could identity be decoded from one resting state scan to another, but the identification also worked when going from rest to a variety of tasks and vice versa. Although classification accuracy dropped when moving between different tasks, these results were still highly significant when compared to the random shuffle, which only achieved a 5% success rate. Overall this suggests that inter-individual patterns in connectivity are highly reproducible regardless of the context from which they are obtained.

The authors then go on to perform a variety of crucial control analyses. For example, one immediate worry is that the high identification accuracy might be driven by head motion, which strongly influences functional connectivity and is likely to show strong within-subject correlation. Another concern is that the accuracy is driven primarily by anatomical rather than functional features. The authors test both of these alternative hypotheses, first by applying the same decoding approach to an expanded set of root-mean-square motion parameters, and second by testing whether classification accuracy decreased as the data were increasingly smoothed (which should eliminate or reduce the contribution of anatomical features). Here the results were also encouraging: motion was totally unable to predict identity, resulting in less than 5% accuracy, and classification accuracy remained essentially the same across smoothing kernels. The authors further compared their parcellation scheme to the more common, coarser-grained Yeo 8-network solution. This revealed that the coarser network division decreased accuracy, particularly for the fronto-parietal network, a decrease that was seemingly driven by increased reliability of the diagonal elements of the inter-subject matrix (which encode the intra-subject correlation). The authors suggest this may reflect the need for higher spatial precision to delineate individual patterns of fronto-parietal connectivity. Although this interpretation seems sensible, I do have to wonder whether it conflicts with their smoothing-based control analysis. The authors also looked at how well they could identify an individual from the variability of the BOLD signal in each region and found that, although this was also significant, accuracy was systematically lower than with the connectomic approach.
This suggests that although at least some of what makes an individual unique can be found in activity alone, connectivity data are needed for a more complete fingerprint. In a final control analysis (figure 2c below), training simultaneously on multiple datasets (for example a resting state and a task, to control for inherent differences in signal length) further increased accuracy, to as high as 100% in some cases.

Finn et al; networks showing most and least individuality and contributing factors. Interesting to note that sensory areas are highly common across subjects whereas fronto-parietal and mid-line show the greatest individuality!

Having established the robustness of their connectome fingerprints, Finn and colleagues then examined how much each individual cortical node contributed to identification accuracy. This analysis revealed a particularly interesting result: fronto-parietal and midline (‘default mode’) networks showed the highest contribution (above, figure 2a), whereas sensory areas appeared not to contribute at all. This complements their finding that the coarser-grained Yeo parcellation greatly reduced the contribution of these networks to classification accuracy. Further still, Finn and colleagues linked the contributions of these networks to behaviour, examining how strongly each network fingerprint predicted an overall index of fluid intelligence (g-factor). Again they found that fronto-parietal and default mode nodes were the most predictive of inter-individual differences in behaviour (in opposite directions, although I’d hesitate to interpret the sign of this finding given the global signal regression).
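The edge-to-behaviour logic can be sketched crudely as follows, on invented toy data. Note that Finn et al. used a leave-one-out, cross-validated version of this idea, so the in-sample fit below is optimistic by construction:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_subjects, n_edges = 50, 200

# Hypothetical data: one edge genuinely tracks the behavioural score.
edges = rng.standard_normal((n_subjects, n_edges))
g_factor = 0.8 * edges[:, 0] + rng.standard_normal(n_subjects)

# Correlate every edge with behaviour, select suprathreshold edges, then
# summarise each subject by the signed sum of their selected edge strengths.
r_per_edge = np.array([pearsonr(edges[:, j], g_factor)[0]
                       for j in range(n_edges)])
selected = np.abs(r_per_edge) > 0.3
summary = (edges * np.sign(r_per_edge))[:, selected].sum(axis=1)
r_summary = pearsonr(summary, g_factor)[0]
```

Without held-out subjects, the feature selection step guarantees some spuriously selected edges, which is exactly why the cross-validated version matters.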

So what does this all mean? For starters, this is a powerful demonstration of the rich individual information that can be gleaned from combining connectome analyses with high-volume data collection. The authors not only showed that resting state networks are highly stable and individual within subjects, but that these signatures can be used to delineate the way the brain responds to tasks and even behaviour. Not only is the study well powered, but the authors clearly worked hard to generalize their results across a variety of datasets while controlling for quite a few important confounds. While previous studies have reported similar findings in structural and functional data, I’m not aware of any this generalisable or specific. The task-rest signature alone confirms that both measures reflect a common neural architecture, an important finding. I could be a little concerned about other vascular or breath-related confounds; the authors do remove such nuisance variables, so this may not be a serious concern (though I am not convinced their use of global signal regression to control these variables is adequate). These minor concerns notwithstanding, I found the network-specific results particularly interesting; although previous studies indicate that functional and structural heterogeneity greatly increases along the fronto-parietal axis, this study is the first demonstration to my knowledge of the extremely high predictive power embedded within those differences. It is interesting to wonder how much of this stability is important for the higher-order functions supported by these networks – indeed it seems intuitive that self-awareness, social cognition, and cognitive control depend upon acquired experiences that are highly individual.
The authors conclude by suggesting that future studies may evaluate classification accuracy within an individual over many time points, raising the interesting question: Can you identify who I am tomorrow by how my brain connects today? Or am I “here today, gone tomorrow”?

Only time (and connectomics) may tell…


 

edit:

thanks to Kate Mills for pointing out this interesting PLOS ONE paper from a year ago (cited by Finn et al), which used similar methods and also found high classification accuracy, albeit with a smaller sample and fewer controls:

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0111048

 

edit2:

It seems there was a slight mistake in my understanding of the methods – see this useful comment by lead author Emily Finn for clarification:

http://neuroconscience.com/2015/10/12/a-needle-in-the-connectome-neural-fingerprint-identifies-individuals-with-93-accuracy/#comment-36506


corrections? comments? want to yell at me for being dumb? Let me know in the comments or on twitter @neuroconscience!


Are we watching a paradigm shift? 7 hot trends in cognitive neuroscience according to me


In the spirit of procrastination, here is a random list I made up of things that seem to be trending in cognitive neuroscience right now, with a quick description of each. These are purely pulled from the depths of speculation, so please do feel free to disagree. Most of these are not actually new concepts; it’s more about the way they are being used that makes them trendy areas.



Oscillations

Obviously oscillations have been around for a long time, but the rapid increase of technological sophistication for direct recordings (see for example high density cortical arrays and deep brain stimulation + recording) coupled with greater availability of MEG (plus rapid advance in MEG source reconstruction and analysis techniques) have placed large-scale neural oscillations at the forefront of cognitive neuroscience. Understanding how different frequency bands interact (e.g. phase coupling) has become a core topic of research in areas ranging from conscious awareness to memory and navigation.
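As a toy illustration of phase coupling, here is a minimal phase-locking value computation on two simulated 10 Hz signals. The data are invented, and real pipelines band-pass filter and scan across frequency bands before extracting phase:

```python
import numpy as np
from scipy.signal import hilbert

fs = 500.0                         # sampling rate in Hz (arbitrary choice)
t = np.arange(0, 2.0, 1.0 / fs)
rng = np.random.default_rng(2)

# Two toy 10 Hz signals with a fixed 45-degree phase offset plus noise.
sig_a = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.standard_normal(t.size)
sig_b = np.sin(2 * np.pi * 10 * t + np.pi / 4) + 0.3 * rng.standard_normal(t.size)

def phase_locking_value(x, y):
    """Magnitude of the mean phase-difference vector (1 = perfect locking),
    with instantaneous phase taken from the analytic (Hilbert) signal."""
    dphi = np.angle(hilbert(x)) - np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * dphi)))

plv = phase_locking_value(sig_a, sig_b)
```

Because the phase offset between the two signals is constant, the phase-difference vectors line up and the PLV stays near 1; for unrelated signals the vectors cancel and it falls toward 0.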

Complex systems, dynamics, and emergence

Again, a concept as old as neuroscience itself, but this one seems to be piggy-backing on several trends towards a new resurgence. As neuroscience grows bored of blobology, and our analysis methods move increasingly towards modelling dynamical interactions (see above) and complex networks, our explanatory metaphors more frequently emphasize brain dynamics and emergent causation. This is a clear departure from the boxological approach that was so prevalent in the 80’s and 90’s.

Direct intervention and causal inference

Pseudo-invasive techniques like transcranial direct-current stimulation are on the rise, partially because they allow us to perform virtual lesion studies in ways not previously possible. Likewise, exponential growth of neurobiological and genetic techniques has ushered in the era of optogenetics, which allows direct manipulation of information processing at a single neuron level. Might this trend also reflect increased dissatisfaction with the correlational approaches that defined the last decade? You could also include steadily increasing interest in pharmacological neuroimaging under this category.

Computational modelling and reinforcement learning

With the hype surrounding Google’s £200 million acquisition of DeepMind, and the recent Nobel Prize award for the discovery of grid cells, computational approaches to neuroscience are hotter than ever. Hardly a day goes by without a reinforcement learning or similar paper being published in a glossy high-impact journal. This one takes many forms, but it is undeniable that model-based approaches to cognitive neuroscience are all the rage. There is also a clear surge of interest in the Bayesian Brain approach, which could almost have its own bullet point. But that would be too self-serving ;)
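For the uninitiated, the flavour of model most often fit in these papers can be boiled down to a few lines. Here is a minimal Rescorla-Wagner-style update (a generic sketch, not any specific paper’s model): the value estimate moves toward each observed reward by a fraction alpha of the prediction error.

```python
def rescorla_wagner(rewards, alpha=0.1, v0=0.0):
    """Track a value estimate across trials via delta-rule updates."""
    v = v0
    trajectory = []
    for r in rewards:
        prediction_error = r - v   # the quantity often linked to dopamine
        v += alpha * prediction_error
        trajectory.append(v)
    return trajectory

# 50 trials of constant reward: the estimate converges toward 1.0
values = rescorla_wagner([1.0] * 50, alpha=0.2)
```

Model-based analyses regress quantities like `prediction_error` against trial-by-trial brain activity, which is where the neuroimaging comes in.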

Gain control

Gain control is a very basic mechanism found throughout the central nervous system. It can be understood as the neuromodulatory weighting of post-synaptic excitability, and is thought to play a critical role in contextualizing neural processing. Gain control might for example allow a neuron that usually encodes a positive prediction error to ‘flip’ its sign to encode negative prediction error under a certain context. Gain is thought to be regulated via the global interaction of neural modulators (e.g. dopamine, acetylcholine) and links basic information theoretic processes with neurobiology. This makes it a particularly desirable tool for understanding everything from perceptual decision making to basic learning and the stabilization of oscillatory dynamics. Gain control thus links computational, biological, and systems level work and is likely to continue to attract a lot of attention in the near future.
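The ‘weighting of post-synaptic excitability’ idea can be captured with a one-line toy model, where gain multiplicatively rescales how strongly the same input drives a response (illustrative only, not a biophysical model):

```python
import numpy as np

def firing_rate(drive, gain=1.0, threshold=0.0):
    """Sigmoid response curve; 'gain' rescales sensitivity to the same drive."""
    return 1.0 / (1.0 + np.exp(-gain * (drive - threshold)))

drive = 1.0
low_gain = firing_rate(drive, gain=0.5)    # shallow response
high_gain = firing_rate(drive, gain=2.0)   # steep response to the same input
flipped = firing_rate(drive, gain=-2.0)    # negative gain inverts the coding
```

The `flipped` case is a cartoon of the sign-flip described above: with a negative gain, the same unit that increased its output for positive drive now decreases it.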

Hierarchies that are not really hierarchies

Neuroscience loves its hierarchies. For example, the Van Essen model of how visual feature detection proceeds through a hierarchy of increasingly abstract functional processes is one of the core explanatory tools used to understand vision in the brain. Currently however there is a great deal of connectomic and functional work pointing out interesting ways in which global or feedback connections can re-route and modulate processes from the ‘top’ directly to the ‘bottom’ or vice versa. It’s worth noting this trend doesn’t do away with the old notions of hierarchies, but instead just renders them a bit more complex and circular. Put another way, it is currently quite trendy to show ‘the top is the bottom’ and ‘the bottom is the top’. This partially relates to the increased emphasis on emergence and complexity discussed above. A related trend is extension of what counts as the ‘bottom’, with low-level subcortical or even first order peripheral neurons suddenly being ascribed complex abilities typically reserved for cortical processes.

Primary sensations that are not so primary

Closely related to the previous point, there is a clear trend in the perceptual sciences of being increasingly liberal about how ‘primary’ sensory areas really are. I saw this first hand at last year’s Vision Sciences Society which featured at least a dozen posters showing how one could decode tactile shape from V1, or visual frequency from A1, and so on. Again this is probably related to the overall movement towards complexity and connectionism; as we lose our reliance on modularity, we’re suddenly open to a much more general role for core sensory areas.


Interestingly I didn’t include things like multi-modal or high resolution imaging as I think they are still actually emerging and have not quite fully arrived yet. But some of these – computational and connectomic modelling for example – are clearly part and parcel of contemporary zeitgeist. It’s also very interesting to look over this list, as there seems to be a clear trend towards complexity, connectionism, and dynamics. Are we witnessing a paradigm shift in the making? Or have we just forgotten all our first principles and started mangling any old thing we can get published? If it is a shift, what should we call it? Something like ‘computational connectionism’ comes to mind. Please feel free to add points or discuss in the comments!

Top 200 terms in cognitive neuroscience according to neurosynth

Tonight I was playing around with some of the top features in neurosynth (the searchable terms with the highest number of studies containing that term). You can find the list here; just sort by the number of studies. I excluded the top 3 terms (“image”, “response”, and “time”), which are boring and whose extremely high weights would mess up the wordle. I then created a word cloud weighted so that the size of each term reflects its number of studies.
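The weighting itself is trivial; something like this linear scaling of term size by study count (illustrative counts, not the actual neurosynth numbers):

```python
# Scale term sizes in proportion to study counts, as for a word cloud.
# Counts below are made up for illustration; the real ones come from the
# neurosynth feature list.
counts = {"memory": 2744, "motor": 2565, "visual": 3110, "reward": 922}

max_count = max(counts.values())
min_size, max_size = 10, 100   # font sizes in points (arbitrary choice)

sizes = {term: min_size + (max_size - min_size) * n / max_count
         for term, n in counts.items()}
```

The most-studied term gets the maximum size and everything else scales down proportionally, which is exactly the visual weighting the wordle shows.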

Here are the top 200 terms sized according to number times reported in neurosynth’s 5809 indexed fMRI studies:

wordle

Pretty neat! These are the 200 terms the neurosynth database has the most information on, and the list gives a pretty good overview of key concepts and topics in our field! I am sure there is something useful for everyone in there :D

Direct link to the wordle:

Wordle: neurosynth

Neurovault: a must-use tool for every neuroimaging paper!

Something that has long irked me about cognitive neuroscience is the way we share our data. I still remember the very first time I opened a brain imaging paper and was struck dumbfounded by the practice of listing activation results in endless p-value tables and selective 2D snapshots. How could anyone make sense of data this way? Now, having several years’ experience creating such papers, I am only more dumbfounded that we continue to present our data in this way. What purpose can be served by taking a beautiful 3-dimensional result and filtering it through an awkward foci ‘photoshoot’? While there are some standards you can use to improve the 2D presentation of 3D brain maps, for example showing only peak activation and including glass brains, this is an imperfect solution – ultimately the best way to assess the topology of a result is by directly examining the full 3D result.

Just imagine how improved every fMRI paper would be if, instead of a 20+ row table and selective snapshot, results were displayed in a simple 3D viewing widget right in the paper. Readers could assess the underlying effects at whatever statistical threshold they feel is most appropriate, and PDF versions could be printed at a particular coordinate and threshold specified by the author. Reviewers and readers alike could get a much fuller idea of the data, and meta-analysis would be vastly improved by the extensive uploading of well-categorized contrast images. Moreover, all this can easily be achieved without worries about privacy or intellectual property, using only group-level contrast images, which are inherently without identifying features and contain only those effects included in the published manuscript!

Now imagine my surprise when I learned that thanks to Chris Gorgolewski and colleagues, all of this is already possible! Chris pioneered the development of neurovault.org, an extremely easy to use data sharing site backed by the International Neuroinformatics Coordinating Facility. To use it, researchers simply need to create a new ‘collection’ for their study and then upload whatever images they like. Within about 15 minutes I was able to upload both the T- and contrast-images from my group level analysis, complete with as little or as much meta-data as I felt like including. Collections can be easily linked to paper DOIs and marked as in-review, published, etc. Collections and entries can be edited or added to at any time, and the facilities allow quick documentation of imaging data at any desired level, from entire raw imaging datasets to condition-specific group contrast images. Better still, neurovault seamlessly displays these images on a 3D MNI standard brain with flexible options for thresholding, and through a hookup to neurosynth.org can even seamlessly find meta-analytic feature loadings for your images! Check out the t-map display and feature loadings for the stimulus intensity contrast from my upcoming somatosensory oddball paper, which correctly identified the modality of stimulation!
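For the programmatically inclined, collections can also be queried over the site’s REST API. Here is a hedged sketch that only builds the query URL; the endpoint path and the `DOI` filter parameter are my assumptions from the public `/api/` listing, so check the current docs before relying on them, and the DOI below is a placeholder:

```python
from urllib.parse import urlencode

def collection_query_url(doi):
    """Build a NeuroVault collections query URL for a given paper DOI
    (endpoint and parameter name assumed; verify against the API docs)."""
    base = "https://neurovault.org/api/collections/"
    return base + "?" + urlencode({"DOI": doi})

url = collection_query_url("10.1000/example.doi")
# fetching would then be e.g. json.load(urllib.request.urlopen(url))
```

Keeping the URL construction separate from the network call makes the lookup easy to test and reuse.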

T-map in the neurovault viewer.
Decoded features for my contrast image, with accurate detection of stimulation modality!

Neurovault.org doesn’t yet support embedding the viewer, but it is easy to imagine that with collaboration from publishers, future versions could be embedded directly within HTML full-text for imaging papers. For now, the site provides the perfect solution for researchers looking to make their data available to others and to more fully present their results, simply by providing supplementary links either to the neurovault collection or directly to individual viewer results. This is a tool that everyone in cognitive neuroscience should be using – I fully intend to do so in all future papers!

oh BOLD where art thou? Evidence for a “mm-scale” match between intracortical and fMRI measures.

A frequently discussed problem with functional magnetic resonance imaging is that we don’t really understand how the hemodynamic ‘activations’ measured by the technique relate to actual neuronal phenomena. This is because fMRI measures the Blood-Oxygenation-Level Dependent (BOLD) signal, a complex vascular response to neuronal activity. As such, neuroscientists can easily get worried about all sorts of non-neural contributions to the BOLD signal, such as subjects gasping for air, pulse-related motion artefacts, and other generally uninteresting effects. We can even start to worry that the BOLD signal may not actually measure any particular aspect of neuronal activity, but rather some overly diluted, spatially unconstrained filter that simply lacks the key information for understanding brain processes.

Given that we generally use fMRI over neurophysiological methods (e.g. M/EEG) when we want to say something about the precise spatial generators of a cognitive process, addressing these ambiguities is of utmost importance. Accordingly, a variety of recent papers have utilized multi-modal techniques, for example combining optogenetics, direct recordings, and fMRI, to assess precisely which kinds of neural events contribute to alterations in the BOLD signal and its spatial (mis)localization. Now a paper published today in NeuroImage addresses this question by combining high resolution 7-tesla fMRI with electrocorticography (ECoG) to determine the spatial overlap of finger-specific somatomotor representations captured by the two measures. Starting from the title’s claim that “BOLD matches neuronal activity at the mm-scale”, we can already be sure this paper will generate a great deal of interest.

From Siero et al (In Press)

As shown above, the authors managed to record high resolution (1.5mm) fMRI in 2 subjects implanted with 23 x 11mm intracranial electrode arrays during a simple finger-tapping task. Motor responses from each finger were recorded and used to generate somatotopic maps of brain responses specific to each finger. This analysis was repeated in both ECoG and fMRI, which were then spatially co-registered to one another so the authors could directly compare the spatial overlap between the two methods. What they found appears, at first glance, to be quite impressive:
From Siero et al (In Press)

Here you can see the color-coded t-maps for the BOLD activations to each finger (top panel, A), the differential contrast contour maps for the ECoG (middle panel, B), and the maximum activation foci for both measures with respect to the electrode grid (bottom panel, C), in two individual subjects. Comparing the spatial maps for both the index finger and thumb suggests a rather strong consistency, both in terms of the topology of each effect and the location of their foci. Interestingly, the little-finger measurements seem somewhat more displaced, although similar topographic features can be seen in both. Siero and colleagues further compute the spatial correlation (Spearman’s rho) across measures for each individual finger, finding an average correlation of .54, with a range between .31-.81, a moderately high degree of overlap between the measures. Finally, the optimal amount of shift needed to reduce the spatial difference between the measures was computed and found to be between 1-3.1 millimetres, suggesting a slight systematic bias between ECoG and fMRI foci.
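For intuition, the spatial-correlation and optimal-shift logic can be sketched on toy grids. The values below are hypothetical, standing in for the co-registered finger maps, and the shift search is a simplification of whatever the authors actually did:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(3)

# Toy co-registered maps on a 23 x 11 grid: the "ECoG" map is a noisy copy
# of the "fMRI" map (hypothetical values, not the paper's data).
fmri_map = rng.standard_normal((23, 11))
ecog_map = fmri_map + 0.5 * rng.standard_normal((23, 11))

# Spatial rank correlation between the two flattened maps.
rho, _ = spearmanr(fmri_map.ravel(), ecog_map.ravel())

# Integer shift along the grid's long axis that best aligns the two maps,
# analogous in spirit to the paper's optimal-shift estimate.
best_shift = max(range(-3, 4),
                 key=lambda s: spearmanr(np.roll(ecog_map, s, axis=0).ravel(),
                                         fmri_map.ravel())[0])
```

Here the simulated maps are already aligned, so the best shift comes out at zero; a systematic offset like the paper’s 1-3.1mm bias would show up as a consistently non-zero `best_shift`.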

Are ‘We the BOLD’ ready to break out the champagne and get back to scanning in comfort, spatial anxieties at ease? While this is certainly a promising result, suggesting that the BOLD signal indeed captures functionally relevant neuronal parameters with reasonable spatial accuracy, it should be noted that the result is based on a very-best-case scenario, and that a considerable degree of unique spatial variance remains between the two methods. The data presented by Siero and colleagues have undergone a number of crucial pre-processing steps that are likely to influence their results: the high degree of spatial resolution, the manual removal of draining veins, the restriction of the analysis to grey-matter voxels only, and the lack of spatial smoothing all make generalizing from these results to the standard 3-tesla whole-brain pipeline difficult. Indeed, even under these best-case criteria, the results still indicate up to 3mm of systematic bias in the fMRI results. Though we can be glad the bias was systematic and not random, 3mm is still quite a lot in the brain. On this point, the authors note that the stability of the bias may point towards a systematic mis-registration of the ECoG and fMRI data and/or possible rigid-body deformations introduced by the implantation of the electrodes, issues that could be addressed in future studies. Ultimately it remains to be seen whether similar reliability can be obtained for less robust paradigms than finger wagging, obtained in standard, sub-optimal imaging scenarios. But for now I’m happy to let fMRI have its day in the sun, give or take a few millimetres.

Siero, J. C. W., Hermes, D., Hoogduin, H., Luijten, P. R., Ramsey, N. F., & Petridou, N. (2014). BOLD matches neuronal activity at the mm scale: A combined 7T fMRI and ECoG study in human sensorimotor cortex. NeuroImage. doi:10.1016/j.neuroimage.2014.07.002

 

#MethodsWeDontReport – brief thought on Jason Mitchell versus the replicators

This morning Jason Mitchell self-published an interesting essay espousing his views on why replication attempts are essentially worthless. At first I was merely interested by the fact that what would obviously become a topic of heated debate was self-published, rather than going through the long slog of a traditional academic medium. Score one for self publication, I suppose. Jason’s argument is essentially that null results don’t yield anything of value and that we should be improving the way science is conducted and reported rather than publicising our nulls. I found particularly interesting his short example list of things that he sees as critical to experimental results which nevertheless go unreported:

These experimental events, and countless more like them, go unreported in our method section for the simple fact that they are part of the shared, tacit know-how of competent researchers in my field; we also fail to report that the experimenters wore clothes and refrained from smoking throughout the session.  Someone without full possession of such know-how—perhaps because he is globally incompetent, or new to science, or even just new to neuroimaging specifically—could well be expected to bungle one or more of these important, yet unstated, experimental details.

While I don’t agree with the overall logic or conclusion of Jason’s argument (I particularly like Chris Said’s Bayesian response), I do think it raises some important or at least interesting points for discussion. For example, I agree that there is loads of potentially important stuff that goes on in the lab, particularly with human subjects and large scanners, that isn’t reported. I’m not sure to what extent that stuff can or should be reported, and I think that’s one of the interesting and under-examined topics in the larger debate. I tend to lean towards the stance that we should report just about anything we can – but of course publication pressures and tacit norms means most of it won’t be published. And probably at least some of it doesn’t need to be? But which things exactly? And how do we go about reporting stuff like how we respond to random participant questions regarding our hypothesis?

To find out, I’d love to see a list of things you can’t or don’t regularly report, using the #methodswedontreport hashtag. Quite a few are starting to show up; most are funny or outright snarky (as seems to be the general mood of the response to Jason’s post), but I think a few are pretty common lab occurrences and are even thought-provoking in terms of their potentially serious experimental side-effects. Surely we don’t want to report all of these ‘tacit’ skills in our burgeoning method sections; the question is which ones need to be reported, and why are they important in the first place?

Effective connectivity or just plumbing? Granger Causality estimates highly reliable maps of venous drainage.

update: for an excellent response to this post, see the comment by Anil Seth at the bottom of this article. Also don’t miss the extended debate regarding the general validity of causal methods for fMRI at Russ Poldrack’s blog that followed this post. 

While the BOLD signal can be a useful measurement of brain function when used properly, the fact that it indexes blood flow rather than neural activity raises more than a few significant concerns. That is to say, when we make inferences on BOLD, we want to be sure the observed effects are causally downstream of actual neural activity, rather than the product of physiological noise such as fluctuations in breath or heart rate. This is a problem for all fMRI analyses, but is particularly tricky for resting state fMRI, where we are interested in signal fluctuations that fall in the same range as respiration and pulse. Now a new study has extended these troubles to Granger causality modelling (GCM), a lag-based method for estimating causal interactions between time series that is popular in the resting state literature. Just how bad is the damage?
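For readers unfamiliar with GCM, the basic lag-based logic can be sketched in a few lines: a series ‘G-causes’ another if its past improves prediction of the other’s future beyond the other’s own past. This is a toy simulation with invented data; the study itself uses far more elaborate pipelines:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500

# Toy system where x drives y at a lag of one sample.
x = rng.standard_normal(n)
y = np.empty(n)
y[0] = rng.standard_normal()
for t in range(1, n):
    y[t] = 0.8 * x[t - 1] + 0.2 * rng.standard_normal()

def granger_gain(cause, effect):
    """Reduction in residual variance when the candidate cause's past is
    added to an AR(1) model of the effect (larger = stronger 'G-cause')."""
    past_e, past_c, now = effect[:-1], cause[:-1], effect[1:]
    # Restricted model: the effect's own past only.
    X_r = np.column_stack([np.ones(len(now)), past_e])
    res_r = now - X_r @ np.linalg.lstsq(X_r, now, rcond=None)[0]
    # Full model: also include the candidate cause's past.
    X_f = np.column_stack([X_r, past_c])
    res_f = now - X_f @ np.linalg.lstsq(X_f, now, rcond=None)[0]
    return 1.0 - res_f.var() / res_r.var()

forward = granger_gain(x, y)   # x's past helps predict y: large gain
reverse = granger_gain(y, x)   # y's past should not help predict x
```

The asymmetry between `forward` and `reverse` is exactly what GCM interprets as directed influence, and the worry raised by this study is that vascular timing differences can produce the same asymmetry without any neural causation at all.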

In an article published this week in PLOS ONE, Webb and colleagues analysed over a thousand scans from the Human Connectome database, examining the reliability of GCM estimates and the proximity of the major ‘hubs’ identified by GCM with known major arteries and veins. The authors first found that GCM estimates were highly robust across participants:

Plot showing robustness of GCM estimates across 620 participants. The majority of estimated causes did not show significant differences within or between participants (black datapoints).

They further report that “the largest [most robust] lags are for BOLD Granger causality differences for regions close to large veins and dural venous sinuses”. In other words, although the major ‘upstream’ and ‘downstream’ nodes estimated by GCM are highly robust across participants, regions primarily influencing other regions (i.e. causal outflow) map onto major arteries, whereas regions primarily receiving ‘inputs’ (i.e. causal inflow) map onto veins. This pattern of ‘causation’ is very difficult to explain as anything other than a non-neural artifact: the regions mostly ‘causing’ activity in others are exactly where fresh blood comes into the brain, and the regions primarily being influenced by others are areas of major blood drainage. Check out the arteriogram and venogram provided by the authors:

Depiction of major arteries (top image) and veins (bottom). Note overlap with areas of greatest G-cause (below).

Compare the above to their thresholded z-statistic map for significant Granger causality; white areas show significant G-causation overlapping with an arteriogram mask, green areas overlap with a venogram mask:

From paper:
“Figure 5. Mean Z-statistic for significant Granger causality differences to seed ROIs. Z-statistics were averaged for a given target ROI with the 264 seed ROIs to which it exhibited significantly asymmetric Granger causality relationship. Masks are overlaid for MRI arteriograms (white) and MRI venograms (green) for voxels with greater than 2 standard deviations signal intensity of in-brain voxels in averaged images from 33 (arteriogram) and 34 (venogram) subjects. Major arterial inflow and venous outflow distributions are labeled.”

It’s fairly obvious from the above that a significant proportion of the areas typically G-causing other areas overlap with arteries, whereas areas typically being G-caused by others overlap with veins. This is a serious problem for GCM of resting state fMRI, and worse, these effects were also observed across a comprehensive range of task-based fMRI data. The authors come to the grim conclusion that “Such arterial inflow and venous drainage has a highly reproducible pattern across individuals where major arterial and venous distributions are largely invariant across subjects, giving the illusion of reliable timing differences between brain regions that may be completely unrelated to actual differences in effective connectivity”. Importantly, this isn’t the first time GCM has been called into question. A related concern is the impact of spatial variation across the brain in the lag between neural activation and the BOLD response (shaped by the local ‘hemodynamic response function’, HRF). Previous work using simultaneous intracranial and BOLD recordings has shown that due to these lags, GCM can estimate a causal pattern of A then B, whereas the actual neural activity was B then A.
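The overlap check itself is simple to sketch. Here is a toy version with entirely synthetic masks (this is my own illustration, not the paper’s data or pipeline): given a binary map of voxels with significant causal outflow and binary artery/vein masks, ask what fraction of the ‘outflow’ voxels fall inside each mask.

```python
# Toy overlap check (synthetic masks, not the paper's data): what fraction
# of voxels showing significant "causal outflow" fall inside an artery
# mask, versus a vein mask?
import numpy as np

rng = np.random.default_rng(3)
shape = (40, 40, 40)
artery_mask = rng.random(shape) < 0.05                    # ~5% of voxels
vein_mask = (rng.random(shape) < 0.05) & ~artery_mask     # disjoint ~5%

# Synthetic "significant outflow" map that mostly sits on the arteries,
# mimicking the artifact pattern, plus a little scatter elsewhere.
outflow_sig = artery_mask & (rng.random(shape) < 0.8)
outflow_sig |= rng.random(shape) < 0.005

frac_in_arteries = (outflow_sig & artery_mask).sum() / outflow_sig.sum()
frac_in_veins = (outflow_sig & vein_mask).sum() / outflow_sig.sum()
print(frac_in_arteries > frac_in_veins)  # True: outflow tracks the arteries
```

A real analysis would of course use registered arteriogram/venogram images thresholded at some intensity cutoff (the authors used 2 standard deviations), but the bookkeeping is the same boolean-mask arithmetic.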

This is because GCM acts in a relatively simple way: given two time series (A & B), if a better estimate of the future state of B can be predicted from the past fluctuations of both A and B than from the past of B alone, then A is said to G-cause B. However, as we’ve already established, BOLD is a messy and complex signal in which neural activity is filtered through slow blood fluctuations that must be carefully mapped back onto neural activity using deconvolution methods. Thus, what looks like A then B in BOLD can actually be due to differences in HRF lags between regions – GCM is blind to this, as it does not consider the underlying process producing the time series. Worse, while this particular problem can be addressed by combining GCM (which is naïve to the underlying cause of the analysed time series) with an approach that deconvolves each voxel-wise time series with a canonical HRF, the authors point out that such an approach would not resolve the concern raised here: that Granger causality largely picks up macroscopic temporal patterns in blood in- and outflow:
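That definition is easy to demonstrate with a toy simulation (hypothetical data, not the paper’s method): generate two autoregressive series where A drives B, then compare the residual variance of a model predicting B from its own past against one that also includes A’s past.

```python
# Minimal bivariate Granger causality on simulated data: A G-causes B if
# including A's past shrinks the prediction error for B's future.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
A = np.zeros(n)
B = np.zeros(n)
for t in range(1, n):
    A[t] = 0.5 * A[t - 1] + rng.normal()
    B[t] = 0.5 * B[t - 1] + 0.4 * A[t - 1] + rng.normal()  # A drives B

def residual_var(y, X):
    """Residual variance of the least-squares regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ beta)

ones = np.ones(n - 1)
# A -> B: does adding A's past improve prediction of B beyond B's own past?
gc_A_to_B = np.log(
    residual_var(B[1:], np.column_stack([B[:-1], ones]))
    / residual_var(B[1:], np.column_stack([B[:-1], A[:-1], ones]))
)
# B -> A: same test in the other direction (should be near zero here).
gc_B_to_A = np.log(
    residual_var(A[1:], np.column_stack([A[:-1], ones]))
    / residual_var(A[1:], np.column_stack([A[:-1], B[:-1], ones]))
)
print(gc_A_to_B > gc_B_to_A)  # True: the simulated A -> B coupling is recovered
```

Note that nothing in this computation knows or cares where the two series came from – which is exactly why it works beautifully on clean simulated data and runs into trouble on BOLD.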

“But even if an HRF were perfectly estimated at each voxel in the brain, the mechanism implied in our data is that similarly oxygenated blood arrives at variable time points in the brain independently of any neural activation and will affect lag-based directed functional connectivity measurements. Moreover, blood from one region may then propagate to other regions along the venous drainage pathways also independent of neural to vascular transduction. It is possible that the consistent asymmetries in Granger causality measured in our data may be related to differences in HRF latency in different brain regions, but we consider this less likely given the simpler explanation of blood moving from arteries to veins given the spatial distribution of our results.”
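The HRF-latency possibility the authors mention in passing is itself easy to see in a toy simulation (my own illustration, not from either paper): if region A’s hemodynamic response is slower than region B’s, a lag-based measure applied to the BOLD signals can reverse the true neural ordering.

```python
# Toy demonstration of the HRF-lag confound: neural A leads neural B, but
# a slower hemodynamic response in A reverses the apparent BOLD lag.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
neural_A = rng.normal(size=n)
neural_B = np.roll(neural_A, 2) + 0.1 * rng.normal(size=n)  # B lags A by 2 samples

def hrf(delay, length=40):
    """Crude single-gamma response kernel peaking around `delay` samples."""
    t = np.arange(length)
    h = (t ** delay) * np.exp(-t)
    return h / h.sum()

# Region A's vascular response is much slower than region B's.
bold_A = np.convolve(neural_A, hrf(delay=12), mode="full")[:n]
bold_B = np.convolve(neural_B, hrf(delay=4), mode="full")[:n]

def peak_lag(x, y, max_lag=20):
    """Lag (samples) maximising corr(x shifted, y); positive => x leads y."""
    lags = list(range(-max_lag, max_lag + 1))
    corrs = [np.corrcoef(np.roll(x, lag), y)[0, 1] for lag in lags]
    return lags[int(np.argmax(corrs))]

print(peak_lag(neural_A, neural_B) > 0)  # True: A genuinely leads B
print(peak_lag(bold_A, bold_B) < 0)      # True: in BOLD, B appears to lead A
```

The neural ground truth (A then B) survives at the neural level but flips at the BOLD level purely because of the mismatched response kernels – no vascular transit of blood between regions is even needed for this version of the problem.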

As for correcting for these effects, the authors suggest that a nuisance variable approach estimating vascular effects related to pulse, respiration, and breath-holding may be effective. However, they caution that the effects observed here (large-scale blood inflow and drainage) take place over a timescale an order of magnitude slower than actual neural differences, and that this approach would need extremely precise estimates of the associated nuisance waveforms to prevent confounded connectivity estimates. For now, I’d advise readers to be critical of what can actually be inferred from GCM until further research can be done, preferably using multi-modal methods capable of directly inferring the impact of vascular confounds on GCM estimates. Indeed, although I suppose I am a bit biased, I have to ask if it wouldn’t be simpler to just use Dynamic Causal Modelling, a technique explicitly designed for estimating causal effects between BOLD timeseries, rather than a method originally designed to estimate influences between financial stocks.
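For the curious, the nuisance variable approach boils down to ordinary linear regression: estimate the vascular waveforms, then project them out of each voxel’s time series before computing connectivity. A minimal sketch with made-up signals (the hard part in practice is precisely estimating those waveforms, as the authors note, not the regression itself):

```python
# Sketch of nuisance regression (hypothetical signals): project estimated
# cardiac and respiratory waveforms out of a voxel time series.
import numpy as np

rng = np.random.default_rng(2)
n = 1000
cardiac = np.sin(np.linspace(0, 80 * np.pi, n))       # stand-in pulse waveform
respiration = np.sin(np.linspace(0, 20 * np.pi, n))   # stand-in breathing waveform
neural = rng.normal(size=n)
voxel = neural + 2.0 * cardiac + 1.5 * respiration    # BOLD = neural + vascular

# Regress the nuisance waveforms (plus an intercept) out of the voxel signal.
X = np.column_stack([cardiac, respiration, np.ones(n)])
beta, *_ = np.linalg.lstsq(X, voxel, rcond=None)
cleaned = voxel - X @ beta

# Cleanup recovers the neural component far better than the raw signal.
print(np.corrcoef(voxel, neural)[0, 1] < np.corrcoef(cleaned, neural)[0, 1])
```

In this toy case the regressors are exactly the contaminating waveforms, so cleanup is nearly perfect; with imperfectly estimated waveforms, residual vascular structure would survive into the connectivity estimates, which is precisely the authors’ worry.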

References for further reading:

Friston, K. (2009). Causal modelling and brain connectivity in functional magnetic resonance imaging. PLoS biology, 7(2), e33. doi:10.1371/journal.pbio.1000033

Friston, K. (2011). Dynamic causal modeling and Granger causality: Comments on “The identification of interacting networks in the brain using fMRI: model selection, causality and deconvolution”. NeuroImage, 58(2), 303–5; author reply 310–1. doi:10.1016/j.neuroimage.2009.09.031

Friston, K., Moran, R., & Seth, A. K. (2013). Analysing connectivity with Granger causality and dynamic causal modelling. Current opinion in neurobiology, 23(2), 172–8. doi:10.1016/j.conb.2012.11.010

Webb, J. T., Ferguson, M. A., Nielsen, J. A., & Anderson, J. S. (2013). BOLD Granger causality reflects vascular anatomy. PLoS ONE, 8(12), e84279. doi:10.1371/journal.pone.0084279

Chang, C., Cunningham, J. P., & Glover, G. H. (2009). Influence of heart rate on the BOLD signal: the cardiac response function. NeuroImage, 44(3), 857–69. doi:10.1016/j.neuroimage.2008.09.029

Chang, C., & Glover, G. H. (2009). Relationship between respiration, end-tidal CO2, and BOLD signals in resting-state fMRI. NeuroImage, 47(4), 1381–93. doi:10.1016/j.neuroimage.2009.04.048

Lund, T. E., Madsen, K. H., Sidaros, K., Luo, W.-L., & Nichols, T. E. (2006). Non-white noise in fMRI: does modelling have an impact? Neuroimage, 29(1), 54–66.

David, O., Guillemain, I., Saillet, S., Reyt, S., Deransart, C., Segebarth, C., & Depaulis, A. (2008). Identifying neural drivers with functional MRI: an electrophysiological validation. PLoS biology, 6(12), 2683–97. doi:10.1371/journal.pbio.0060315

Update: This post continued into an extended debate on Russ Poldrack’s blog, where Anil Seth made the following (important) comment:

Hi, this is Anil Seth. What an excellent debate, and I hope I can add a few quick thoughts of my own, since this is an issue close to my heart (no pun intended re vascular confounds).

First, back to the Webb et al paper. They indeed show that a vascular confound may affect GC-FMRI but only in the resting state and given suboptimal TR and averaging over diverse datasets.  Indeed I suspect that their autoregressive models may be poorly fit so that the results rather reflect a sort-of mental chronometry a la Menon, rather than GC per se.
In any case the more successful applications of GC-fMRI are those that compare experimental conditions or correlate GC with some behavioural variable (see e.g. Wen et al., http://www.ncbi.nlm.nih.gov/pubmed/22279213). In these cases hemodynamic and vascular confounds may subtract out.
Interpreting findings like these means remembering that GC is a description of the data (i.e. DIRECTED FUNCTIONAL connectivity) and is not a direct claim about the underlying causal mechanism (e.g. like DCM, which is a measure of EFFECTIVE connectivity).  Therefore (model light) GC and (model heavy) DCM are to a large extent asking and answering different questions, and to set them in direct opposition is to misunderstand this basic point.  Karl, Ros Moran, and I make these points in a recent review (http://www.ncbi.nlm.nih.gov/pubmed/23265964).
Of course both methods are complex and ‘garbage in garbage out’ applies: naive application of either is likely to be misleading or worse.  Indeed the indirect nature of fMRI BOLD means that causal inference will be very hard.  But this doesn’t mean we shouldn’t try.  We need to move to network descriptions in order to get beyond the neo-phrenology of functional localization.  And so I am pleased to see recent developments in both DCM and GC for fMRI.  For the latter, with Barnett and Chorley I have shown that GC-FMRI is INVARIANT to hemodynamic convolution given fast sampling and low noise (http://www.ncbi.nlm.nih.gov/pubmed/23036449).  This counterintuitive finding defuses a major objection to GC-fMRI and has been established both in theory, and in a range of simulations of increasing biophysical detail.  With the development of low-TR multiband sequences, this means there is renewed hope for GC-fMRI in practice, especially when executed in an appropriate experimental design.  Barnett and I have also just released a major new GC software which avoids separate estimation of full and reduced AR models, avoiding a serious source of bias afflicting previous approaches (http://www.ncbi.nlm.nih.gov/pubmed/24200508).
Overall I am hopeful that we can move beyond premature rejection of promising methods on the grounds they fail when applied without appropriate data or sufficient care.  This applies to both GC and fMRI. These are hard problems but we will get there.

Mind-wandering and metacognition: variation between internal and external thought predicts improved error awareness

Yesterday I published my first paper on mind-wandering and metacognition, with Jonny Smallwood, Antoine Lutz, and collaborators. This was a fun project for me, as I spent much of my PhD exhaustively reading the literature on mind-wandering and default mode activity, resulting in a lot of intense debate at my research center. When we had Jonny over as an opponent at my PhD defense, the chance to collaborate was simply too good to pass up. Mind-wandering is super interesting precisely because we do it so often. One of my favourite anecdotes comes from around the time I was arguing heavily for the role of the default mode in spontaneous cognition to some very skeptical colleagues. The next day while waiting to cross the street, one such colleague rode up next to me on his bicycle and joked, “are you thinking about the default mode?” And indeed I was – meta-mind-wandering!

One thing that has really bothered me about much of the mind-wandering literature is how frequently it is presented as attention = good, mind-wandering = bad. Can you imagine how unpleasant it would be if we never mind-wandered? Just picture trying to solve a difficult task while being totally, 100% focused. This kind of hyper-locked attention can easily become pathological, preventing us from altering course when our behaviour goes awry or when something internal needs to be adjusted. Mind-wandering serves many positive purposes, from stimulating our imaginations, to motivating us in boring situations with internal rewards (boring task… “ahhhh, remember that nice mojito you had on the beach last year?”). Yet we largely see papers exploring the costs – mood deficits, cognitive control failure, and so on. In the meditation literature this has even been taken up to form the misguided idea that meditation should reduce or eliminate mind-wandering (even though there is almost zero evidence to this effect).

Sometimes our theories end up reflecting our methodological apparatus, to the extent that they may not fully capture reality. I think this is part of what has happened with mind-wandering, which was originally defined in relation to difficult (and boring) attention tasks. Worse, mind-wandering is usually operationalized as a dichotomous state (“offtask” vs “ontask”) when a little introspection strongly suggests it is much more of a fuzzy, dynamic transition between meta-cognitive and sensory processes. By reducing mind-wandering to the mean number of times you were “offtask”, we take the stream of consciousness and act as if the ‘depth’ at one point in the river is the entire story – but what about flow rate, tidal patterns, fishies, and all the dynamic variability that defines the river? My idea was that one simple way to get at this is to look at the within-subject variability of mind-wandering, rather than just the overall mean “rate”. In this way we could get some idea of the extent to which a person’s mind-wandering fluctuated over time, rather than just categorising these events dichotomously.
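The operationalization is almost trivially simple: summarize each participant’s thought-probe ratings not just by their mean but also by their standard deviation. A quick sketch with made-up ratings shows why the two measures capture different things:

```python
# Mean vs. within-subject variability of thought-probe ratings
# (made-up 1-7 ratings, purely illustrative).
import numpy as np

probe_ratings = {
    "sub-01": [1, 2, 1, 2, 1, 2],  # low mind-wandering, stable
    "sub-02": [6, 7, 6, 7, 6, 7],  # high mind-wandering, equally stable
    "sub-03": [1, 7, 2, 6, 1, 7],  # strongly fluctuating
}

summaries = {
    sub: (np.mean(r), np.std(r, ddof=1))  # (mean TUT, TUT variability)
    for sub, r in probe_ratings.items()
}
for sub, (m, sd) in summaries.items():
    print(sub, round(m, 2), round(sd, 2))
```

Note that sub-01 and sub-02 differ enormously in mean TUT yet have identical variability, while only sub-03 scores high on the variability measure – so the two summaries genuinely index different aspects of the probe time series.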

The EAT task used in my study, with thought probes.

To do this, we combined a classical meta-cognitive response inhibition paradigm, the “error awareness task” (pictured above), with standard interleaved “thought-probes” asking participants to rate on a scale of 1-7 the “subjective frequency” of task-unrelated thoughts in the task interval prior to the probe. We then examined the relationship between the ability to perform the task (“stop accuracy”) and each participant’s mean task-unrelated thought (TUT) rating. Here we expected to replicate the well-established relationship between TUTs and attention decrements (after all, it’s difficult to inhibit your behaviour if you are thinking about the hunky babe you saw at the beach last year!). We further examined whether the standard deviation of TUT within each participant (TUT variability) would predict error monitoring, reflecting a relationship between metacognition and increased fluctuation between internal and external cognition (after all, isn’t that kind of the point of metacognition?). Of course, for specificity and completeness, we conducted each multiple regression analysis with the contra-variable as a control predictor. Here is the key finding from the paper:
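The analysis logic can be sketched with simulated scores (hypothetical data, not the study’s, and plain least squares standing in for the full regression software): regress each outcome on its predictor of interest with the contra-variable included as a control.

```python
# Sketch of the two regressions with contra-variable controls
# (simulated participants; effect sizes are arbitrary).
import numpy as np

rng = np.random.default_rng(4)
n = 30
mean_tut = rng.normal(size=n)   # per-participant mean TUT rating (z-scored)
tut_var = rng.normal(size=n)    # per-participant TUT variability (z-scored)
stop_acc = -0.6 * mean_tut + rng.normal(scale=0.5, size=n)
error_awareness = 0.6 * tut_var + rng.normal(scale=0.5, size=n)

def betas(y, predictors):
    """OLS coefficients for y ~ predictors + intercept."""
    X = np.column_stack(predictors + [np.ones(len(y))])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

# Stop accuracy ~ mean TUT, controlling for TUT variability.
b_stop = betas(stop_acc, [mean_tut, tut_var])
# Error awareness ~ TUT variability, controlling for mean TUT.
b_aware = betas(error_awareness, [tut_var, mean_tut])

print(b_stop[0] < 0)   # True: more mind-wandering, poorer stopping
print(b_aware[0] > 0)  # True: more variable mind-wandering, better awareness
```

The point of the contra-variable is visible in the design matrices: each effect of interest is estimated over and above the other TUT summary, so the two findings can’t simply be restatements of one another.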

Regression analysis of TUT, TUT variability, stop accuracy, and error awareness.

As you can see in the bottom right, we clearly replicated the relationship of increased overall TUT predicting poorer stop performance. Individuals who report an overall high intensity/frequency of mind-wandering unsurprisingly commit more errors. What was really interesting, however, was that the more variable a participant’s mind-wandering, the greater their error-monitoring capacity (top left). This suggests that individuals who show more fluctuation between internally and externally oriented attention may be able to better enjoy the benefits of mind-wandering while simultaneously limiting its costs. Of course, these are only individual differences (i.e. correlations) and should be treated as highly preliminary. It is possible, for example, that participants who use more of the TUT scale have higher meta-cognitive ability in general, rather than the two variables being causally linked in the way we suggest. We are careful to raise these and other limitations in the paper, but I do think this finding is a nice first step.

To ‘probe’ a bit further we looked at the BOLD responses to correct stops, and the parametric correlation of task-related BOLD with the TUT ratings:

Activations during correct stop trials.
Deactivations to stop trials (blue) and parametric correlation with TUT reports (red)

As you can see, correct stop trials elicit a rather canonical activation pattern in the motor-inhibition and salience networks, with concurrent deactivations in visual cortex and the default mode network (second figure, blue blobs). I think of this pattern a bit like the brain receiving the ‘stop signal’ and going (a la Picard): “FULL STOP, MAIN VIEWER OFF, FIRE THE PHOTON TORPEDOES!”, launching into full response recovery mode. Interestingly, while we replicated the finding of medial-prefrontal co-variation with TUTs (second figure, red blob), this area was substantially more rostral than the stop-related deactivations, supporting previous findings of some degree of functional segregation between the inhibitory and mind-wandering related components of the DMN.

Finally, when examining the Aware > Unaware errors contrast, we replicated the typical salience network activations (mid-cingulate and anterior insula). Interestingly we also found strong bilateral activations in an area of the inferior parietal cortex also considered to be a part of the default mode. This finding further strengthens the link between mind-wandering and metacognition, indicating that the salience and default mode network may work in concert during conscious error awareness:

Activations to Aware > Unaware errors contrast.

In all, this was a very valuable and fun study for me. As a PhD student, being able to replicate the function of classic “executive, salience, and default mode” ‘resting state’ networks with a basic task was a great experience, helping me place some confidence in these labels. I was also able to combine a classical behavioral metacognition task with some introspective thought probes, and show that they do indeed contain valuable information about task performance and related brain processes. Importantly though, we showed that the ‘content’ of the mind-wandering reports doesn’t tell the whole story of spontaneous cognition. In the future I would like to explore this idea further, perhaps by taking a time series approach to probe the dynamics of mind-wandering, using a simple continuous feedback device that participants could use throughout an experiment. In the affect literature such devices have been used to probe the dynamics of valence-arousal when participants view naturalistic movies, and I believe such an approach could reveal even greater granularity in how the experience of mind-wandering (and its fluctuation) interacts with cognition. Our findings suggest that the relationship between mind-wandering and task performance may be more nuanced than mere antagonism, an important finding I hope to explore in future research.

Citation: Allen M, Smallwood J, Christensen J, Gramm D, Rasmussen B, Jensen CG, Roepstorff A and Lutz A (2013) The balanced mind: the variability of task-unrelated thoughts predicts error monitoring. Front. Hum. Neurosci. 7:743. doi: 10.3389/fnhum.2013.00743

Short post: why I share (and share often)

If you follow my social media activities I am sure by now that you know me as a compulsive share-addict. Over the past four years I have gradually increased both the amount of incoming and outgoing information I attempt to integrate on a daily basis. I start every day with a now routine ritual of scanning new publications from 60+ journals and blogs using my firehose RSS feed, as well as integrating new links from various Science sub-reddits, my curated twitter cogneuro list, my friends and colleagues on Facebook, and email lists. I then in turn curate the best, most relevant to my interests, or in some cases the most outrageous of these links and share them back to twitter, facebook, reddit, and colleagues.

Of course in doing so, a frequent response from (particularly more senior) colleagues is: why?! Why do I choose to spend the time to both take in all that information and to share it back to the world? The answer is quite simple: in sharing this stuff I get critical feedback from an ever-growing network of peers and collaborators. I can’t even count the number of times someone has pointed out something (for better or worse) that I would have otherwise missed in an article or idea. That’s right, I share it so I can see what you think of it! In this way I have been able to not only stay up to date with the latest research and concepts, but to receive constant invaluable feedback from all of you lovely brains :). In some sense I literally distribute my cognition throughout my network – thanks for the extra neurons!

From the beginning, I have been able not only to assess the impact of this stuff, but also to gain deeper and more varied insights into its meaning. When I began my PhD I had the moderate statistical training of a BSc in psychology, with little direct knowledge of neuroimaging methods or theory. Frankly it was bewildering. Just figuring out which methods to pay attention to, or what problems to look out for, was a headache-inducing nightmare. But I had to start somewhere, and so I started by sharing, and sharing often. As a result almost every day I get amazing feedback pointing out critical insights or flaws in the things I share that I would have otherwise missed. In this way the entire world has become my interactive classroom! It is difficult to overstate the degree to which this interaction has enriched my abilities as a scientist and thinker.

It is only natural however for more senior investigators to worry about how much time one might spend on all this. I admit in the early days of my PhD I may have spent a bit too long lingering amongst the RSS trees and twitter swarms. But then again, it is difficult to place a price on the knowledge and know-how I garnered in this process (not to mention the invaluable social capital generated in building such a network!). I am a firm believer in “power procrastination”, which is just the process of regularly switching from more difficult but higher priority to more interesting but lower priority tasks. I believe that by spending my downtime taking in and sharing information, I’m letting my ‘default mode’ take a much needed rest, while still feeding it with inputs that will actually make the hard tasks easier.

In all, on a good day I’d say I spend about 20 minutes each morning taking in inputs and another 20 minutes throughout the day sharing them. Of course some days (looking at you Fridays) I don’t always adhere to that and there are those times when I have to ‘just say no’ and wait until the evening to get into that workflow. Productivity apps like Pomodoro have helped make sure I respect the balance when particularly difficult tasks arise. All in all however, the time I spend sharing is paid back tenfold in new knowledge and deeper understanding.

Really I should be thanking all of you, the invaluable peers, friends, colleagues, followers, and readers who give me the feedback that is so totally essential to my cognitive evolution. So long as you keep reading- I’ll keep sharing! Thanks!!

Notes: I haven’t even touched on the value of blogging and post-publication peer review, which of course sums with the benefits mentioned here, but also has vastly improved my writing and comprehension skills! But that’s a topic for another post!

( don’t worry, the skim-share cycle is no replacement for deep individual learning, which I also spend plenty of time doing!)

“you are a von economo neuron!” – Francesca :)

Fun fact – I read the excellent scifi novel Accelerando just prior to beginning my PhD. In the novel the main character is an info-addict who integrates so much information he gains a “5 second” prescience on events as they unfold. He then shares these insights for free with anyone who wants them, generating billion dollar companies (in which he owns no part) and gradually manipulating global events to bring about a technological singularity. I guess you could say I found this to be a pretty neat character :) In a serious vein though, I am a firm believer in free and open science, self-publication, and sharing-based economies. Information deserves to be free!

When is expectation not a confound? On the necessity of active controls.

Learning and plasticity are hot topics in neuroscience. Whether exploring old world wisdom or new age science fiction, the possibility that playing videogames might turn us into attention superheroes or that practicing esoteric meditation techniques might heal troubled minds is an exciting avenue for research. Indeed findings suggesting that exotic behaviors or novel therapeutic treatments might radically alter our brain (and behavior) are ripe for sensational science-fiction headlines purporting vast brain benefits. For those of you not totally bored of methodological crises, here we have one brewing anew. You see, the standard recommendation for those interested in intervention research is the active-controlled experimental design. Unfortunately, in both clinical research on psychotherapy (including meditation) and more sci-fi areas of brain training and gaming, use of active controls is rare at best when compared to the more convenient (but causally ineffective) passive control group. Now a new article in Perspectives on Psychological Science suggests that even standard active controls may not be sufficient to rule out confounds in the treatment effect of interest.

Why is that? And why exactly do we need active controls in the first place? As the authors clearly point out, what you want to show with such a study is the causal efficacy of the treatment of interest. Quite simply what that means is that the thing you think should have some interesting effect should actually be causally responsible for creating that effect. If you want to argue that standing upside down for twenty minutes a day will make me better at playing videogames in Australia, it must be shown that it is actually standing upside down that causes my increased performance down under. If my improved performance on Minecraft Australian Edition is simply a product of my belief in the power of standing upside down, or my expectation that standing upside down is a great way to best kangaroo-creepers, then we have no way of determining what actually produced that performance benefit. Research on placebos and the power of expectations shows that these kinds of subjective beliefs can have a big impact on everything from attentional performance to mortality rates.

Useful flowchart from Boot et al on whether or not a study can make causal claims for treatment.

Typically researchers attempt to control for such confounds through the use of a control group performing a task as similar as possible to the intervention of interest. But how do we know participants in the two groups don’t end up with different expectations about how they should improve as a result of the training? Boot et al point out that without actually measuring these variables, we have no idea and no way of knowing for sure that expectation biases don’t produce our observed improvements. They then provide a rather clever demonstration of their concern, in an experiment where participants view videos of various cognition tests as well as videos of a training task they might later receive, in this case either the first-person shooter Unreal Tournament or the spatial puzzle game Tetris. Finally they asked the participants in each group which tests they thought they’d do better on as a result of the training video. Importantly, the authors show that not only did UT and Tetris lead to significantly different expectations, but also that those expectation benefits were specific to the modality of the trained and tested tasks. Thus participants who watched the action-intensive Unreal Tournament videos expected greater improvements on tests of reaction time and visual performance, whereas participants viewing Tetris rated themselves as likely to do better on tests of spatial memory.

This is a critically important finding for intervention research. Many researchers, myself included, have often thought of the expectation and demand characteristic confounds in a rather general way. Generally speaking until recently I wouldn’t have expected the expectation bias to go much beyond a general “I’m doing something effective” belief. Boot et al show that our participants are a good deal cleverer than that, forming expectations-for-improvement that map onto specific dimensions of training. This means that to the degree that an experimenter’s hypothesis can be discerned from either the training or the test, participants are likely to form unbalanced expectations.

The good news is that the authors provide several reasonable fixes for this dilemma. The first is just to actually measure participants’ expectations, specifically in relation to the measures of interest. Another useful suggestion is to run pilot studies ensuring that the two treatments do not evoke differential expectations, or similarly to check that your outcome measures are not subject to these biases. Boot and colleagues throw the proverbial glove down, daring readers to attempt experiments where the “control condition” actually elicits greater expectations yet the treatment effect is preserved. Further common concerns, such as worries about balancing false positives against false negatives, are addressed at length.
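The first fix is straightforward to implement: collect expectation ratings in both groups and test whether they differ before interpreting the treatment effect. A minimal sketch with made-up 1-7 ratings, using a simple permutation test in place of whatever analysis a given lab prefers:

```python
# Checking that two conditions evoke balanced expectations
# (made-up expectation ratings; a permutation test on the group difference).
import numpy as np

rng = np.random.default_rng(5)
# "How much do you expect to improve on this test?" (1-7), per group.
treatment_expect = np.array([5, 6, 5, 7, 6, 5, 6, 5])
control_expect = np.array([5, 5, 6, 6, 5, 6, 5, 6])

observed = treatment_expect.mean() - control_expect.mean()
pooled = np.concatenate([treatment_expect, control_expect])

# Null distribution: shuffle group labels many times.
perms = []
for _ in range(5000):
    rng.shuffle(pooled)
    perms.append(pooled[:8].mean() - pooled[8:].mean())

p = np.mean(np.abs(perms) >= abs(observed))
balanced = p > 0.05  # no detectable expectation difference between groups
print(balanced)  # True for these illustrative ratings
```

A non-significant difference is of course not proof of perfectly balanced expectations (especially with small pilot samples), which is exactly why the authors recommend measuring expectations in the main study too, rather than relying on the pilot alone.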

The entire article is a great read, timely and full of excellent suggestions for caution in future research. It also brought something I’ve been chewing on for some time quite clearly into focus. From the general perspective of learning and plasticity, I have to ask at what point is an expectation no longer a confound. Boot et al give an interesting discussion on this point, in which they suggest that even in the case of balanced expectations and positive treatment effects, an expectation dependent response (in which outcome correlates with expectation) may still give cause for concern as to the causal efficacy of the trained task. This is a difficult question that I believe ventures far into the territory of what exactly constitutes the minimal necessary features for learning. As the authors point out, placebo and expectations effects are “real” products of the brain, with serious consequences for behavior and treatment outcome. Yet even in the medical community there is a growing understanding that such effects may be essential parts of the causal machinery of healing.

Possible outcome of a training experiment, in which the control shows no dependence between expectation and outcome (top panel) and the treatment of interest shows dependence (bottom panel). Boot et al suggest that such a case may invalidate causal claims for treatment efficacy.

To what extent might this also be true of learning or cognitive training? For sure we can assume that expectations shape training outcomes, otherwise the whole point about active controls would be moot. But can one really have meaningful learning if there is no expectation to improve? I realize that from an experimental/clinical perspective, the question is not “is expectation important for this outcome” but “can we observe a treatment outcome when expectations are balanced”. Still when we begin to argue that the observation of expectation-dependent responses in a balanced design might invalidate our outcome findings, I have to wonder if we are at risk of valuing methodology over phenomena. If expectation is a powerful, potentially central mechanism in the causal apparatus of learning and plasticity, we shouldn’t be surprised when even efficacious treatments are modulated by such beliefs. In the end I am left wondering if this is simply an inherent limitation in our attempt to apply the reductive apparatus of science to increasingly holistic domains.

Please do read the paper, as it is an excellent treatment of a critical and often ignored issue in the cognitive and clinical sciences. Anyone undertaking related work should expect this reference to appear in reviewers’ replies in the near future.

EDIT:
Professor Simons, a co-author of the paper, was kind enough to answer my question on Twitter. Simons pointed out that a study that balanced expectations, found group outcome differences, and further found correlations of those differences with expectation could conclude that the treatment was causally efficacious, but that it also depends on expectations (effect + expectation). This would obviously be superior to an unbalanced design, or one without measurement of expectation, as it would actually tell us something about the importance of expectation in producing the causal outcome. Be sure to read through the very helpful FAQ they’ve posted as an addendum to the paper, which covers these questions and more in greater detail. Here is the answer to my specific question:

What if expectations are necessary for a treatment to work? Wouldn’t controlling for them eliminate the treatment effect?

No. We are not suggesting that expectations for improvement must be eliminated entirely. Rather, we are arguing for the need to equate such expectations across conditions. Expectations can still affect the treatment condition in a double-blind, placebo-controlled design. And, it is possible that some treatments will only have an effect when they interact with expectations. But, the key to that design is that the expectations are equated across the treatment and control conditions. If the treatment group outperforms the control group, and expectations are equated, then something about the treatment must have contributed to the improvement. The improvement could have resulted from the critical ingredients of the treatment alone or from some interaction between the treatment and expectations. It would be possible to isolate the treatment effect by eliminating expectations, but that is not essential in order to claim that the treatment had an effect.

In a typical psychology intervention, expectations are not equated between the treatment and control condition. If the treatment group improves more than the control group, we have no conclusive evidence that the ingredients of the treatment mattered. The improvement could have resulted from the treatment ingredients alone, from expectations alone, or from an interaction between the two. The results of any intervention that does not equate expectations across the treatment and control condition cannot provide conclusive evidence that the treatment was necessary for the improvement. It could be due to the difference in expectations alone. That is why double blind designs are ideal, and it is why psychology interventions must take steps to address the shortcomings that result from the impossibility of using a double blind design. It is possible to control for expectation differences without eliminating expectations altogether.