The Wild West of Publication Reform Is Now

It’s been a while since I’ve tried out my publication reform revolutionary hat (it comes in red!), but tonight as I was winding down I came across a post I simply could not resist. Titled “Post-publication peer review and the problem of privilege” by evolutionary ecologist Stephen Heard, the post argues that we should be cautious of post-publication review schemes insofar as they may bring about a new era of privilege in research consumption. Stephen writes:

“The packaging of papers into conventional journals, following pre-publication peer review, provides an important but under-recognized service: a signalling system that conveys information about quality and breath of relevance. I know, for instance, that I’ll be interested in almost any paper in The American Naturalist*. That the paper was judged (by peer reviewers and editors) suitable for that journal tells me two things: that it’s very good, and that it has broad implications beyond its particular topic (so I might want to read it even if it isn’t exactly in my own sub-sub-discipline). Take away that peer-review-provided signalling, and what’s left? A firehose of undifferentiated preprints, thousands of them, that are all equal candidates for my limited reading time (such that it exists). I can’t read them all (nobody can), so I have just two options: identify things to read by keyword alerts (which work only if very narrowly focused**), or identify them by author alerts. In other words, in the absence of other signals, I’ll read papers authored by people who I already know write interesting and important papers.”

In a nutshell, Stephen turns the entire argument for PPPR and publishing reform on its head. High impact[1] journals don’t represent elitism; rather they provide the no name rising young scientist a chance to have their work read and cited. This argument really made me pause for a second as it represents the polar opposite of almost my entire worldview on the scientific game and academic publishing. In my view, top-tier journals represent an entrenched system of elitism masquerading as meritocracy. They make arbitrary, journalistic decisions that exert intense power over career advancement. If anything the self-publication revolution represents the ability of a ‘nobody’ to shake the field with a powerful argument or study.

Needless to say I was at first shocked to see this argument supported by a number of other scientists on Twitter, who felt that it represented “everything wrong with the anti-journal rhetoric” spouted by loons such as myself. But then I remembered that in fact this is a version of an argument I hear almost weekly when similar discussions come up with colleagues. Ever since I wrote my pie-in-the sky self-publishing manifesto (don’t call it a manifesto!), I’ve been subjected (and rightly so!) to a kind of trial-by-peers as a de facto representative of the ‘revolution’. Most recently I was even cornered at a holiday party by a large and intimidating physicist who yelled at me that I was naïve and that “my system” would never work, for almost the exact reasons raised in Stephen’s post. So lets take a look at what these common worries are.

The Filter Problem

Bar none the first, most common complaint I hear when talking about various forms of publication reform is the “filter problem”. Stephen describes the fear quite succinctly; how will we ever find the stuff worth reading when the data deluge hits? How can we sort the wheat from the chaff, if journals don’t do it for us?

I used to take this problem seriously, and try to dream up all kinds of neato reddit-like schemes to solve it. But the truth is, it just represents a way of thinking that is rapidly becoming irrelevant. Journal based indexing isn’t a useful way to find papers. It is one signal in a sea of information and it isn’t at all clear what it actually represents. I feel like people who worry about the filter bubble tend to be more senior scientists who already struggle to keep up with the literature. For one thing, science is marked by an incessant march towards specialization. The notion that hundreds of people must read and cite our work for it to be meaningful is largely poppycock. The average paper is mostly technical, incremental, and obvious in nature. This is absolutely fine and necessary – not everything can be ground breaking and even the breakthroughs must be vetted in projects that are by definition less so. For the average paper then, being regularly cited by 20-50 people is damn good and likely represents the total target audience in that topic area. If you network to those people using social media and traditional conferences, it really isn’t hard to get your paper in their hands.

Moreover, the truly ground breaking stuff will find its audience no matter where it is published. We solve the filter problem every single day, by publically sharing and discussing papers that interest us. Arguing that we need journals to solve this problem ignores the fact that they obscure good papers behind meaningless brands, and more importantly, that scientists are perfectly capable of identifying excellent papers from content alone. You can smell a relevant paper from a mile away – regardless of where it is published! We don’t need to wait for some pie in the sky centralised service to solve this ‘problem’ (although someday once the dust settles i’m sure such things will be useful). Just go out and read some papers that interest you! Follow some interesting people on twitter. Develop a professional network worth having! And don’t buy into the idea that the whole world must read your paper for it to be worth it.

The Privilege Problem 

Ok, so lets say you agree with me to this point. Using some combination of email, social media, alerts, and RSS you feel fully capable of finding relevant stuff for your research (I do!). But your worried about this brave new world where people archive any old rubbish they like and embittered post-docs descend to sneer gleefully at it from the dark recesses of pubpeer. Won’t the new system be subject to favouritism, cults of personality, and the privilege of the elite? As Stephen says, isn’t it likely that popular persons will have their papers reviewed and promoted and all the rest will fade to the back?

The answer is yes and no. As I’ve said many times, there is no utopia. We can and must fight for a better system, but cheaters will always find away[2]. No matter how much transparency and rigor we implement, someone is going to find a loophole. And the oldest of all loopholes is good old human-corruption and hero worship. I’ve personally advocated for a day when data, code, and interpretation are all separate publishable, citable items that each contribute to ones CV. In this brave new world PPPRs would be performed by ‘review cliques’ who build up their reputation as reliable reviewers by consistently giving high marks to science objects that go on to garner acclaim, are rarely retracted, and perform well on various meta-analytic robustness indices (reproducibility, transparency, documentation, novelty, etc). They won’t replace or supplant pre-publication peer review. Rather we can ‘let a million flowers bloom’. I am all for a continuum of rigor, ranging from preregistered, confirmatory research with pre and post peer review, to fully exploratory, data driven science that is simply uploaded to a repository with a ‘use at your peril’ warning’. We don’t need to pit one reform tool against another; the brave new world will be a hybrid mixture of every tool we have at our disposal. Such a system would be massively transparent, but of course not perfect. We’d gain a cornucopia of new metrics by which to weight and reward scientists, but assuredly some clever folks would benefit more than others. We need to be ready when that day comes, aware of whatever pitfalls may bely our brave new science.

Welcome to the Wild West

Honestly though, all this kind of talk is just pointless. We all have our own opinions of what will be the best way to do science, or what will happen. For my own part I am sure some version of this sci-fi depiction is inevitable. But it doesn’t matter because the revolution is here, it’s now, it’s changing the way we consume and produce science right before our very eyes. Every day a new preprint lands on twitter with a massive splash. Just last week in my own field of cognitive neuroscience a preprint on problems in cluster inference for fMRI rocked the field, threatening to undermine thousands of existing papers while generating heated discussion in the majority of labs around the world. The week before that #cingulategate erupted when PNAS published a paper which was met with instant outcry and roundly debunked by an incredibly series of thorough post-publication reviews. A multitude of high-profile fraud cases have been exposed, and careers ended, via anonymous comments on pubpeer. People are out there, right now finding and sharing papers, discussing the ones that matter, and arguing about the ones that don’t. The future is now and we have almost no idea what shape it is taking, who the players are, or what it means for the future of funding and training. We need to stop acting like this is some fantasy future 10 years from now; we have entered the wild west and it is time to discuss what that means for science.

Authors note: In case it isn’t clear, i’m quite glad that Stephen raised the important issue of privilege. I am sure that there are problems to be rooted out and discussed along these lines, particularly in terms of the way PPPR and filtering is accomplished now in our wild west. What I object to is the idea that the future will look like it does now; we must imagine a future where science is radically improved!

[1] I’m not sure if Stephen meant high impact as I don’t know the IF of American Naturalist, maybe he just meant ‘journals I like’.

[2] Honestly this is where we need to discuss changing the hyper-capitalist system of funding and incentives surrounding publication but that is another post entirely! Maybe people wouldn’t cheat so much if we didn’t pit them against a thousand other scientists in a no-holds-barred cage match to the death.

Predictive coding and how the dynamical Bayesian brain achieves specialization and integration

Authors note: this marks the first in a new series of journal-entry style posts in which I write freely about things I like to think about. The style is meant to be informal and off the cuff, building towards a sort of socratic dialogue. Please feel free to argue or debate any point you like. These are meant to serve as exercises in writing and thinking,  to improve the quality of both and lay groundwork for future papers. 

My wife Francesca and I are spending the winter holidays vacationing in the north Italian countryside with her family. Today in our free time our discussions turned to how predictive coding and generative models can accomplish the multimodal perception that characterizes the brain. To this end Francesca asked a question we found particularly thought provoking: if the brain at all levels is only communicating forward what is not predicted (prediction error), how can you explain the functional specialization that characterizes the different senses? For example, if each sensory hierarchy is only communicating prediction errors, what explains their unique specialization in terms of e.g. the frequency, intensity, or quality of sensory inputs? Put another way, how can the different sensations be represented, if the entire brain is only communicating in one format?

We found this quite interesting, as it seems straightforward and yet the answer lies at the very basis of predictive coding schemes. To arrive at an answer we first had to lay a little groundwork in terms of information theory and basic neurobiology. What follows is a grossly oversimplified account of the basic neurobiology of perception, which serves only as a kind of philosopher’s toy example to consider the question. Please feel free to correct any gross misunderstandings.

To begin, it is clear at least according to Shannon’s theory of information, that any sensory property can be encoded in a simple system of ones and zeros (or nerve impulses). Frequency, time, intensity, and so on can all be re-described in terms of a simplistic encoding scheme. If this were not the case then modern television wouldn’t work. Second, each sensory hierarchy presumably  begins with a sensory effector, which directly transduces physical fluctuations into a neuronal code. For example, in the auditory hierarchy the cochlea contains small hairs that vibrate only to a particular frequency of sound wave. This vibration, through a complex neuro-mechanic relay, results in a tonitopic depolarization of first order neurons in the spiral ganglion.

f1
The human cochlea, a fascinating neural-mechanic apparatus to directly transduce air vibrations into neural representations.

It is here at the first-order neuron where the hierarchy presumably begins, and also where functional specialization becomes possible. It seems to us that predictive coding should say that the first neuron is simply predicting a particular pattern of inputs, which correspond directly to an expected external physical property. To try and give a toy example, say we present the brain with a series of tones, which reliably increase in frequency at 1 Hz intervals. At the lowest level the neuron will fire at a constant rate if the frequency at interval n is 1 greater than the previous interval, and will fire more or less if the frequency is greater or less than this basic expectation, creating a positive or negative prediction error (remember that the neuron should only alter its firing pattern if something unexpected happens). Since frequency here is being signaled directly by the mechanical vibration of the cochlear hairs; the first order neuron is simply predicting which frequency will be signaled. More realistically, each sensory neuron is probably only predicting whether or not a particular frequency will be signaled – we know from neurobiology that low-level neurons are basically tuned to a particular sensory feature, whereas higher level neurons encode receptive fields across multiple neurons or features. All this is to say that the first-order neuron is specialized for frequency because all it can predict is frequency; the only afferent input is the direct result of sensory transduction. The point here is that specialization in each sensory system arises in virtue of the fact that the inputs correspond directly to a physical property.

f2
Presumably, first order neurons predict the presence or absence of a particular, specialized sensory feature owing to their input. Credit: wikipedia.

Now, as one ascends higher in the hierarchy, each subsequent level is predicting the activity of the previous. The first-order neuron predicts whether a given frequency is presented, the second perhaps predicts if a receptive field is activated across several similarly tuned neurons, the third predicts a particular temporal pattern across multiple receptive fields, and so on. Each subsequent level is predicting a “hyperprior” encoding a higher order feature of the previous level. Eventually we get to a level where the prediction is no longer bound to a single sensory domain, but instead has to do with complex, non-linear interactions between multiple features. A parietal neuron thus might predict that an object in the world is a bird if it sings at a particular frequency and has a particular bodily shape.

f3
The motif of hierarchical message passing which encompasses the nervous system, according the the Free Energy principle.

If this general scheme is correct, then according to hierarchical predictive coding functional specialization primarily arises in virtue of the fact that at the lowest level each hierarchy is receiving inputs that strictly correspond to a particular feature. The cochlea is picking up fluctuations in air vibration (sound), the retina is picking up fluctuations in light frequency (light), and the skin is picking up changes in thermal amplitude and tactile frequency (touch). The specialization of each system is due to the fact that each is attempting to predict higher and higher order properties of those low-level inputs, which are by definition particular to a given sensory domain. Any further specialization in the hierarchy must then arise from the fact that higher levels of the brain predict inputs from multiple sensory systems – we might find multimodal object-related areas simply because the best hyper-prior governing nonlinear relationships between frequency and shape is an amodal or cross-model object. The actual etiology of higher-level modules is a bit more complicate than this, and requires an appeal to evolution to explain in detail, but we felt this was a generally sufficient explanation of specialization.

Nonlinearity of the world and perception: prediction as integration

At this point, we felt like we had some insight into how predictive coding can explain functional specialization without needing to appeal to special classes of cortical neurons for each sensation. Beyond the sensory effectors, the function of each system can be realized simply by means of a canonical, hierarchical prediction of each layered input, right down to the point of neurons which predict which frequency will be signaled. However, something still was missing, prompting Francesca to ask – how can this scheme explain the coherent, multi-modal, integrated perception, which characterizes conscious experience?

Indeed, we certainly do not experience perception as a series of nested predictions. All of the aforementioned machinery functions seamlessly beyond the point of awareness. In phenomenology a way to describe such influences is as being prenoetic (before knowing; see also prereflective); i.e. things that influence conscious experience without themselves appearing in experience. How then can predictive coding explain the transition from segregated, feature specific predictions to the unified percept we experience?

f4
When we arrange sensory hierarchies laterally, we see the “markov blanket” structure of the brain emerge. Each level predicts the control parameters of subsequent levels. In this way integration arises naturally from the predictive brain.

As you might guess, we already hinted at part of the answer. Imagine if instead of picturing each sensory hierarchy as an isolated pyramid, we instead arrange them such that each level is parallel to its equivalent in the ‘neighboring’ hierarchy. On this view, we can see that relatively early in each hierarchy you arrive at multi-sensory neurons that are predicting conjoint expectations over multiple sensory inputs. Conveniently, this observation matches what we actually know about the brain; audition, touch, and vision all converge in tempo-parietal association areas.

Perceptual integration is thus achieved as easily as specialization; it arises from the fact that each level predicts a hyperprior on the previous level. As one moves upwards through the hierarchy, this means that each level predicts more integrated, abstract, amodal entities. Association areas don’t predict just that a certain sight or sound will appear, but instead encode a joint expectation across both (or all) modalities. Just like the fusiform face area predicts complex, nonlinear conjunctions of lower-level visual features, multimodal areas predict nonlinear interactions between the senses.

f5
A half-cat half post, or a cat behind a post? The deep convolutional nature of the brain helps us solve this and similar nonlinear problems.

It is this nonlinearity that makes predictive schemes so powerful and attractive. To understand why, consider the task the brain must solve to be useful. Sensory impressions are not generated by simple linear inputs; certainly for perception to be useful to an organism it must process the world at a level that is relevant for that organism. This is the world of objects, persons, and things, not disjointed, individual sensory properties. When I watch a cat walk behind a fence, I don’t perceive it as two halves of a cat and a fence post, but rather as a cat hidden behind a fence. These kinds of nonlinear interactions between objects and properties of the world are ubiquitous in perception; the brain must solve not for the immediately available sensory inputs but rather the complex hidden causes underlying them. This is achieved in a similar manner to a deep convolutional network; each level performs the same canonical prediction, yet together the hierarchy will extract the best-hidden features to explain the complex interactions that produce physical sensations. In this way the predictive brain summersaults the binding problem of perception; perception is integrated precisely because conjoint hypothesis are better, more useful explanations than discrete ones. As long as the network has sufficient hierarchical depth, it will always arrive at these complex representations. It’s worth noting we can observe the flip-side of this process in common visual illusions, where the higher-order percept or prior “fills in” our actual sensory experience (e.g. when we perceive a convex circle as being lit from above).

teaser-convexconcave-01
Our higher-level, integrative priors “fill in” our perception.

Beating the homunculus: the dynamic, enactive Bayesian brain

Feeling satisfied with this, Francesca and I concluded our fun holiday discussion by thinking about some common misunderstandings this scheme might lead one into. For example, the notion of hierarchical prediction explored above might lead one to expect that there has to be a “top” level, a kind of super-homunculus who sits in the prefrontal cortex, predicting the entire sensorium. This would be an impossible solution; how could any subsystem of the brain possibly predict the entire activity of the rest? And wouldn’t that level itself need to be predicted, to be realised in perception, leading to infinite regress? Luckily the intuition that these myriad hypotheses must “come together” fundamentally misunderstands the Bayesian brain.

Remember that each level is only predicting the activity of that before it. The integrative parietal neuron is not predicting the exact sensory input at the retina; rather it is only predicting what pattern of inputs it should receive if the sensory input is an apple, or a bat, or whatever. The entire scheme is linked up this way; the individual units are just stupid predictors of immediate input. It is only when you link them all up together in a deep network, that the brain can recapitulate the complex web of causal interactions that make up the world.

This point cannot be stressed enough: predictive coding is not a localizationist enterprise. Perception does not come about because a magical brain area inverts an entire world model. It comes about in virtue of the distributed, dynamic activity of the entire brain as it constantly attempts to minimize prediction error across all levels. Ultimately the “model” is not contained “anywhere” in the brain; the entire brain itself, and the full network of connection weights, is itself the model of the world. The power to predict complex nonlinear sensory causes arises because the best overall pattern of interactions will be that which most accurately (or usefully) explains sensory inputs and the complex web of interactions which causes them. You might rephrase the famous saying as “the brain is it’s own best model of the world”.

As a final consideration, it is worth noting some misconceptions may arise from the way we ourselves perform Bayesian statistics. As an experimenter, I formalize a discrete hypothesis (or set of hypotheses) about something and then invert that model to explain data in a single step. In the brain however the “inversion” is just the constant interplay of input and feedback across the nervous system at all levels. In fact, under this distributed view (at least according to the Free Energy Principle), neural computation is deeply embodied, as actions themselves complete the inferential flow to minimize error. Thus just like neural feedback, actions function as  ‘predictions’, generated by the inferential mechanism to render the world more sensible to our predictions. This ultimately minimises prediction error just as internal model updates do, albeit in a different ‘direction of fit’ (world to model, instead of model to world). In this way the ‘model’ is distributed across the brain and body; actions themselves are as much a part of the computation as the brain itself and constitute a form of “active inference”. In fact, if one extends their view to evolution, the morphological shape of the organism is itself a kind of prior, predicting the kinds of sensations, environments, and actions the agent is likely to inhabit. This intriguing idea will be the subject of a future blog post.

Conclusion

We feel this is an extremely exciting view of the brain. The idea that an organism can achieve complex intelligence simply by embedding a simple repetitive motif within a dynamical body seems to us to be a fundamentally novel approach to the mind. In future posts and papers, we hope to further explore the notions introduced here, considering questions about “where” these embodied priors come from and what they mean for the brain, as well as the role of precision in integration.

Questions? Comments? Feel like i’m an idiot? Sound off in the comments!

Further Reading:

Brown, H., Adams, R. A., Parees, I., Edwards, M., & Friston, K. (2013). Active inference, sensory attenuation and illusions. Cognitive Processing, 14(4), 411–427. http://doi.org/10.1007/s10339-013-0571-3
Feldman, H., & Friston, K. J. (2010). Attention, Uncertainty, and Free-Energy. Frontiers in Human Neuroscience, 4. http://doi.org/10.3389/fnhum.2010.00215
Friston, K., Adams, R. A., Perrinet, L., & Breakspear, M. (2012). Perceptions as Hypotheses: Saccades as Experiments. Frontiers in Psychology, 3. http://doi.org/10.3389/fpsyg.2012.00151
Friston, K., & Kiebel, S. (2009). Predictive coding under the free-energy principle. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 364(1521), 1211–1221. http://doi.org/10.1098/rstb.2008.0300
Friston, K., Thornton, C., & Clark, A. (2012). Free-Energy Minimization and the Dark-Room Problem. Frontiers in Psychology, 3. http://doi.org/10.3389/fpsyg.2012.00130
Moran, R. J., Campo, P., Symmonds, M., Stephan, K. E., Dolan, R. J., & Friston, K. J. (2013). Free Energy, Precision and Learning: The Role of Cholinergic Neuromodulation. The Journal of Neuroscience, 33(19), 8227–8236. http://doi.org/10.1523/JNEUROSCI.4255-12.2013

 

How useful is twitter for academics, really?

Recently I was intrigued by a post on twitter conversion rates (e.g. the likelihood that a view on your tweet results in a click on the link) by journalist Derek Thompson at the Atlantic. Derek writes that although using twitter gives him great joy, he’s not sure it results in the kinds of readership his employers would feel merits the time spent on the service. Derek found that even his most viral tweets only resulted in a conversion rate of about 3% – on par with the click-through rate of east asian display ads (i.e. quite poorly in the media world). Using the recently released twitter metrics, Derek found an average conversion of around 1.5% with the best posts hitting the 3% ceiling. Ultimately he concludes that twitter seems to be great at generating buzz within the twitter-sphere but performs poorly at translating that buzz into external influence.

This struck my curiosity, as it definitely reflected my own experience tweeting out papers and tracking the resultant clicks on the actual paper itself. However, the demands of academia are quite different than that of corporate media. In my experience ‘good’ posts do exactly result in a 2-3% conversion rate, or about 30 clicks on the DOI link for every 1000 views. A typical post I consider ‘successful’ will net about 5-8k views and thus 150-200 clicks. Below are some samples of my most ‘successful’ paper tweets this month, with screen grabs of the twitter analytics for each:

Screenshot 2015-12-04 11.42.45

Screenshot 2015-12-04 11.44.00

Screenshot 2015-12-04 11.44.41

Screenshot 2015-12-04 11.45.29

Sharing each of these papers resulted in a conversion rate of about 2%, roughly in line with Derek’s experience. These are all what I would consider ‘successful’ shares, at least for me, with > 100 engagements each. You can also see that in total, external engagement (i.e., clicking the link to the paper) is below that of ‘internal’ engagement (likes, RTs, expands etc). So it does appear that on the whole twitter shares may generate a lot of internal ‘buzz’ but not necessarily reach very far beyond twitter.

For a corporate sponsor, these conversion rates are unacceptable. But for academics, I would argue the ceiling of the actually interested audience is somewhere around 200 people, which corresponds pretty well with the average paper clicks generated by successful posts. Academics are so highly specialized that i’d wager citation success is really more about making sure your paper falls into the right hands, rather than that of people working in totally different areas. I’d suggest that even for landmark ‘high impact’ papers eventual success will still be predicted more by the adoption of your select core peer group (i.e. other scientists who study vegetarian dinosaur poop in the Himalayan range). In my anecdotal experience, I would say that I more regularly find papers that grab my interest on twitter than any other experience. Moreover, unlike ‘self finds’, it seems to me papers found on twitter are more likely to be outside my immediate wheelhouse – statistics publications, genetics, etc. This is an important, if difficult to quantify type of impact.

In general, we have  to ask what exactly is a 2-3% conversion rate worth? If 200 people click my paper link, are any of them actually reading it? To probe this a bit further I used twitters new survey tool, which recently added support for multiple options, to ask my followers about how often they read papers found on twitter:

Screenshot 2015-12-04 10.58.42.png

As you can see, out of 145 responses more than 50% in total said they read papers found on twitter “Occasionally” (52%) or “Frequently” (30%). If these numbers are at all representative, I think they are pretty reassuring for the academic twitter user. As many as ~45 out of ~150 respondents say they read papers on twitter “frequently” suggesting the service has become a major source for finding interesting papers, at least among its users. All together my take away is that while you shouldn’t expect to beat the general 3% curve, the ability to get your published work on the desks of as many as 50-100 members of your core audience is pretty powerful. This is a more tangible result than ‘engagements’ or conversion rates.

Finally, it’s worth noting that the picture on how this all relates to citation behavior is murky at best. A quick surf of the scientific literature on correlating citation rate and social media exposure is inconclusive at best. Two papers found by Neurocritic are examplars of my impression of this literature, with one claiming a large effect size and the other claiming none at all. In the end I suspect how useful twitter is for sharing research really depends on several factors including your field (e.g. probably more useful for machine learning than organic chemistry) and something i’d vaguely define as ‘network quality’. Ultimately I suspect the rule is quality of followers over quantity; if your end goal is to get your papers in the hands of a round 200 engaged readers (which twitter can do for you) then having a network that actually includes those people is probably worth more than being a ‘Kardashian’ of social media.

 

Integration dynamics and choice probabilities

Very informative post – “Integration dynamics and choice probabilities”

Pillow Lab Blog

Recently in lab meeting, I presented

Sensory integration dynamics in a hierarchical network explains choice probabilities in cortical area MT

Klaus Wimmer, Albert Compte, Alex Roxin, Diogo Peixoto, Alfonso Renart & Jaime de la Rocha. Nature Communications, 2015

Wimmer et al. reanalyze and reinterpret a classic dataset of neural recordings from MT while monkeys perform a motion discrimination task. The classic result shows that the firing rates of neurons in MT are correlated with the monkey’s choice, even when the stimulus is the same. This covariation of neural activity and choice, termed choice probability, could indicate sensory variability causing behavioral variability or it could result from top-down signals that reflect the monkey’s choice. To investigate the source of choice probabilities, the authors use a two-stage, hierarchical network model of integrate and fire neurons tuned to mimic the dynamics of MT and LIP neurons and compare the model to what they find…

View original post 436 more words

A Needle in the Connectome: Neural ‘Fingerprint’ Identifies Individuals with ~93% accuracy

Much like we picture ourselves, we tend to assume that each individual brain is a bit of a unique snowflake. When running a brain imaging experiment it is common for participants or students to excitedly ask what can be revealed specifically about them given their data. Usually, we have to give a disappointing answer – not all that much, as neuroscientists typically throw this information away to get at average activation profiles set in ‘standard’ space. Now a new study published today in Nature Neuroscience suggests that our brains do indeed contain a kind of person-specific fingerprint, hidden within the functional connectome. Perhaps even more interesting, the study suggests that particular neural networks (e.g. frontoparietal and default mode) contribute the greatest amount of unique information to your ‘neuro-profile’ and also predict individual differences in fluid intelligence.

To do so lead author Emily Finn and colleagues at Yale University analysed repeated sets of functional magnetic resonance imaging (fMRI) data from 128 subjects over 6 different sessions (2 rest, 4 task), derived from the Human Connectome Project. After dividing each participant’s brain data into 268 nodes (a technique known as “parcellation”), Emily and colleagues constructed matrices of the pairwise correlation between all nodes. These correlation matrices (below, figure 1b), which encode the connectome or connectivity map for each participant, were then used in a permutation based decoding procedure to determine how accurately a participant’s connectivity pattern could be identified from the rest. This involved taking a vector of edge values (connection strengths) from a participant in the training set and correlating it with a similar vector sampled randomly with replacement from the test set (i.e. testing whether one participant’s data correlated with another’s). Pairs with the highest correlation where then labelled “1” to indicate that the algorithm assigned a matching identity between a particular train-test pair. The results of this process were then compared to a similar one in which both pairs and subject identity were randomly permuted.

Finn et al's method for identifying subjects from their connectomes.
Finn et al’s method for identifying subjects from their connectomes.

At first glance, the results are impressive:

Identification was performed using the whole-brain connectivity matrix (268 nodes; 35,778 edges), with no a priori network definitions. The success rate was 117/126 (92.9%) and 119/126 (94.4%) based on a target-database of Rest1-Rest2 and the reverse Rest2-Rest1, respectively. The success rate ranged from 68/126 (54.0%) to 110/126 (87.3%) with other database and target pairs, including rest-to-task and task-to-task comparisons.

This is a striking result – not only could identity be decoded from one resting state scan to another, but the identification also worked when going from rest to a variety of tasks and vice versa. Although classification accuracy dropped when moving between different tasks, these results were still highly significant when compared to the random shuffle, which only achieved a 5% success rate. Overall this suggests that inter-individual patterns in connectivity are highly reproducible regardless of the context from which they are obtained.

The authors then go on to perform a variety of crucial control analyses. For example, one immediate worry is that that the high identification might be driven by head motion, which strongly influences functional connectivity and is likely to show strong within-subject correlation. Another concern might be that the accuracy is driven primarily by anatomical rather than functional features. The authors test both of these alternative hypotheses, first by applying the same decoding approach to an expanded set of root-mean square motion parameters and second by testing if classification accuracy decreased as the data were increasingly smoothed (which should eliminate or reduce the contribution of anatomical features). Here the results were also encouraging: motion was totally unable to predict identity, resulting in less than 5% accuracy, and classification accuracy remained essentially the same across smoothing kernels. The authors further tested the contribution of their parcellation scheme to the more common and coarse-grained Yeo 8-network solution. This revealed that the coarser network division seemed to decrease accuracy, particularly for the fronto-parietal network, a decrease that was seemingly driven by increased reliability of the diagonal elements of the inter-subject matrix (which encode the intra-subject correlation). The authors suggest this may reflect the need for higher spatial precision to delineate individual patterns of fronto-parietal connectivity. Although this intepretation seems sensible, I do have to wonder if it conflicts with their smoothing-based control analysis. The authors also looked at how well they could identify an individual based on the variability of the BOLD signal in each region and found that although this was also significant, it showed a systematic decrease in accuracy compared to the connectomic approach. This suggests that although at least some of what makes an individual unique can be found in activity alone, connectivity data is needed for a more complete fingerprint. In a final control analysis (figure 2c below), training simultaneously on multiple data sets (for example a resting state and a task, to control inherent differences in signal length) further increased accuracy to as high as 100% in some cases.

Finn et al; networks showing most and least individuality and contributing factors.
Finn et al; networks showing most and least individuality and contributing factors. Interesting to note that sensory areas are highly common across subjects whereas fronto-parietal and mid-line show the greatest individuality!

Having established the robustness of their connectome fingerprints, Finn and colleagues then examined how much each individual cortical node contributed to the identification accuracy. This analysis revealed a particularly interesting result; frontal-parietal and midline (‘default mode’) networks showed the highest contribution (above, figure 2a), whereas sensory areas appeared to not contribute at all. This compliments their finding that the more coarse grained Yeo parcellation greatly reduced the contribution of these networks to classificaiton accuracy. Further still, Finn and colleagues linked the contributions of these networks to behavior, examining how strongly each network fingerprint predicted an overall index of fluid intelligence (g-factor). Again they found that fronto-parietal and default mode nodes were the most predictive of inter-individual differences in behaviour (in opposite directions, although I’d hesitate to interpret the sign of this finding given the global signal regression).

So what does this all mean? For starters this is a powerful demonstration of the rich individual information that can be gleaned from combining connectome analyses with high-volume data collection. The authors not only showed that resting state networks are highly stable and individual within subjects, but that these signatures can be used to delineate the way the brain responds to tasks and even behaviour. Not only is the study well powered, but the authors clearly worked hard to generalize their results across a variety of datasets while controlling for quite a few important confounds. While previous studies have reported similar findings in structural and functional data, I’m not aware of any this generalisable or specific. The task-rest signature alone confirms that both measures reflect a common neural architecture, an important finding. I could be a little concerned about other vasculature or breath-related confounds; the authors do remove such nuisance variables though, so this may not be a serious concern (though I am am not convinced their use of global signal regression to control these variables is adequate). These minor concerns none-withstanding, I found the network-specific results particularly interesting; although previous studies indicate that functional and structural heterogeneity greatly increases along the fronto-parietal axis, this study is the first demonstration to my knowledge of the extremely high predictive power embedded within those differences. It is interesting to wonder how much of this stability is important for the higher-order functions supported by these networks – indeed it seems intuitive that self-awareness, social cognition, and cognitive control depend upon acquired experiences that are highly individual. The authors conclude by suggesting that future studies may evaluate classification accuracy within an individual over many time points, raising the interesting question: Can you identify who I am tomorrow by how my brain connects today? Or am I “here today, gone tomorrow”?

Only time (and connectomics) may tell…


 

edit:

thanks to Kate Mills for pointing out this interesting PLOS ONE paper from a year ago (cited by Finn et al), that used similar methods and also found high classification accuracy, albeit with a smaller sample and fewer controls:

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0111048

 

edit2:

It seems there was a slight mistake in my understanding of the methods – see this useful comment by lead author Emily Finn for clarification:

https://neuroconscience.com/2015/10/12/a-needle-in-the-connectome-neural-fingerprint-identifies-individuals-with-93-accuracy/#comment-36506


corrections? comments? want to yell at me for being dumb? Let me know in the comments or on twitter @neuroconscience!

Depressing Quotes on Science Overflow – Reputation is the Gateway to Scientific Success

If you haven’t done so yet, go read this new E-life paper on scientific overflow, now. The authors interviewed “20 prominent principal investigators in the US, each with between 20 and 60 years of experience of basic biomedical research”, asking questions about how they view and deal with the exponential increase in scientific publications:

Our questions were grouped into four sections: (1) Have the scientists interviewed observed a decrease in the trustworthiness of science in their professional community and, if so, what are the main factors contributing to these perceptions? (2) How do the increasing concerns about the lack of robustness of scientific research affect trust in research? (3) What concerns do scientists have about science as a system? (4) What steps can be taken to ensure the trustworthiness of scientific research?

Some of the answers offer a strikingly sad view of the current state of the union:

On new open access journals, databases, etc:

There’s this proliferation of journals, a huge number of journals… and I tend not even to pay much attention to the work in some of these journals. (…) And you’re always asked to be an editor of some new journal. (…) I don’t pay much attention to them.

On the role of reputation in assessing scientific rigor and quality:

There are some people that I know to be really rigorous scientists whose work is consistently well done (…). If a paper came from a certain lab then I’m more likely to believe it than another paper that might have come from a different lab whose (…) head might be somebody that I know tends to cut corners, over-blows their conclusions, doesn’t do rigorous experiments, doesn’t appreciate the value of proper controls.

If I know that there’s a very well established laboratory with a great body of substantiated work behind it I think there is a human part of me that is inclined to expect that past quality will always be predicting future quality I think it’s a normal human thing. I try not to let that knee–jerk reaction be too strong though.

If I don’t know the authors then I will have to look more carefully at the data and (…) evaluate whether (…) I feel that the experiments were done the way I would have done them and whether there were some, if there are glaring omissions that then cast out the results (…) I mean [if] I don’t know anything I’ve never met the person or I don’t know their background, I don’t know where they trained (…) I’ve never had a discussion with them about science so I’ve never had an opportunity to gauge their level of rigour…

Another interviewee expressed scepticism about the rapid proliferation of new journals:

The journal that [a paper] is published in does make a difference to me, … I’m talking about (…) an open access journal that was started one year ago… along with five hundred other journals, (…) literally five hundred other journals, and that’s where it’s published, I have doubts about the quality of the peer review.

The cancer eating science is plain to see. If you don’t know the right people, your science is going to be viewed less favorably. If you don’t publish in the right journals, i’m not going to trust your science. It’s a massive self-feeding circle of power. The big rich labs will continue to get bigger and richer as their papers and grant applications will be treated preferentially. This massive mess of heuristic biases is turning academia into a straight up pyramid scheme. Of course, this is but a small sub-sample of the scientific community, but I can’t help but feel like these views represent a widespread opinion among the ‘old guard’ of science. Anecdotally these comments certainly mirror some of my own experiences. I’m curious to hear what others think.

The Bayesian Reproducibility Project

Fantastic post by Alexander Etz (@AlxEtz), which uses a Bayes Factor approach to summarise the results of the reproducibility project. Not only a great way to get a handle on those data but also a great introduction to Bayes Factors in general!

The Etz-Files

The Reproducibility Project was finally published this week in Science, and an outpouring ofmedia articles followed. Headlines included “More Than 50% Psychology Studies Are Questionable: Study”, “Scientists Replicated 100 Psychology Studies, and Fewer Than Half Got the Same Results”, and “More than half of psychology papers are not reproducible”.

Are these categorical conclusions warranted? If you look at the paper, it makes very clear that the results do not definitively establish effects as true or false:

After this intensive effort to reproduce a sample of published psychological findings, how many of the effects have we established are true? Zero. And how many of the effects have we established are false? Zero. Is this a limitation of the project design? No. It is the reality of doing science, even if it is not appreciated in daily practice. (p. 7)

Very well said. The point of this project was not…

View original post 3,933 more words

A walk in the park increases poor research practices and decreases reviewer critical thinking

Or so i’m going to claim because science is basically about making up whatever qualitative opinion you like and hard-selling it to a high impact journal right? Last night a paper appeared in PNAS early access entitled “Nature experience reduces rumination and subgenual prefrontal cortex activation” as a contributed submission. Like many of you I immediately felt my neurocringe brain area explode with activity as I began to smell the sickly sweet scent of gimmickry. Now I don’t have a lot of time so I was worried I wouldn’t be able to cover this paper in any detail. But never to worry, because the entire paper is literally two ANOVAs!

Don't think about it too much.
Look guys, we’re headed to PNAS! No, no, leave the critical thinking skills, we won’t be needing those where we’re going!

The paper begins with a lofty appeal to our naturalistic sensibilities; we’re increasingly living in urban areas, this trend is associated with poor mental health outcomes, and by golly-gee, shouldn’t we have a look at the brain to figure this all out? The authors set about testing their hypothesis by sending 19 people out into the remote wilderness of the Stanford University campus, or an urban setting:

The nature walk took place in a greenspace near Stanford University spanning an area ∼60 m northwest of Junipero Serra Boulevard and extending away from the street in a 5.3-km loop, including a significant stretch that is far (>1 km) from the sounds and sights of the surrounding residential area. As one proxy for urbanicity, we measured the proportion of impervious surface (e.g., asphalt, buildings, sidewalks) within 50 m of the center of the walking path (Fig. S4). Ten percent of the area within 50 m of the center of the path comprised of impervious surface (primarily of the asphalt path). Cumulative elevation gain of this walk was 155 m. The natural environment of the greenspace comprises open California grassland with scattered oaks and native shrubs, abundant birds, and occasional mammals (ground squirrels and deer). Views include neighboring, scenic hills, and distant views of the San Francisco Bay, and the southern portion of the Bay Area (including Palo Alto and Mountain View to the south, and Menlo Park and Atherton to the north). No automobiles, bicycles, or dogs are permitted on the path through the greenspace.

Wow, where can I sign up for this truly Kerouac-inspired bliss? The control group on the other hand had to survive the horrors of the palo-alto urban wasteland:

The urban walk took place on the busiest thoroughfare in nearby Palo Alto (El Camino Real), a street with three to four lanes in each direction and a steady stream of traffic. Participants were instructed to walk down one side of the street in a southeasterly direction for 2.65 km, before turning around at a specific point marked on a map. This spot was chosen as the midpoint of the walk for the urban walk to match the nature walk with respect to total distance and exercise. Participants were instructed to cross the street at a pedestrian crosswalk/stoplight, and return on the other side of the street (to simulate the loop component of the nature walk and greatly reduce repeated encounters with the same environmental stimuli on the return portion of the walk), for a total distance of 5.3 km; 76% of the area within 50mof the center of this section of El Camino was comprised of impervious surfaces (of roads and buildings) (Fig. S4). Cumulative elevation gain of this walk was 4 m. This stretch of road consists of a significant amount of noise from passing cars. Buildings are almost entirely single- to double-story units, primarily businesses (fast food establishments, cell phone stores, motels, etc.). Participants were instructed to remain on the sidewalk bordering the busy street and not to enter any buildings. Although this was the most urban area we could select for a walk that was a similar distance from the MRI facility as the nature walk, scattered trees were present on both sides of El Camino Real. Thus, our effects may represent a conservative estimate of effects of nature experience, as our urban group’s experience was not devoid of natural elements.

And they got that approved by the local ethics board? The horror!

The authors gave both groups a self-reported rumination questionnaire before and after the walk, and also acquired some arterial spin labeling MRIs. Here is where the real fun gets started – and basically ends – as the paper is almost entirely comprised of group by time ANOVAs on these two measures. I wish I could say I was suprised by what I found in the results:
whattf

That’s right folks – the key behavioral interaction of the paper – is non-significant. Measly. Minuscule. Forget about p-values for a second and consider the gall it takes to not only completely skim over this fact (nowhere in the paper is it mentioned) and head right to the delicious t-tests, but to egregiously promote this ‘finding’ in the title, abstract, and discussion as showing evidence for an effect of nature on rumination! Erroneous interaction for the win, at least with PNAS contributed submissions right?! The authors also analyzed the brain data in the same way – this time actually sticking with their NHST – and find that some brain area that has been previously related to some bad stuff showed reduced activity. And that – besides a heart rate and respiration control analyses – is it. No correlations with the (non-significant) behavior. Just pure and simple reverse inference piled on top of fallacious interpretation of a non-significant interaction. Never-mind the wonky and poorly operationalized research question!

See folks, high impact science is easy! Just have friends in the National Academy…

I’ll leave you with this gem from the methods:

“‘One participant was eliminated in analysis of self-reported rumination due to a decrease in rumination after nature experience that was 3 SDs below the mean.'”

That dude REALLY got his time’s worth from the walk. Or did the researchers maybe forget to check if anyone smoked a joint during their nature walk?

Are we watching a paradigm shift? 7 hot trends in cognitive neuroscience according to me

brainonfire

In the spirit of procrastination, here is a random list I made up of things that seem to be trending in cognitive neuroscience right now, with a quick description of each. These are purely pulled from the depths of speculation, so please do feel free to disagree. Most of these are not actually new concepts, it’s more about they way they are being used that makes them trendy areas.


7 hot trends in cognitive neuroscience according to me

Oscillations

Obviously oscillations have been around for a long time, but the rapid increase of technological sophistication for direct recordings (see for example high density cortical arrays and deep brain stimulation + recording) coupled with greater availability of MEG (plus rapid advance in MEG source reconstruction and analysis techniques) have placed large-scale neural oscillations at the forefront of cognitive neuroscience. Understanding how different frequency bands interact (e.g. phase coupling) has become a core topic of research in areas ranging from conscious awareness to memory and navigation.

Complex systems, dynamics, and emergence

Again, a concept as old as neuroscience itself, but this one seems to be piggy-backing on several trends towards a new resurgence. As neuroscience grows bored of blobology, and our analysis methods move increasingly towards modelling dynamical interactions (see above) and complex networks, our explanatory metaphors more frequently emphasize brain dynamics and emergent causation. This is a clear departure from the boxological approach that was so prevalent in the 80’s and 90’s.

Direct intervention and causal inference

Pseudo-invasive techniques like transcranial direct-current stimulation are on the rise, partially because they allow us to perform virtual lesion studies in ways not previously possible. Likewise, exponential growth of neurobiological and genetic techniques has ushered in the era of optogenetics, which allows direct manipulation of information processing at a single neuron level. Might this trend also reflect increased dissatisfaction with the correlational approaches that defined the last decade? You could also include steadily increasing interest in pharmacological neuroimaging under this category.

Computational modelling and reinforcement learning

With the hype surrounding Google’s £200 million acquisition of Deep Mind, and the recent Nobel Prize award for the discovery of grid cells, computational approaches to neuroscience are hotter than ever. Hardly a day goes by without a reinforcement learning or similar paper being published in a glossy high-impact journal. This one takes many forms but it is undeniable that model-based approaches to cognitive neuroscience are all the rage. There is also a clear surge of interest in the Bayesian Brain approach, which could almost have it’s own bullet point. But that would be too self serving  😉

Gain control

Gain control is a very basic mechanism found throughout the central nervous system. It can be understood as the neuromodulatory weighting of post-synaptic excitability, and is thought to play a critical role in contextualizing neural processing. Gain control might for example allow a neuron that usually encodes a positive prediction error to ‘flip’ its sign to encode negative prediction error under a certain context. Gain is thought to be regulated via the global interaction of neural modulators (e.g. dopamine, acetylcholine) and links basic information theoretic processes with neurobiology. This makes it a particularly desirable tool for understanding everything from perceptual decision making to basic learning and the stabilization of oscillatory dynamics. Gain control thus links computational, biological, and systems level work and is likely to continue to attract a lot of attention in the near future.

Hierarchies that are not really hierarchies

Neuroscience loves its hierarchies. For example, the Van Essen model of how visual feature detection proceeds through a hierarchy of increasingly abstract functional processes is one of the core explanatory tools used to understand vision in the brain. Currently however there is a great deal of connectomic and functional work pointing out interesting ways in which global or feedback connections can re-route and modulate processes from the ‘top’ directly to the ‘bottom’ or vice versa. It’s worth noting this trend doesn’t do away with the old notions of hierarchies, but instead just renders them a bit more complex and circular. Put another way, it is currently quite trendy to show ‘the top is the bottom’ and ‘the bottom is the top’. This partially relates to the increased emphasis on emergence and complexity discussed above. A related trend is extension of what counts as the ‘bottom’, with low-level subcortical or even first order peripheral neurons suddenly being ascribed complex abilities typically reserved for cortical processes.

Primary sensations that are not so primary

Closely related to the previous point, there is a clear trend in the perceptual sciences of being increasingly liberal about how ‘primary’ sensory areas really are. I saw this first hand at last year’s Vision Sciences Society which featured at least a dozen posters showing how one could decode tactile shape from V1, or visual frequency from A1, and so on. Again this is probably related to the overall movement towards complexity and connectionism; as we lose our reliance on modularity, we’re suddenly open to a much more general role for core sensory areas.


Interestingly I didn’t include things like multi-modal or high resolution imaging as I think they are still actually emerging and have not quite fully arrived yet. But some of these – computational and connectomic modelling for example – are clearly part and parcel of contemporary zeitgeist. It’s also very interesting to look over this list, as there seems to be a clear trend towards complexity, connectionism, and dynamics. Are we witnessing a paradigm shift in the making? Or have we just forgotten all our first principles and started mangling any old thing we can get published? If it is a shift, what should we call it? Something like ‘computational connectionism’ comes to mind. Please feel free to add points or discuss in the comments!

Short post – my science fiction vision of how science could work in the future

6922_072dSadly I missed the recent #isScienceBroken event at UCL, which from all reports was a smashing success. At the moment i’m just terribly focused on finishing up a series of intensive behavioral studies plus (as always) minimizing my free energy, so it just wasn’t possible to make it. Still, a few were interested to hear my take on things. I’m not one to try and commentate an event I wasn’t at, so instead i’ll just wax poetic for a moment about the kind of Science future i’d like to live in. Note that this has all basically been written down in my self-published article on the subject, but it might bear a re-hash as it’s fun to think about. As before, this is mostly adapted from Clay Shirky’s sci-fi vision of a totally autonomous and self-organizing science.

Science – OF THE FUTURE!

Our scene opens in the not-too distant future, say the year 2030. A gradual but steady trend towards self-publication has lead to the emergence of a new dominant research culture, wherein the vast majority of data first appear as self-archived digital manuscripts containing data, code, and descriptive-yet-conservative interpretations on centrally maintained, publicly supported research archives, prior to publication in traditional journals. These data would be subject to fully open pre-and post-publication peer review focused solely on the technical merit and clarity of the paper.

Having published your data in a totally standardized and transparent format, you would then go on write something more similar to what we currently formulate for high impact journals. Short, punchy, light on gory data details and heavy on fantastical interpretations. This would be your space to really sell what you think makes those data great – or to defend them against a firestorm of critical community comments. These would be submitted to journals like Nature and Science who would have the strictly editorial role of evaluating cohesiveness, general interest, novelty, etc. In some cases, those journals and similar entities (for example, autonomous high-reputation peer reviewing cliques) would actively solicit authors to submit such papers based on the buzz (good or bad) that their archived data had already generated. In principle multiple publishers could solicit submissions from the same buzzworthy data, effectively competing to have your paper in their journal. In this model, publishers must actively seek out the most interesting papers, fulfilling their current editorial role without jeopardizing crucial quality control mechanisms.

Is this crazy? Maybe. To be honest I see some version of this story as almost inevitable. The key bits and players may change, but I truly believe a ‘push-to-repo’ style science is an inevitable future. The key is to realize that even journals like Nature and Science play an important if lauded role, taking on editorial risk to highlight the sexiest (and least probable) findings. The real question is who will become the key players in shaping our new information economy. Will today’s major publishers die as Blockbuster did – too tied into their own profit schemes to mobilize – or will they be Netflix, adapting to the beat of progress?  By segregating the quality and impact functions of publication, we’ll ultimately arrive at a far more efficient and effective science. The question is how, and when.

note: feel free to point out in the comments examples of how this is already becoming the case (some are already doing this). 30 years is a really, really conservative estimate 🙂