
The rise of large-scale imaging studies in psychiatry


From the initial arguments over whether 12 to 20 subjects were sufficient for an fMRI study, sample sizes in psychiatric neuroimaging studies have expanded into the tens of thousands. These large-scale imaging studies fall into several categories, each of which has specific advantages and challenges. The different study types can be grouped by their level of control: meta-analyses, at one extreme of the spectrum, control nothing about the imaging protocol or subject selection criteria in the datasets they include. At the other extreme, planned multi-site mega studies devote intense effort to strictly standardizing protocols across sites. Several other combinations are possible, each of which is best suited to certain questions. The growing investment in all these studies is delivering on the promises of neuroimaging for psychiatry, and holds incredible potential for impact at the level of the individual patient. However, realizing this potential requires both standardized data-sharing efforts, so that datasets have more staying power for re-use and new applications, and training the next generation of neuropsychiatric researchers in “Big Data” techniques in addition to traditional experimental methods. The increased access to thousands of datasets, along with the needed informatics, demands a new emphasis on integrative scientific methods.



The history of Magnetic Resonance Imaging (MRI) studies in biomedical and psychological research is one of increasingly widespread and sophisticated applications. Initial publications reported a single subject or a handful of subjects; a classic paper [1] argued that at least 12 subjects were needed to identify an effect in functional MRI data, and analyses with fewer than 20 subjects are still common (e.g., [2]). In recent years, however, studies with 600 or more scans collected on a single scanner have appeared (e.g., [3]). The Enhancing Neuroimaging Genetics through Meta-Analysis (ENIGMA) meta-analysis approach used data from over 10,000 individuals, drawing on multiple legacy datasets and scanners [4]. There are a number of approaches to large-scale neuroimaging studies; they are not interchangeable, as they have complementary strengths and weaknesses. The growing tendency toward large-scale studies and data analysis also brings with it certain calls to action for the field of neuroimaging in clinical research as it moves into the realm of recognizable “Big Data” [5].

Why has large-scale imaging come about?

The statistical mantra is that more subjects means more power, and how many subjects are needed depends, of course, at least in part on the effect size of the question under study. To ascertain where in the brain functional MRI (fMRI) signal changes are related to different conditions in a simple cognitive task, 10–15 subjects may be sufficient; studies of the neural correlates of auditory hallucinations in psychotic populations have largely pulled from smaller samples of 1–10 [6], though larger studies of 15–30 have been painstakingly collected [7, 8]. The Functional Imaging Biomedical Informatics Research Network (FBIRN), in one of the first “multi-site” fMRI studies, collected data from 200–300 patients with schizophrenia and controls using the same fMRI protocol across multiple universities, motivated in part by inconsistencies found in smaller samples regarding frontal cortex function in fMRI studies of schizophrenia [9, 10]. Larger-scale neuroimaging studies are also motivated on occasion by a desire to expand the clinical picture of the sample, providing a larger variability in symptom profiles, for comparisons of clinical variation within a single-diagnosis sample [11] or for longitudinal prognosis predictions in the face of high individual variation [12].

A similar sample size of several hundred is often needed for the most basic analyses of genetic effects on imaging measures, such as testing the relationship between variation at a single genetic locus and the BOLD signal during working memory [13, 14], or to identify the combined effects of selected multiple genes on brain structural variation [15]. However, genetic effects are notoriously small and unreliable; to examine neuroimaging effects of the entire genome rather than targeted subsets of genes, data from tens of thousands of subjects are required [4]. These latter are truly large-scale studies.
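The dependence of sample size on effect size described in this section can be made concrete with a textbook power calculation. The sketch below is illustrative only: it assumes a simple two-group comparison, a two-sided test, and the normal approximation to the t-test. The function name `subjects_per_group` and the effect sizes shown are hypothetical choices, not values taken from the cited studies.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2,
    which slightly underestimates the exact t-test requirement for small n.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Illustrative standardized effect sizes only (assumed, not from the text).
for d in (0.8, 0.5, 0.2, 0.05):
    print(f"d = {d}: about {subjects_per_group(d)} subjects per group")
```

Under these assumptions, halving the effect size roughly quadruples the required sample, which is why very small genetic effects push studies toward thousands of subjects.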

Categories of large-scale imaging studies

There are some useful categories of design for imaging datasets of 100 or more subjects, distinguished by the level of control and planning that is used. The most controlled are the planned, coordinated and often multi-site imaging studies. Less controlled are the aggregated mega-analyses, in which existing, often legacy datasets with similar imaging techniques and sample populations are combined for analysis. Next are opportunistic studies, which are often seen at institutions or combinations of institutions that make their imaging data available for mining, without regard for similar sample populations or imaging protocols. Finally, meta-analyses, historically the most common method, neither control the collection of imaging data nor aggregate it in one place; they can be either post hoc or prospective. We consider each of these in turn.

Planned studies

Planned studies can be large scale while being collected at a single site, using a consistent protocol for both subject recruitment and data collection. Examples range from several hundred cases vs. controls [16, 17], to the Genetics of Brain Structure (GOBS) dataset encompassing more than 1,000 subjects from a multi-pedigree study of heart disease [18], or the Philadelphia Neurodevelopmental Cohort with 1,445 imaging datasets on a single scanner [19]. In the last 15 years, the ability and the will to collect structural, functional, diffusion and perfusion imaging data across multiple imaging centers have developed, with FBIRN collecting several hundred patients with schizophrenia and controls across eight centers in the United States [10, 11], the Multisite Clinical Imaging Consortium (MCIC) doing the same across four centers [20], the Alzheimer’s Disease Neuroimaging Initiative (ADNI) collecting 800 subjects longitudinally over 50 centers [21, 22], and the PREDICT-HD study collecting over 1,400 subjects longitudinally for ten years across 32 imaging centers, which is even more impressive given the rarity of Huntington’s Disease [23]. This phenomenon is by no means limited to the US, of course; the Thematically Organized Psychosis (TOP) study in Norway collected over 600 imaging datasets on patients with schizophrenia, bipolar disorder and controls [24], and the IMAGEN study is an international and longitudinal imaging and genetics study of mental health in adolescents, with several thousand subjects participating [25]. These are merely examples, not an exhaustive list.

There are pros and cons to these studies, of course. Notably, these studies are very expensive. In a multi-site study, the effort and expense are not simply the linear sum of running the same, smaller study at each site; the coordination, planning, and equilibration of methods and equipment across sites [26], the infrastructure for moving data to central locations for analysis, and the time involved in keeping everyone up to date on any changes in the protocols form a necessary and costly overhead for these tightly organized studies. However, there is as good a guarantee as one can get in the real world that these samples are comparable across sites. Sources of variance have been minimized as much as possible: the subjects are recruited using the same criteria, the scanners are calibrated to the same levels, the protocols are identical wherever possible, and the data are analyzed using the same quality assurance methods and software [27]. Just as clinical trials for FDA approval are controlled and prescribed, these kinds of studies are, effectively, FDA clinical trial methods translated as closely as possible to imaging studies, and the investment by the funding source is similarly demanding.

Aggregated mega-analyses

Aggregated mega-analyses are studies that combine existing datasets without prior coordination. They are commonly limited to a single imaging modality, for example T1-weighted structural images or resting state fMRI, without requiring that the imaging parameters be the same across datasets. They may be limited to a particular clinical population, or may require that the data include a particular set of clinical assessments, without requiring that all subjects have been recruited the same way or that the same diagnostic criteria are rigidly applied. The confusion between schizophrenia and schizoaffective diagnoses is a standard example: some investigators combine both diagnoses in their samples, while others keep them separate. In a planned multi-site study, that point would be standardized across subjects; in an aggregated mega-analysis, one is often stuck with the ambiguity.

A notable example of an aggregated dataset and mega-analysis is the 1,000 Functional Connectomes project [28], which collated 1,414 subjects’ resting state fMRI data across 35 imaging centers worldwide, without regard to imaging protocol. The only constraints they set were that subjects be healthy controls over 18 and under 60 years old. They identified an underlying, robust infrastructure of resting-state signals across the brain that has been the canonical result since their publication. They were able to set the foundation for the effects of age in that range, gender, and the similarity of results across different analysis methods for the resting state brain in healthy subjects.

The advantage of this approach is clearly the ability to collate large datasets fairly cheaply, from investigators who are willing to share. The Autism Brain Imaging Data Exchange (ABIDE) dataset of autism imaging [29], for example, includes resting state fMRI data from over 1,000 participants aggregated across 16 different sites, and since its release in 2012 it has resulted in six published papers on the aggregated set, with many more in preparation. In schizophrenia studies, Cota et al. (under review) [30, 31] have aggregated structural imaging data from over 1,800 subjects from eight legacy studies to evaluate gray matter loss. The recently released Consortium for Reliability and Reproducibility (CORR) dataset [32] collated structural and functional imaging data from over 1,600 subjects and made them available to the community. All of these are imaging and related data that have already been collected through other funding sources; the cost of the aggregation is mostly the personnel and time needed to send and receive the datasets, process and curate them, and carry out the analysis. While the curation process can be lengthy and frustrating, its cost pales in comparison to the original subject recruitment and scanning costs for large-scale studies.

The primary challenge for this method is the increased variability in the images, since the imaging protocols vary widely. As noted extensively by Glover et al. [26], changes in scanning parameters can create protocol-specific deformations in the image, as well as changes in the relative contrast between tissues, and thus affect estimates of any brain measure being used, whether functional or structural. The papers cited so far on aggregated mega-analyses deal with inter-site variation in a number of ways, often through modeling site as a covariate or factor in the statistical model. However, the loss of sensitivity through increased variability has to be weighed against the increase in generalizability. More subtle effects may be lost, but those that remain are more robust.
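Why modeling site as a factor matters can be shown with a small worked example. The sketch below is a minimal illustration on made-up numbers, not a method taken from the cited papers: it assumes a purely additive scanner offset and a single binary group variable. Subtracting each site's mean from both the measure and the group indicator gives the same group estimate as including site dummy variables in an ordinary least-squares model. The helper names `slope` and `center_within_site` are hypothetical.

```python
from collections import defaultdict


def slope(x, y):
    """Ordinary least-squares slope of y on x (with an intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def center_within_site(values, sites):
    """Subtract each site's mean, equivalent to fitting site as a factor."""
    totals = defaultdict(lambda: [0.0, 0])
    for v, s in zip(values, sites):
        totals[s][0] += v
        totals[s][1] += 1
    return [v - totals[s][0] / totals[s][1] for v, s in zip(values, sites)]


# Hypothetical toy data: (site, patient flag, brain measure).
# Site B's scanner reads 10 units higher and recruited mostly patients.
# The true patient effect is -2 at both sites.
rows = [
    ("A", 0, 100.0), ("A", 0, 100.0), ("A", 0, 100.0), ("A", 1, 98.0),
    ("B", 0, 110.0), ("B", 1, 108.0), ("B", 1, 108.0), ("B", 1, 108.0),
]
sites = [r[0] for r in rows]
patient = [float(r[1]) for r in rows]
measure = [r[2] for r in rows]

naive_effect = slope(patient, measure)
adjusted_effect = slope(center_within_site(patient, sites),
                        center_within_site(measure, sites))
print(f"ignoring site:  {naive_effect:+.2f}")
print(f"site as factor: {adjusted_effect:+.2f}")
```

In this toy case the unadjusted estimate even has the wrong sign, because site and diagnosis are confounded. Real inter-site differences are rarely a pure additive offset, so this adjustment should be read as a first-order correction only.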

The differences in sample characteristics are also a challenge; the sample sizes drop immediately as soon as more is required than the image and some basic demographic information. Studies were conducted with different clinical and cognitive assessments, which are generally not comparable. The advantage of power in the large sample sizes is then lost when more nuanced questions need to be asked about duration of illness, the role of cognitive deficits, or aspects of the subjects’ medical history, and the data simply aren’t there.

Opportunistic studies

Opportunistic studies refer to the growing practice of scanning centers creating institutional data repositories. The Mind Research Network in Albuquerque (NM, USA) has a policy that all scans performed on its scanner are part of its data repository for controlled sharing [33]. The four MRI scanners at the Donders Institute for Cognitive Neuroscience have provided structural imaging data for the Brain Imaging Genetics (BIG) study, from the pool of images from all college students being scanned for many other research projects [34]. The University of California, Irvine (UCI), and the University of Southern California (USC) have agreed to develop a repository of non-emergency MRI scans from both institutions [35]. The studies that come from these sorts of repositories are opportunistic in the sense that the subjects are whoever was scanned for other studies, and the imaging protocols are whatever was used for that study. In certain cases, such as the federated repositories from the Mind Research Network and the UCI/USC network, a standardized if minimal imaging protocol can be agreed on, so that all non-emergency subjects receive the same structural and diffusion tensor imaging or resting state functional imaging protocol.

The effort behind these institutional-level data sharing methods can be extensive. It requires high-level administrative involvement, support, and assurances to develop a system for managing all the imaging data collected at an institution, and it intrudes on the individual investigator’s methods by adding verbiage about data sharing to the protocol and consent forms and perhaps limiting the scanning protocols that can be used. However, the resulting repositories can be immense. The One Mind for Research project is leveraging these sorts of efforts, with the goal of collating datasets from several thousand traumatic brain injury (TBI) subjects from participating trauma centers and emergency room locations, as well as developing a registry over time of 25,000 patients seen for a suspected TBI and their computerized tomography (CT) scans [36].

While these approaches in many cases share the disadvantages of aggregated analyses (varied imaging protocols in some cases, incomplete clinical pictures in others), examples of the findings resulting from these efforts demonstrate their value. In a paper by Allen et al. [3], resting state fMRI datasets from over 600 healthy controls ranging in age from 12 to 71 years were pooled for an extensive and foundational study of the effect of age and gender on resting state networks and their interconnectedness. While this is not the size of the Functional Connectomes sample, it has the advantage of being collected using a standardized imaging protocol. In contrast, the aggregated BIG sample of 1,400 healthy controls is not standardized, but access to the original imaging data from a single institution allowed an in-depth analysis of the effects of the widely ranging imaging parameters on various gray matter measures [34].


Meta-analyses

Meta-analyses of published neuroimaging studies are important for developing consensus in the field. Standard meta-analyses combine results across smaller studies, identifying where the weight of the evidence falls in the case of conflicting results. These are post hoc meta-analyses, extracting published results and effect sizes from the literature; given the unknown number of unpublished analyses, these methods must account for the “file drawer” phenomenon by various means. A particularly fruitful approach was developed by Laird and colleagues [37, 38], which leveraged the standardized systems for reporting fMRI results as coordinates in a three-dimensional brain space. The ability to statistically combine these coordinate-based analyses across studies has since resulted in over 400 publications with meta-analyses in schizophrenia, anxiety disorder, executive function, and many more topics [39]. Meta-analytic techniques have been applied to structural and task-based functional imaging; the rising popularity of resting state fMRI with its numerous analytical techniques [2, 3, 29, 40, 41], particularly multivariate ones, poses a particular challenge for post hoc meta-analyses. Overall, however, the advantages of post hoc meta-analyses are well known, as are the disadvantages, so these are not reviewed here.

A different approach is a prospective meta-analysis (along the lines of [42]), in which the results are not chosen from the published literature. Rather, legacy datasets are analyzed individually using a standardized statistical model, and the individual results are then pooled as in a usual meta-analysis. The largest project of this sort in neuroimaging to date is the ENIGMA project, which successfully collated statistical results from planned, consistent analyses of 10,000 subjects from 17 studies worldwide to identify the genetic effect on hippocampal volume and overall brain size [4]. The ENIGMA technique asked researchers to segment their structural imaging data into various brain region volumes using a standardized protocol (in one of two well-known software systems, Freesurfer or FSL [43, 44]) and to perform a standardized quality assurance protocol to remove bad data. The project then leveraged the standardized outputs from those software systems to develop scripts or programs in R [45] that would run over an entire dataset with minimal input from the dataset owner. Other than imaging data quality, image processing steps and analysis, there was very little control. The subjects could be anybody (the studies included patients with schizophrenia, attention deficit disorder, depression, and autism, as well as “controls”), though they were required to be genetically Caucasian for analysis of genetic effects on the brain volumes. Notably, in this approach the data are not shared or centralized for analysis. The analysis techniques for each dataset are standardized, and the results from each dataset are what is shared. The meta-analysis is then performed on the effect sizes from each dataset.
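The final pooling step, in which only per-dataset effect sizes are combined, can be sketched with a generic inverse-variance (fixed-effect) meta-analysis. This is an illustration of the general technique under assumed inputs, not a reproduction of the ENIGMA scripts; the function name `pool_fixed_effect` and the numbers are hypothetical.

```python
from math import sqrt


def pool_fixed_effect(site_results):
    """Inverse-variance weighted (fixed-effect) pooling.

    site_results: list of (effect_estimate, standard_error), one per dataset.
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in site_results]
    total_weight = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, site_results)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical per-dataset regression results; only these summary numbers,
# never the subject-level images, would leave each site.
site_results = [(0.30, 0.10), (0.10, 0.20), (0.25, 0.10)]
pooled, pooled_se = pool_fixed_effect(site_results)
print(f"pooled effect = {pooled:.3f} (SE {pooled_se:.3f})")
```

Datasets with smaller standard errors, typically the larger ones, dominate the pooled estimate. A random-effects model would be the usual alternative when true effects are expected to differ between datasets.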

This “crowd sourcing” approach to imaging genetics has continued successfully [46]. The ENIGMA project now has a number of collaborative working groups varying in size, exploring these same issues in distinct neuropsychiatric disorders [47]. There are papers in development on prospective meta-analyses of structural brain measures in schizophrenia, attention deficit syndrome, major depressive disorder, and bipolar disorder, with the combined expertise of hundreds of professionals in these fields.

Like the other uncontrolled designs, the prospective meta-analysis approach can be hampered by the variability in the collected data. Currently there are no standard batteries of clinical, cognitive, and socioeconomic measures that are applied to all imaging studies of schizophrenia, for example; individual studies are designed to answer specific questions, and collect the relevant data for their hypotheses. One dataset may include an extensive cognitive battery, while another does not include even basic IQ measures. Another dataset may have an equally extensive cognitive battery, but not the same one, leading to its own issues in comparability of measures. Like the mega-analysis or opportunistic study designs, these meta-analyses can end up with a “lowest common denominator” approach, including only basic covariates such as age and gender, unable to count on basic information about the duration of illness or medications being comparable or even available across the datasets.
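The “lowest common denominator” outcome amounts to taking the intersection of the variables each dataset happens to have. The sketch below illustrates this with invented variable inventories; the dataset and variable names are hypothetical, and it assumes that identically named variables are actually comparable, which the paragraph above notes is often not the case.

```python
# Hypothetical variable inventories for three legacy datasets.
datasets = {
    "study_1": {"age", "gender", "diagnosis", "iq", "illness_duration"},
    "study_2": {"age", "gender", "diagnosis", "medication"},
    "study_3": {"age", "gender", "diagnosis", "iq", "working_memory_score"},
}

# Variables present in every dataset: usable as covariates in a pooled model.
shared = set.intersection(*datasets.values())

# Variables present in at least one dataset but not in all of them.
partial = set.union(*datasets.values()) - shared

print("available everywhere:", sorted(shared))
for name in sorted(partial):
    holders = sorted(k for k, v in datasets.items() if name in v)
    print(f"{name}: only in {holders}")
```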

The cost of a crowd-sourced approach, such as the ENIGMA model, is in many cases paid in unpaid labor. ENIGMA and its subprojects are not planned multi-site studies, with staff at every site funded to work on their part of the analyses. They are almost entirely a volunteer army of researchers who are willing to participate because it is a good experience, because it is a project that cannot be completed any other way, and because they believe in data sharing and aggregation; they are willing to leverage other funding sources to make it happen. While that may change in the future, the current model (as of summer 2014) relies largely on donated time and resources. That may not be an approach that supports growth in the long term, though the current level of energy for these projects from around the world is notable.

Some of the differences across study designs have been summarized in Table 1. These level descriptions are somewhat arbitrary; within any given category there will be some studies that are easier and some that are harder to perform, for example, based on the particular design and requirements.

Table 1 Comparison of study categories

The rise of large-scale studies leads to big data methods in neuroimaging

The goal of large-scale clinical neuroimaging is often the largest sample size available. Datasets from multiple research centers, multiple cities, and various countries are more likely to capture the range and variance of the clinical population than are smaller samples from a single center. Given that neuroimaging studies often pull from a limited sample of the population to begin with (subjects who are capable of undergoing neuroimaging), the more representative the sample can be, the better. All of these methods of large-scale data collection are geared toward this end, whether the goal is a genetically well-powered sample or simply capturing enough of the clinical variation. The studies presented as examples above have all been markedly successful in achieving these ends.

All the study designs reviewed here allow both replication and discovery. It is not only the planned studies that can test hypotheses; it is not only the less controlled categories of studies that support exploratory analyses. The ABIDE dataset, for example, while the result of aggregating legacy data, has been used to explore specific hypotheses regarding the relationship between functional connectivity of the posterior temporal sulcus and emotion recognition in autism [48]. The FBIRN III study protocol, in contrast, was designed primarily to examine the interaction between emotional distraction and working memory encoding in schizophrenia, with resting state data as an extra scan; however, the resting state data have already resulted in four papers published or under review, with more in preparation, exploring the relationships between various imaging features and disease state or clinical measure [40, 41, 49, 50]. The ADNI and COBRE multi-site datasets in Alzheimer’s Disease and schizophrenia, respectively, have both been used in “challenges” open to all comers who have data mining techniques to identify who has the disease and who doesn’t, in support of new diagnostic techniques [51, 52]. The original study designers and data collectors for any given project cannot have all possible analysis and statistical techniques at their fingertips; therefore, these data repositories are immensely valuable as ongoing resources for the research community.

While the idea of a large and representative dataset is appealing, a challenge with data collected over multiple imaging sites is the variability in the resulting images that is not due to subject differences, but simply due to the scanner and imaging parameters—i.e., increased noise that could swamp more subtle disease-specific effects. Planned studies with tightly controlled protocols minimize this variability, giving the best chance for identifying smaller individual differences [53]. A good example is the ADNI study previously mentioned, a large and carefully planned multi-site study of subjects with Alzheimer’s Disease (AD), subjects with Mild Cognitive Impairment, and healthy controls. Their methods have allowed them to identify clusters of pre-diagnosed subjects with different prognoses, some of whom are more likely to convert to full AD than are others [54].

Studies with less controlled designs must work with the data they can access, which means identifying only those variables whose effects are robust to the heterogeneity in imaging and clinical data collection. In combining common variables across legacy data, the more opportunistic studies often cannot benefit from the deep phenotyping that can make analyses like ADNI’s richer. However, planned studies often do not collect broadly useful measures either, as noted previously. They focus on the hypotheses they were funded to study, and often do not have additional information about the subjects that would make the data re-usable for another question; in contrast, institutional approaches can leverage that breadth. Through minimal standard imaging protocols and planned data sharing approaches, datasets with consistent imaging methods and a wide array of clinical measures can potentially be aggregated for data mining.

The rise of these large-scale studies, hand-in-hand with the recognized emphasis on sharing the resulting data, has also provided numerous data repositories and an increased awareness of the data’s value [55, 56]. MRI data repositories that are open to the research community are funded by the National Institutes of Health (NIH), individual institutions, or individual laboratories (for example [57–61]). However, the current efforts in data sharing are often hampered by the lack of standardization not only in what is collected, but also in how it is described. Data integration and mediation is an ongoing challenge that is a large part of the field of neuroinformatics (see e.g., [62–66]). The data are not necessarily compatible when combined across different sources, with many missing or questionable data points.

A primary challenge, besides the noisiness of the data collection methods and the ability to find datasets others have already collected, is the science of working with “big data”. What questions can be asked given the data that have already been collected and made available? Given one’s scientific question, could the hypothesis be tested in available data, rather than designing a new study from scratch? How does one handle the noise, uncertainty and missing data? This requires the next generation of neuropsychiatric researchers to understand that these big datasets exist, how to use the neuroinformatics tools and methods to find them, and what the best practices are for aggregating the data or performing meta-analyses while addressing the inescapable sources of variance.


Large-scale neuroimaging studies of varying designs have been increasingly applied to neuropsychiatric research. The studies vary from completely controlled data collection and analysis, to post hoc meta-analyses with no control over those experimental parameters. Each category of experimental design has its strengths and weaknesses in its ability to address sources of variation, and its ability to identify subtle effects of interest.

Successful data integration and mediation will make the re-use of these datasets more viable and valuable. An imaging dataset of 20 subjects can provide a few findings, but an underpowered study has an increased risk of inflating its estimates of effect size, leading to a lack of reproducibility [67]. In conjunction with 10 or 100 more studies of similar size and type, however, it can reliably help address questions of clinical importance about symptom variations, prognosis or genetic influences. There were 12,000 papers published in English in 2012 as found in PubMed using the query “((human brain mapping) OR (fMRI) AND (brain AND MRI)”. Even if only one-third of them represent unique imaging datasets, there is clearly a plethora of imaging datasets of the human brain in various states that could be shared, reused or aggregated for novel analyses.

Training in experimental psychology and cognitive neuroscience often focuses on the details of experimental design for de novo data collection and analysis. However, while good experimental design is key, de novo data collection need not be. Neuroimaging researchers need to take a page from the sciences of climatology and geology, from economists and others who cannot always manipulate the environment in a precisely controlled manner to test their models. We are now at a point in the neuroimaging domain where neuroimaging researchers should first ask whether their question can be refined or even answered in the agglomeration of data previous researchers have collected. An even stronger approach would be to consider, when collecting new data, not only how to use existing data to supplement the proposed data collection, but how the new data could be used by others in the future, and how best to design the experiments and resource allocation for the project to facilitate that re-use. This is, in effect, combining computational and semantic web methods with statistical methods, for a “big data” approach to available neuroimaging data.

Author information

Dr. Turner has been working with MRI studies since 1998, and with multi-site imaging of schizophrenia since joining the FBIRN study in 2003 as the project manager, as well as participating in the MCIC and COBRE studies, the first phase of ADNI, and other multi-site clinical imaging studies. Her research encompasses brain correlates of different psychological states, and the genetic influences underlying schizophrenia in particular. She is committed to neuroimaging data sharing, developing the Cognitive Paradigm Ontology, chairing the ENIGMA Schizophrenia Working Group, and participating in the International Neuroinformatics Coordinating Facility’s Neuroimaging Data Sharing Task Force. She is currently an Associate Professor in the Department of Psychology and Neuroscience Institute at Georgia State University, Atlanta.



Abbreviations

ABIDE: Autism brain imaging data exchange
AD: Alzheimer’s disease
ADNI: Alzheimer’s disease neuroimaging initiative
BIG: Brain imaging genetics project
COBRE: Center of Biomedical Research Excellence
CORR: Consortium for Reliability and Reproducibility
CT: Computed tomography
ENIGMA: Enhancing Neuro Imaging Genetics through Meta Analysis
FBIRN: Functional Biomedical Informatics Research Network
fMRI: Functional magnetic resonance imaging
GOBS: Genetics of brain structure
MCIC: Multi-site Clinical Imaging Consortium
NIH: National Institutes of Health
TBI: Traumatic brain injury
TOP: Thematically Organized Psychosis.


  1. Desmond JE, Glover GH: Estimating sample size in functional MRI (fMRI) neuroimaging studies: statistical power analyses. J Neurosci Methods. 2002, 118: 115-128. 10.1016/S0165-0270(02)00121-8.

  2. Friston KJ, Kahan J, Biswal B, Razi A: A DCM for resting state fMRI. Neuroimage. 2014, 94: 396-407.

  3. Allen EA, Erhardt EB, Damaraju E, Gruner W, Segall JM, Silva RF, Havlicek M, Rachakonda S, Fries J, Kalyanam R, Michael AM, Caprihan A, Turner JA, Eichele T, Adelsheim S, Bryan AD, Bustillo J, Clark VP, Feldstein Ewing SW, Filbey F, Ford CC, Hutchison K, Jung RE, Kiehl KA, Kodituwakku P, Komesu YM, Mayer AR, Pearlson GD, Phillips JP, Sadek JR: A baseline for the multivariate comparison of resting-state networks. Front Syst Neurosci. 2011, 5: 2.

  4. Stein JL, Medland SE, Vasquez AA, Hibar DP, Senstad RE, Winkler AM, Toro R, Appel K, Bartecek R, Bergmann O, Bernard M, Brown AA, Cannon DM, Chakravarty MM, Christoforou A, Domin M, Grimm O, Hollinshead M, Holmes AJ, Homuth G, Hottenga JJ, Langan C, Lopez LM, Hansell NK, Hwang KS, Kim S, Laje G, Lee PH, Liu X, Loth E: Identification of common variants associated with human hippocampal and intracranial volumes. Nat Genet. 2012, 44: 552-561. 10.1038/ng.2250.

  5. Choudhury S, Fishman JR, McGowan ML, Juengst ET: Big data, open science and the brain: lessons learned from genomics. Front Hum Neurosci. 2014, 8: 239.

  6. Jardri R, Pouchet A, Pins D, Thomas P: Cortical activations during auditory verbal hallucinations in schizophrenia: a coordinate-based meta-analysis. Am J Psychiatry. 2011, 168: 73-81. 10.1176/appi.ajp.2010.09101522.

  7. Sommer IE, Diederen KM, Blom JD, Willems A, Kushan L, Slotema K, Boks MP, Daalman K, Hoek HW, Neggers SF, Kahn RS: Auditory verbal hallucinations predominantly activate the right inferior frontal area. Brain. 2008, 131: 3169-3177. 10.1093/brain/awn251.

  8. Jardri R: Functional MRI to define rTMS targets in the case of complex multisensory hallucinations. Eur Arch Psychiatry Clin Neurosci. 2009, 259 (Suppl 1): S3-S105.

  9. Potkin SG, Turner JA, Brown GG, McCarthy G, Greve DN, Glover GH, Manoach DS, Belger A, Diaz M, Wible CG, Ford JM, Mathalon DH, Gollub R, Lauriello J, O’Leary D, van Erp TG, Toga AW, Preda A, Lim KO, FBIRN: Working memory and DLPFC inefficiency in schizophrenia: the FBIRN study. Schizophr Bull. 2009, 35: 19-31. 10.1093/schbul/sbn162.

  10. Potkin SG, Ford JM: Widespread cortical dysfunction in schizophrenia: the FBIRN imaging consortium. Schizophr Bull. 2009, 35: 15-18. 10.1093/schbul/sbn159.

  11. Ford JM, Roach BJ, Jorgensen KW, Turner JA, Brown GG, Notestine R, Bischoff-Grethe A, Greve D, Wible C, Lauriello J, Belger A, Mueller BA, Calhoun V, Preda A, Keator D, O’Leary DS, Lim KO, Glover G, Potkin SG, Mathalon DH, FBIRN: Tuning in to the voices: a multisite fMRI study of auditory hallucinations. Schizophr Bull. 2009, 35: 58-66. 10.1093/schbul/sbn140.

  12. Misra C, Fan Y, Davatzikos C: Baseline and longitudinal patterns of brain atrophy in MCI patients, and their use in prediction of short-term conversion to AD: results from ADNI. Neuroimage. 2009, 44: 1415-1422. 10.1016/j.neuroimage.2008.10.031.

  13. Potkin SG, Turner JA, Guffanti G, Lakatos A, Fallon JH, Nguyen DD, Mathalon D, Ford J, Lauriello J, Macciardi F, FBIRN: A genome-wide association study of schizophrenia using brain activation as a quantitative phenotype. Schizophr Bull. 2009, 35: 96-108. 10.1093/schbul/sbn155.

  14. Potkin SG, Turner JA, Fallon JA, Lakatos A, Keator DB, Guffanti G, Macciardi F: Gene discovery through imaging genetics: identification of two novel genes associated with schizophrenia. Mol Psychiatry. 2009, 14: 416-428. 10.1038/mp.2008.127.

  15. Chen J, Calhoun VD, Pearlson GD, Ehrlich S, Turner JA, Ho BC, Wassink TH, Michael AM, Liu J: Multifaceted genomic risk for brain function in schizophrenia. Neuroimage. 2012, 61: 866-875. 10.1016/j.neuroimage.2012.03.022.

  16. Egan MF, Goldberg TE, Kolachana BS, Callicott JH, Mazzanti CM, Straub RE, Goldman D, Weinberger DR: Effect of COMT Val108/158 Met genotype on frontal lobe function and risk for schizophrenia. Proc Natl Acad Sci U S A. 2001, 98: 6917-6922. 10.1073/pnas.111134598.

  17. Sui J, He H, Yu Q, Chen J, Rogers J, Pearlson GD, Mayer A, Bustillo J, Canive J, Calhoun VD: Combination of Resting State fMRI, DTI, and sMRI Data to Discriminate Schizophrenia by N-way MCCA + jICA. Front Hum Neurosci. 2013, 7: 235.

  18. Chouinard-Decorte F, McKay DR, Reid A, Khundrakpam B, Zhao L, Karama S, Rioux P, Sprooten E, Knowles E, Kent JW, Curran JE, Goring HH, Dyer TD, Olvera RL, Kochunov P, Duggirala R, Fox PT, Almasy L, Blangero J, Bellec P, Evans AC, Glahn DC: Heritable changes in regional cortical thickness with age. Brain Imag Behav. 2014, 8: 208-216. 10.1007/s11682-014-9296-x.

  19. Satterthwaite TD, Elliott MA, Ruparel K, Loughead J, Prabhakaran K, Calkins ME, Hopson R, Jackson C, Keefe J, Riley M, Mentch FD, Sleiman P, Verma R, Davatzikos C, Hakonarson H, Gur RC, Gur RE: Neuroimaging of the Philadelphia neurodevelopmental cohort. Neuroimage. 2014, 86: 544-553.

  20. Gollub RL, Shoemaker JM, King MD, White T, Ehrlich S, Sponheim SR, Clark VP, Turner JA, Mueller BA, Magnotta V, O’Leary D, Ho BC, Brauns S, Manoach DS, Seidman L, Bustillo JR, Lauriello J, Bockholt J, Lim KO, Rosen BR, Schulz SC, Calhoun VD, Andreasen NC: The MCIC collection: a shared repository of multi-modal, multi-site brain image data from a clinical investigation of schizophrenia. Neuroinformatics. 2013, 11: 367-388. 10.1007/s12021-013-9184-3.

  21. Mueller SG, Weiner MW, Thal LJ, Petersen RC, Jack CR, Jagust W, Trojanowski JQ, Toga AW, Beckett L: Ways toward an early diagnosis in Alzheimer’s disease: the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Alzheim Dement. 2005, 1: 55-66. 10.1016/j.jalz.2005.06.003.

  22. Hua X, Hibar DP, Lee S, Toga AW, Jack CR, Weiner MW, Thompson PM, Alzheimer’s Disease Neuroimaging Initiative: Sex and age differences in atrophic rates: an ADNI study with n = 1368 MRI scans. Neurobiol Aging. 2010, 31: 1463-1480. 10.1016/j.neurobiolaging.2010.04.033.

  23. Ross CA, Aylward EH, Wild EJ, Langbehn DR, Long JD, Warner JH, Scahill RI, Leavitt BR, Stout JC, Paulsen JS, Reilmann R, Unschuld PG, Wexler A, Margolis RL, Tabrizi SJ: Huntington disease: natural history, biomarkers and prospects for therapeutics. Nat Rev Neurol. 2014, 10: 204-216. 10.1038/nrneurol.2014.24.

  24. Joyner AH, Roddey JC, Bloss CS, Bakken TE, Rimol LM, Melle I, Agartz I, Djurovic S, Topol EJ, Schork NJ, Andreassen OA, Dale AM: A common MECP2 haplotype associates with reduced cortical surface area in humans in two independent populations. Proc Natl Acad Sci U S A. 2009, 106: 15483-15488. 10.1073/pnas.0901866106.

  25. Schumann G, Loth E, Banaschewski T, Barbot A, Barker G, Buchel C, Conrod PJ, Dalley JW, Flor H, Gallinat J, Garavan H, Heinz A, Itterman B, Lathrop M, Mallik C, Mann K, Martinot JL, Paus T, Poline JB, Robbins TW, Rietschel M, Reed L, Smolka M, Spanagel R, Speiser C, Stephens DN, Strohle A, Struve M, IMAGEN consortium: The IMAGEN study: reinforcement-related behaviour in normal brain function and psychopathology. Mol Psychiatry. 2010, 15: 1128-1139. 10.1038/mp.2010.4.

  26. Glover GH, Mueller BA, Turner JA, van Erp TG, Liu TT, Greve DN, Voyvodic JT, Rasmussen J, Brown GG, Keator DB, Calhoun VD, Lee HJ, Ford JM, Mathalon DH, Diaz M, O’Leary DS, Gadde S, Preda A, Lim KO, Wible CG, Stern HS, Belger A, McCarthy G, Ozyurt B, Potkin SG: Function biomedical informatics research network recommendations for prospective multicenter functional MRI studies. J Magn Reson Imag. 2012, 36: 39-54. 10.1002/jmri.23572.

  27. Stocker T, Schneider F, Klein M, Habel U, Kellermann T, Zilles K, Shah NJ: Automated quality assurance routines for fMRI data applied to a multicenter study. Hum Brain Mapp. 2005, 25: 237-246. 10.1002/hbm.20096.

  28. Biswal BB, Mennes M, Zuo XN, Gohel S, Kelly C, Smith SM, Beckmann CF, Adelstein JS, Buckner RL, Colcombe S, Dogonowski AM, Ernst M, Fair D, Hampson M, Hoptman MJ, Hyde JS, Kiviniemi VJ, Kotter R, Li SJ, Lin CP, Lowe MJ, Mackay C, Madden DJ, Madsen KH, Margulies DS, Mayberg HS, McMahon K, Monk CS, Mostofsky SH, Nagel BJ: Toward discovery science of human brain function. Proc Natl Acad Sci U S A. 2010, 107: 4734-4739. 10.1073/pnas.0911855107.

  29. Di Martino A, Yan CG, Li Q, Denio E, Castellanos FX, Alaerts K, Anderson JS, Assaf M, Bookheimer SY, Dapretto M, Deen B, Delmonte S, Dinstein I, Ertl-Wagner B, Fair DA, Gallagher L, Kennedy DP, Keown CL, Keysers C, Lainhart JE, Lord C, Luna B, Menon V, Minshew NJ, Monk CS, Mueller S, Muller RA, Nebel MB, Nigg JT, O’Hearn K: The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Mol Psychiatry. 2014, 19: 659-667. 10.1038/mp.2013.78.

  30. Cota N, Rachakonda S, Calhoun VD, Turner JA: Application of Source Based Morphometry for an Aggregated/Multisite Gray matter dataset. International Congress on Schizophrenia Research (ICOSR). 2013, Grande Lakes, Florida: Oxford University Press, S179.

  31. Cota NG, Calhoun VD, Rachakonda S, Chen J, Liu J, Segall J, Franke B, Zwiers M, Arias-Vasquez A, Buitelaar J, Fisher SE, Fernandez G, van Erp TG, Potkin SG, Ford JM, Mathalon DH, McEwen S, Lee HJ, Mueller BA, Greve DN, Andreassen OA, Agartz I, Gollub RL, Sponheim SR, Ehrlich S, Wang L, Pearlson GD, Glahn DC, Sprooten D, Mayer AR: Patterns of gray matter abnormalities in schizophrenia based on an international mega-analysis. Schizophr Res. In press

  32. Consortium for Reliability and Reproducibility (CoRR).

  33. King MD, Wood D, Miller B, Kelly R, Landis D, Courtney W, Wang R, Turner JA, Calhoun VD: Automated collection of imaging and phenotypic data to centralized and distributed data repositories. Front Neuroinformatics. 2014, 8: 60.

  34. Chen J, Liu J, Calhoun VD, Arias-Vasquez A, Zwiers MP, Gupta CN, Franke B, Turner JA: Exploration of scanning effects in multi-site structural MRI studies. J Neurosci Methods. 2014, 230: 37-50.

  35. Chervenak AL, Van Erp TG, Kesselman C, D’Arcy M, Sobell J, Keator D, Dahm L, Murry J, Law M, Hasso A: A System Architecture for Sharing De-Identified, Research-Ready Brain Scans and Health Information Across Clinical Imaging Centers. Healthgrid Applications and Technologies Meet Science Gateways for Life Sciences (Proceedings of the HealthGrid 2012 Conference). Edited by: Gesing S. 2012, Amsterdam, Netherlands: IOS Press, 19-28.

  36. One Mind for Research.

  37. Laird AR, Eickhoff SB, Kurth F, Fox PM, Uecker AM, Turner JA, Robinson JL, Lancaster JL, Fox PT: ALE meta-analysis workflows via the brainmap database: progress towards a probabilistic functional brain Atlas. Front Neuroinformatics. 2009, 3: 23.

  38. Eickhoff SB, Laird AR, Grefkes C, Wang LE, Zilles K, Fox PT: Coordinate-based activation likelihood estimation meta-analysis of neuroimaging data: a random-effects approach based on empirical estimates of spatial uncertainty. Hum Brain Mapp. 2009, 30: 2907-2926. 10.1002/hbm.20718.


  40. Ford JM, Palzes VA, Roach BJ, Potkin SG, van Erp TG, Turner JA, Mueller BA, Calhoun VD, Voyvodic J, Belger A, Bustillo J, Vaidya JG, Preda A, McEwen SC, Functional Imaging Biomedical Informatics Research Network, Mathalon DH: Visual Hallucinations Are Associated With Hyperconnectivity Between the Amygdala and Visual Cortex in People With a Diagnosis of Schizophrenia. Schizophr Bull. 2014.

  41. Turner JA, Damaraju E, van Erp TGM, Mathalon DH, Ford JM, Voyvodic J, Mueller BA, Belger A, Bustillo J, McEwen SC, Potkin SG, Calhoun VD, FBIRN: A multi-site resting state fMRI study on the amplitude of low frequency fluctuations in schizophrenia. Front Neurosci. 2013, 7: 137.

  42. Ghersi D, Berlin J, Askie L: Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (updated March 2011). Edited by: Higgins JPT, Green S. 2011, Chapter 19: Prospective meta-analysis, The Cochrane Collaboration.

  43. Jenkinson M, Beckmann CF, Behrens TE, Woolrich MW, Smith SM: FSL. Neuroimage. 2012, 62: 782-790. 10.1016/j.neuroimage.2011.09.015.

  44. Fischl B, Salat DH, Busa E, Albert M, Dieterich M, Haselgrove C, van der Kouwe A, Killiany R, Kennedy D, Klaveness S, Montillo A, Makris N, Rosen B, Dale AM: Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron. 2002, 33: 341-355. 10.1016/S0896-6273(02)00569-X.

  45. R software.

  46. Thompson PM, Stein JL, Medland SE, Hibar DP, Vasquez AA, Renteria ME, Toro R, Jahanshad N, Schumann G, Franke B, Wright MJ, Martin NG, Agartz I, Alda M, Alhusaini S, Almasy L, Almeida J, Alpert K, Andreasen NC, Andreassen OA, Apostolova LG, Appel K, Armstrong NJ, Aribisala B, Bastin ME, Bauer M, Bearden CE, Bergmann O, Binder EB, Blangero J: The ENIGMA Consortium: large-scale collaborative analyses of neuroimaging and genetic data. Brain Imag Behav. 2014, 8: 153-182.

  47. Enhancing Neuro Imaging Genetics through Meta-Analysis.

  48. Alaerts K, Woolley DG, Steyaert J, Di Martino A, Swinnen SP, Wenderoth N: Underconnectivity of the superior temporal sulcus predicts emotion recognition deficits in autism. Soc Cognit Affect Neurosci. 2013

  49. Damaraju E, Allen EA, Belger A, Ford JM, Mathalon DH, McEwen S, Mueller BA, Pearlson GD, Potkin SG, Preda A, Turner JA, Vaidya JG, van Erp TG, Calhoun VD: Dynamic functional connectivity analysis reveals transient states of dysconnectivity in schizophrenia. Neuroimage Clin. 2014, 5: 298-308.

  50. Arbabshirani MR, Damaraju E, Phlypo R, Plis S, Allen E, Ma S, Mathalon D, Preda A, Vaidya JG, Adali T, Calhoun VD: Impact of autocorrelation on functional connectivity. Neuroimage. 2014, 102 (Pt 2): 294-308.

  51. ADHD-200 Consortium: The ADHD-200 Consortium: a model to advance the translational potential of neuroimaging in clinical neuroscience. Front Syst Neurosci. 2012, 6: 62.

  52. The COBRE Challenge.

  53. Castellanos FX, Di Martino A, Craddock RC, Mehta AD, Milham MP: Clinical applications of the functional connectome. Neuroimage. 2013, 80: 527-540.

  54. Nettiksimmons J, Decarli C, Landau S, Beckett L, Alzheimer’s Disease Neuroimaging Initiative: Biological heterogeneity in ADNI amnestic mild cognitive impairment. Alzheimer’s Dement. 2014.

  55. Turner JA, Van Horn JD: Electronic data capture, representation, and applications for neuroimaging. Front Neuroinformatics. 2012, 6: 16.

  56. Poldrack RA: The future of fMRI in cognitive neuroscience. Neuroimage. 2012, 62: 1216-1220. 10.1016/j.neuroimage.2011.08.007.

  57. National Database for Autism Research.

  58. COINS Data Exchange.

  59. Pediatric Imaging, Neurocognition, and Genetics (PING).

  60. Open fMRI.

  61. XNAT Central.

  62. Akil H, Martone ME, Van Essen DC: Challenges and opportunities in mining neuroscience data. Science. 2011, 331: 708-712. 10.1126/science.1199305.

  63. Gupta A, Bug W, Marenco L, Qian X, Condit C, Rangarajan A, Muller HM, Miller PL, Sanders B, Grethe JS, Astakhov V, Shepherd G, Sternberg PW, Martone ME: Federated access to heterogeneous information resources in the Neuroscience Information Framework (NIF). Neuroinformatics. 2008, 6: 205-217. 10.1007/s12021-008-9033-y.

  64. Herrick R, McKay M, Olsen T, Horton W, Florida M, Moore CJ, Marcus DS: Data dictionary services in XNAT and the Human Connectome Project. Front Neuroinformatics. 2014, 8: 65.

  65. Ashish N, Ambite JL, Muslea M, Turner JA: Neuroscience data integration with mediation: an (F)BIRN application and case study. Front Neuroinformatics. 2010, 4: 118.

  66. Keator DB, Helmer K, Steffener J, Turner JA, Van Erp TG, Gadde S, Ashish N, Burns GA, Nichols BN: Towards structured sharing of raw and derived neuroimaging data across existing resources. Neuroimage. 2013, 82: 647-661.

  67. Button KS, Ioannidis JP, Mokrysz C, Nosek BA, Flint J, Robinson ES, Munafo MR: Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. 2013, 14: 365-376. 10.1038/nrn3475.


The author acknowledges Dr. David Glahn for his role in spurring the development of this review, and Dr. Steven Potkin for pushing the idea of multi-site fMRI studies at the turn of the century. JT was supported by NIH (NIMH) 5R01MH094524.

Author information

Corresponding author

Correspondence to Jessica A Turner.

Additional information

Competing interests

The author declares that she has no competing interests.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Turner, J.A. The rise of large-scale imaging studies in psychiatry. GigaSci 3, 29 (2014).
