Skull-stripping is the procedure of removing non-brain tissue from anatomical MRI data. This procedure can be useful for calculating brain volume and for improving the quality of other image processing steps. Developing new skull-stripping algorithms and evaluating their performance requires gold standard data from a variety of different scanners and acquisition methods. We complement existing repositories with manually corrected brain masks for 125 T1-weighted anatomical scans from the Nathan Kline Institute Enhanced Rockland Sample Neurofeedback Study.
Skull-stripped images were obtained using a semi-automated procedure that involved skull-stripping the data using the brain extraction based on nonlocal segmentation technique (BEaST) software, and manually correcting the worst results. Corrected brain masks were added into the BEaST library and the procedure was repeated until acceptable brain masks were available for all images. In total, 85 of the skull-stripped images were hand-edited and 40 were deemed to not need editing. The results are brain masks for the 125 images along with a BEaST library for automatically skull-stripping other data.
Skull-stripped anatomical images from the Neurofeedback sample are available for download from the Preprocessed Connectomes Project. The resulting brain masks can be used by researchers to improve preprocessing of the Neurofeedback data, as training and testing data for developing new skull-stripping algorithms, and for evaluating the impact on other aspects of MRI preprocessing. We have illustrated the utility of these data as a reference for comparing various automatic methods and evaluated the performance of the newly created library on independent data.
One of the many challenges facing the analysis of magnetic resonance imaging (MRI) data is achieving accurate brain extraction from the data. Brain extraction, also known as skull-stripping, aims to remove all non-brain tissue from an image. This is commonly a preliminary step in preprocessing and the quality of its result affects the subsequent steps, such as image registration and brain matter segmentation. Many challenges surround the process of brain extraction. The manual creation and correction of brain masks is tedious, time-consuming, and susceptible to experimenter bias. On the other hand, fully automated brain extraction is not a simple image segmentation problem. Brains differ in orientation and morphology, especially pediatric, geriatric, and pathological brains. In addition, non-brain tissue may resemble brain in terms of voxel intensity. Differences in MRI scanner, acquisition sequence, and scan parameters can also have an effect on automated algorithms due to differences in image contrast, quality, and orientation. Image segmentation techniques with low computational time, high accuracy, and high flexibility are extremely desirable.
Developing new automated skull-stripping methods, and comparing these with existing methods, requires large quantities of gold standard skull-stripped data acquired from a variety of scanners using a variety of sequences and parameters, because the performance of an algorithm varies with the MRI data it is applied to. Repositories containing gold standard skull-stripped data already exist: the Alzheimer’s Disease Neuroimaging Initiative (ADNI); BrainWeb: Simulated Brain Database (SBD); the Internet Brain Segmentation Repository (IBSR) at the Center for Morphometric Analysis; the LONI Probabilistic Brain Atlas (LPBA40) at the UCLA Laboratory of Neuro Imaging; and the Open Access Series of Imaging Studies (OASIS), the last of which is not manually delineated but has been used as gold standard data [6, 7]. We extend and complement these existing repositories by releasing manually corrected skull strips for 125 individuals from the Nathan Kline Institute (NKI) Enhanced Rockland Sample Neurofeedback Study (NFB). These are the first 125 participants who finished the entire 3-day protocol, consented to have their data shared, and were not excluded from data sharing for having an incidental finding during neuroradiological review.
The repository was constructed from defaced and anonymized anatomical data downloaded from the NFB. The NFB is a 3-visit study that involves a deep phenotypic assessment on the first and second visits, a 1-h connectomic MRI scan on the second visit, and a 1-h neurofeedback scan on the last visit. Up to 3 months may have passed between the first and last visits. The 125 participants included 77 females and 48 males in the 21–45 age range (average: 31, standard deviation: 6.6).
Consistent with the Research Domain Criteria (RDoC), the goal of the NFB study is to examine default network regulation across a range of clinical and subclinical psychiatric symptoms. To preserve this variance, while being representative of the general population, a community-ascertained sample was recruited with minimally restrictive psychiatric exclusion criteria. Only the most severe illnesses were screened out, excluding those who were unable to comply with instructions, tolerate the MRI, and participate in the extensive phenotyping protocol. As a result, 66 of the participants had one or more current or past psychiatric diagnoses as determined by the structured clinical interview for the DSM-IV (SCID) (see Table 1). No brain abnormalities or incidental findings were present in the images, as determined by a board-certified neuroradiologist. None of the participants had any other major medical condition such as cancer or AIDS.
Anatomical MRI data from the third visit of the NFB protocol were used to build the Neurofeedback Skull-stripped (NFBS) repository. MRI data were collected on a 3 T Siemens Magnetom TIM Trio scanner (Siemens Medical Solutions USA: Malvern PA, USA) using a 12-channel head coil. Anatomical images were acquired at 1×1×1 mm³ resolution with a 3D T1-weighted magnetization-prepared rapid acquisition gradient-echo (MPRAGE) sequence in 192 sagittal partitions each with a 256×256 mm² field of view (FOV), 2600 ms repetition time (TR), 3.02 ms echo time (TE), 900 ms inversion time (TI), 8° flip angle (FA), and generalized auto-calibrating partially parallel acquisition (GRAPPA) acceleration factor of 2 with 32 reference lines. Anatomical data were acquired immediately after a fast localizer scan and preceded the collection of a variety of other scans, whose description is beyond the scope of this report.
Brain mask definition
Many researchers differ on the standard for what to include and exclude from the brain. Some brain extraction methods, such as brainwash, include the dura mater in the brain mask to use as a reference for measurements. The standard we used was adapted from Eskildsen et al. (2012). Non-brain tissue is defined as skin, skull, eyes, dura mater, external blood vessels and nerves (e.g., optic chiasm, superior sagittal sinus, and transverse sinus). Cerebrum, cerebellum, brainstem, and internal vessels and arteries are included in the brain, along with cerebrospinal fluid (CSF) in ventricles, internal cisterns, and deep sulci.
NFBS repository construction
The BEaST method (brain extraction based on nonlocal segmentation technique) was used to initially skull-strip the 125 anatomical T1-weighted images. This software uses a patch-based label fusion method that labels each voxel in the brain boundary volume by comparing it to similar locations in a library of segmented priors. The segmentation technique also incorporates a multi-resolution framework in order to reduce computational time. The version of BEaST used was 1.15.00 and our implementation was based on a shell script written by Qingyang Li. The standard parameters were used in the configuration files and beast-library-1.1 (which contains data from 10 young individuals) was used for the initial skull-strip of the data. Before running mincbeast, the main segmentation script of BEaST, the anatomical images were normalized using the beast_normalize script. mincbeast was run using the probability filter setting, which smoothed the manual edits, and the fill setting, which filled any holes in the masks. The failure rate for masks using BEaST was similar to that of the published rate of approximately 29%. Visual inspection of these initial skull-stripped images indicated whether additional edits were necessary.
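The idea of patch-based label fusion can be illustrated with a toy example. The sketch below is not the BEaST implementation: it assumes a Gaussian weighting of squared patch differences, a hypothetical smoothing parameter `h`, and a 0.5 decision threshold, and it labels a single voxel from a handful of made-up library patches.

```python
import numpy as np


def fuse_label(target_patch, library_patches, library_labels, h=1.0):
    """Toy nonlocal label fusion for one voxel.

    Each library patch votes with its label (1 = brain, 0 = non-brain),
    weighted by how similar it is to the target patch. `h` is a
    hypothetical smoothing parameter chosen for illustration only.
    """
    target = np.asarray(target_patch, dtype=float).ravel()
    patches = np.asarray(library_patches, dtype=float)
    patches = patches.reshape(len(patches), -1)
    labels = np.asarray(library_labels, dtype=float)

    # Squared intensity distance between the target and each library patch.
    dist2 = ((patches - target) ** 2).sum(axis=1)
    # Subtracting the minimum keeps the weights from underflowing to zero
    # without changing the weighted average.
    weights = np.exp(-(dist2 - dist2.min()) / h ** 2)

    estimate = float((weights * labels).sum() / weights.sum())
    return estimate, int(estimate >= 0.5)


# Made-up 3-voxel patches: two brain-like priors and one background prior.
library_patches = [[1.0, 1.0, 1.0], [0.9, 1.0, 0.9], [0.1, 0.0, 0.2]]
library_labels = [1, 1, 0]
estimate, label = fuse_label([1.0, 1.0, 0.9], library_patches, library_labels)
print(label)  # 1
```

Because similar patches dominate the vote, a library whose priors resemble the target data should, in principle, give more reliable labels, which is the motivation for building a study-specific library.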
Manual edits were performed using the Freeview visualization tool from the FreeSurfer software package. The anatomical image was loaded as a track volume and the brain mask was loaded as a volume. The voxel edit mode was then used to include or exclude voxels in the mask. As previously mentioned, all exterior non-brain tissue was removed from the head image, specifically the skull, scalp, fat, muscle, dura mater, and external blood vessels and nerves (see Fig. 1). Time spent editing each mask ranged from 1–8 h, depending on the quality of the anatomical image and the BEaST mask. Afterwards, manually edited masks were used to create an NFB-specific prior library for BEaST. This iterative bootstrapping technique was repeated until approximately 85 of the datasets were manually edited and all skull-strips were considered acceptable.
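The control flow of this iterative bootstrapping can be sketched as a loop. Everything in the sketch is a hypothetical stand-in rather than the scripts used for NFBS: `segment` represents an automated BEaST run, `score` represents visual inspection, `manually_correct` represents hand editing, and `batch_size` (how many of the worst masks are edited per round) is an assumption.

```python
def bootstrap_prior_library(images, initial_library, segment, score,
                            manually_correct, threshold, batch_size=1):
    """Grow a prior library by repeatedly correcting the worst masks.

    All callables are hypothetical stand-ins: `segment` for an automated
    run, `score` for quality inspection, `manually_correct` for hand
    editing. `batch_size` must be at least 1.
    """
    library = list(initial_library)
    masks, edited = {}, set()
    while True:
        # Re-segment every image that has not been hand-edited yet.
        for name in images:
            if name not in edited:
                masks[name] = segment(name, library)
        failures = [n for n in images
                    if n not in edited and score(masks[n]) < threshold]
        if not failures:
            return masks, library, edited
        # Hand-correct the worst results and add them to the library.
        failures.sort(key=lambda n: score(masks[n]))
        for name in failures[:batch_size]:
            masks[name] = manually_correct(masks[name])
            library.append(masks[name])
            edited.add(name)


# Toy stand-ins: a "mask" is just a quality number that improves as the
# library grows and drops with image difficulty.
difficulty = {"a": 0, "b": 1, "c": 2, "d": 5}
masks, library, edited = bootstrap_prior_library(
    images=list(difficulty),
    initial_library=[10],
    segment=lambda name, lib: len(lib) - difficulty[name],
    score=lambda mask: mask,
    manually_correct=lambda mask: 10,
    threshold=0,
)
print(sorted(edited), len(library))  # ['d'] 2
```

In the toy run, correcting the single worst mask enlarges the library enough for the remaining images to pass on the next round, which mirrors why only a subset of the masks needed hand editing.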
For each of the 125 subjects, the repository contains the de-faced and anonymized anatomical T1-weighted image, skull-stripped brain image, and brain mask. Each of these are in compressed NIfTI file format (.nii.gz). The size of the entire data set is around 1.9 GB. The BEaST library created using these images is also available.
The semi-automated skull-stripping procedure was repeated until all brain masks were determined to be acceptable by two raters (BP and ET). Once this was completed, the brain masks were used as gold standard data for comparing different automated skull-stripping algorithms. Additionally, we evaluated the performance of the newly created BEaST library by comparing it to other skull-stripping methods on data from the IBSR and the LPBA40.
Many skull-stripping algorithms have been developed [6, 7, 14, 18–22], but we focused on FSL’s Brain Extraction Tool (BET), AFNI’s 3dSkullStrip, and FreeSurfer’s Hybrid Watershed Algorithm (HWA) based on their popularity.
BET is an algorithm incorporated in the FSL software that is based on a deformable model of the surface of the brain. First, an intensity histogram is used to find the center of gravity of the head. Then a tessellated sphere is initialized around the center of gravity and expanded by locally adaptive forces. The method can also incorporate T2-weighted images to isolate the inner and outer skull and scalp. The bias field and neck setting (bet -B) was used since the anatomical images contained the subjects’ necks. The version of FSL used was 5.0.7.
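The first step, finding a center of gravity from the intensity histogram, can be illustrated with a simplified sketch. This is not FSL's code: the 2nd and 98th percentile cut-offs, the 10% background fraction, and the function name are assumptions made for illustration.

```python
import numpy as np


def head_center_of_gravity(volume, low_pct=2.0, high_pct=98.0, frac=0.1):
    """Estimate an intensity-weighted center of gravity, in voxel indices.

    The percentile cut-offs and `frac` are illustrative assumptions.
    """
    vol = np.asarray(volume, dtype=float)
    # Robust intensity range taken from the histogram (ignores outliers).
    t_low, t_high = np.percentile(vol, [low_pct, high_pct])
    # Voxels at or below this threshold are treated as background.
    threshold = t_low + frac * (t_high - t_low)
    # Weight foreground voxels by intensity, capped at the upper cut-off.
    weights = np.where(vol > threshold, np.minimum(vol, t_high), 0.0)
    total = weights.sum()
    if total == 0:
        raise ValueError("no voxels above the background threshold")
    index_grids = np.indices(vol.shape)
    return tuple(float((grid * weights).sum() / total) for grid in index_grids)


# Toy "head": a bright 3x3x3 block inside an otherwise empty 9x9x9 volume.
toy = np.zeros((9, 9, 9))
toy[2:5, 3:6, 4:7] = 100.0
cog = head_center_of_gravity(toy)
print(cog)  # (3.0, 4.0, 5.0)
```

A deformable-surface method would then place its initial sphere at this location before expanding it toward the brain boundary.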
3dSkullStrip is a modified version of BET that is incorporated in the AFNI toolkit. The algorithm begins by preprocessing the image to correct for spatial variations in image intensity and repositioning the brain to roughly the center of the image. Then a modified algorithm based on BET is used to expand a mesh sphere until it envelops the entire brain surface. Among the modifications are procedures to avoid the eyes and ventricles and operations to avoid cutting into the brain. The version of the AFNI toolkit used was AFNI_2011_12_21_1014.
HWA is a hybrid technique that uses a watershed algorithm in combination with a deformable surface algorithm. The watershed algorithm is first used to create an initial mask under the assumption of the connectivity of white matter. Then a deformable surface model is used to incorporate geometric constraints into the mask. The version of FreeSurfer used was 5.3.0.
To illustrate the use of the NFBS as testing data, it was used to compare the performance of BET, 3dSkullStrip and HWA for automatically skull-stripping the original NFB data. In a second analysis we compared the performance of the NFBS BEaST library to the default BEaST library and the three aforementioned methods. Each of the methods was used to skull-strip data from the IBSR (version 2.0) and LPBA40 [3, 4]. To ensure consistent image orientation across methods and datasets, all images were converted to LPI orientation (see footnote 1) using AFNI’s 3dresample program. Additionally, a step function was applied to all of the outputs using AFNI’s 3dcalc tool to binarize all of the generated masks.
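The effect of the step function can be shown with a small NumPy sketch. It is an illustration of the operation rather than the AFNI command that was run: it assumes the step function maps strictly positive voxel values to 1 and everything else to 0, and the function name is hypothetical.

```python
import numpy as np


def step_binarize(volume):
    """Return a binary mask: 1 where the voxel value is > 0, else 0."""
    return (np.asarray(volume) > 0).astype(np.uint8)


# Made-up intensities from a skull-stripped image (0 = removed tissue).
skull_stripped = np.array([[0.0, 12.5, 300.0],
                           [-1.0, 0.0, 87.0]])
mask = step_binarize(skull_stripped)
print(mask.tolist())  # [[0, 1, 1], [0, 0, 1]]
```

Binarizing every output in the same way means that the tools are compared on the voxels they keep, regardless of whether a tool returns a mask or an intensity image.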
The performance of the various methods was compared using the Dice similarity between the mask generated for an image and its corresponding reference (‘gold standard’) mask. Dice was calculated using: D=2·|A∩B|/(|A|+|B|), where A is the set of voxels in the test mask, B is the set of voxels in the gold standard data mask, A∩B is the intersection of A and B, and |·| is the number of voxels in a set. Dice was implemented in custom Python scripts that used the NiBabel neuroimaging package for data input. Dice coefficients were subsequently graphed as box plots using the ggplot2 package for the R statistical computing language.
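A minimal sketch of the Dice calculation follows. It is not the custom script used in this study: it works on NumPy arrays, treats any voxel greater than zero as belonging to the mask, and (as an assumption) returns 1.0 when both masks are empty. With NiBabel, the arrays could be read from NIfTI files with `nib.load(path).get_fdata()`; the toy arrays below stand in for real masks.

```python
import numpy as np


def dice_coefficient(test_mask, gold_mask):
    """Dice similarity D = 2|A∩B| / (|A| + |B|) between two binary masks."""
    a = np.asarray(test_mask) > 0   # A: voxels in the test mask
    b = np.asarray(gold_mask) > 0   # B: voxels in the gold standard mask
    if a.shape != b.shape:
        raise ValueError("masks must have the same shape")
    denom = int(a.sum()) + int(b.sum())
    if denom == 0:
        return 1.0  # assumption: two empty masks are treated as identical
    return 2.0 * int(np.logical_and(a, b).sum()) / denom


# Toy 3D example: two overlapping blocks standing in for brain masks.
test = np.zeros((4, 4, 4), dtype=np.uint8)
gold = np.zeros((4, 4, 4), dtype=np.uint8)
test[0:2, 0:2, 0:2] = 1   # 8 voxels
gold[1:3, 0:2, 0:2] = 1   # 8 voxels, 4 of them shared with `test`
print(dice_coefficient(test, gold))  # 0.5
```

Here |A| = |B| = 8 and |A∩B| = 4, so D = 2·4/(8+8) = 0.5; identical masks give D = 1 and disjoint masks give D = 0.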
Figure 2 displays box plots of the Dice coefficients that result from using NFBS as gold standard data. The results indicate that 3dSkullStrip performed significantly better than the two alternative methods, with HWA coming in second. In particular, average Dice similarity coefficients were 0.893 ± 0.027 for BET, 0.949 ± 0.009 for 3dSkullStrip, and 0.900 ± 0.011 for HWA. It is perhaps worth noting that BET, the method that performed worst on the NFBS library, took substantially more time to run (25 min) compared to 3dSkullStrip (2 min) and HWA (1 min).
Switching now from using NFBS as the repository of gold standard skull-stripped images to using the IBSR and LPBA40 repositories as the source of gold standard images, Fig. 3 shows box plots of the Dice similarity coefficients for BET, 3dSkullStrip, HWA, BEaST using beast-library-1.1, and BEaST using NFBS as the library of priors. For IBSR, 3dSkullStrip performs better than BET and HWA, similarly to NFBS. However, for LPBA40, BET performs much better than the other two algorithms. The BEaST method was also applied to the anatomical data in these repositories using two different methods: first with the original beast-library-1.1 set as the prior library, and second with the entire NFBS set as the prior library.
For the BEaST method, using NFBS as the prior library resulted in higher average Dice similarity coefficients and smaller standard deviations (see footnote 2). Differences in Dice coefficients between datasets may be due to the size and quality of the NFB study, as well as the pathology and age of the participants. In particular, the NFBS library of priors reflects a much wider range of individuals than does beast-library-1.1, which only contains 10 young individuals. There also may be differences in the standard of the masks, such as length of brainstem and inclusion of exterior nerves and sinuses.
Placing our results in the context of other skull-stripping comparisons, differences between the Dice coefficients reported here and values already published in the literature may be due to the version and implementation of the skull-stripping algorithms, a possibility that has received support in the literature. These differences may also result from our application of AFNI’s 3dcalc step function to the skull-stripped images in order to get a value determined more by brain tissue and less influenced by CSF. As the NFBS dataset is freely accessible by members of the neuroimaging community, these possibilities may be investigated by the interested researcher.
Importance for the neuroimaging community
In summary, we have created and shared the NFBS repository of high quality, skull-stripped T1-weighted anatomical images that is notable for its quality, its heterogeneity, and its ease of access. The procedure used to populate the repository combined the automated, state-of-the-art BEaST algorithm with meticulous hand editing to correct any residual brain extraction errors noticed on visual inspection. The manually corrected brain masks will be a valuable resource for improving the quality of preprocessing obtainable on the NFB data. The corresponding BEaST library will improve skull-stripping of future NFB releases and may outperform the default beast-library-1.1 on other datasets (see Fig. 3). Additionally, the corrected brain masks may be used as gold standards for comparing alternative brain extraction algorithms, as was illustrated in our preliminary analysis (see Fig. 2).
The NFBS repository is larger and more heterogeneous than many comparable datasets. It contains 125 skull-stripped images, is composed of images from individuals with ages ranging from 21–45, and represents individuals diagnosed with a wide range of psychiatric disorders (see Table 1). This variation is a crucial feature of NFBS, as it accounts for more than the average brain. Ultimately, this variation may prove useful for researchers interested in developing and evaluating predictive machine learning algorithms on both normal populations and those with brain disorders.
Finally, the repository is completely open to the neuroscience community. NFBS contains no sensitive personal health information, so researchers interested in using it may do so without submitting an application or signing a data usage agreement. This is in contrast to datasets such as the one collected by the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Researchers can use ADNI to develop and test skull-stripping algorithms, but in order to do so must first apply and sign a data usage agreement, which bars them from distributing the results of their efforts. Thus, we feel that NFBS has the potential to accelerate the pace of discovery in the field, a view that resonates with perspectives on the importance of making neuroimaging repositories easy to access and easy to use.
1 This refers to the manner in which the 3D image data are saved in the file. With LPI orientation, the voxel at memory location (0,0,0) is located at the leftmost, posterior, inferior voxel in the image. As the indices increase, they scan the voxels from left-to-right, along lines that advance from posterior-to-anterior, and planes that advance from inferior-to-superior. Additional details concerning the orientation of MRI images are available online.
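Orientation codes are a common source of confusion because some tools name the location of the first voxel, as in the LPI convention described here, while others name the direction in which each index increases (to our understanding, NiBabel's axis codes follow the latter convention). The sketch below converts between the two; the function and dictionary names are hypothetical.

```python
# Opposite end of each anatomical axis.
OPPOSITE = {"L": "R", "R": "L", "P": "A", "A": "P", "I": "S", "S": "I"}
# Which anatomical axis each letter belongs to.
AXIS = {"L": 0, "R": 0, "P": 1, "A": 1, "I": 2, "S": 2}


def origin_code_to_direction_code(origin_code):
    """Convert a code naming where voxel (0,0,0) sits (e.g. 'LPI') into a
    code naming the direction in which each index increases (e.g. 'RAS')."""
    code = origin_code.upper()
    if len(code) != 3 or any(c not in OPPOSITE for c in code):
        raise ValueError("expected three letters from L, R, P, A, I, S")
    if len({AXIS[c] for c in code}) != 3:
        raise ValueError("each anatomical axis must appear exactly once")
    return "".join(OPPOSITE[c] for c in code)


print(origin_code_to_direction_code("LPI"))  # RAS
```

Under this reading, an image stored with the first voxel at the left, posterior, inferior corner is one whose indices increase toward the right, anterior, and superior directions.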
2 BEaST was unable to segment 1 subject, IBSR_11, in IBSR, only when using beast-library-1.1. For LPBA40, BEaST was also unable to segment 1 subject, S35, when using beast-library-1.1 and NFBS. These subjects were left out of the Dice calculations.
ADNI: Alzheimer’s Disease Neuroimaging Initiative
BEaST: Brain extraction based on nonlocal segmentation technique
BET: Brain Extraction Tool
HWA: Hybrid Watershed Algorithm
IBSR: Internet Brain Segmentation Repository
LPBA40: LONI Probabilistic Brain Atlas
MRI: Magnetic resonance imaging
NKI: Nathan Kline Institute
UCLA: University of California, Los Angeles
Mueller SG, Weiner MW, Thal LJ, Petersen RC, Jack CR, Jagust W, Trojanowski JQ, Toga AW, Beckett L. Ways toward an early diagnosis in Alzheimer’s disease: the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Alzheimers Dement. 2005; 1(1):55–66.
Shattuck DW, Mirza M, Adisetiyo V, Hojatkashani C, Salamon G, Narr KL, Poldrack RA, Bilder RM, Toga AW. Construction of a 3d probabilistic atlas of human cortical structures. Neuroimage. 2008; 39(3):1064–1080.
Marcus DS, Wang TH, Parker J, Csernansky JG, Morris JC, Buckner RL. Open Access Series of Imaging Studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J Cogn Neurosci. 2007; 19(9):1498–1507.
Iglesias JE, Liu CY, Thompson PM, Tu Z. Robust brain extraction across datasets and comparison with publicly available methods. IEEE Trans Med Imaging. 2011; 30(9):1617–1634. doi:10.1109/TMI.2011.2138152.
Nooner KB, Colcombe S, Tobe R, Mennes M, Benedict M, Moreno A, Panek L, Brown S, Zavitz S, Li Q, et al. The NKI-Rockland Sample: a model for accelerating the pace of discovery science in psychiatry. Front Neurosci. 2012; 6:152.
Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K, Sanislow C, Wang P. Research Domain Criteria (RDoC): toward a new classification framework for research on mental disorders. Am J Psychiatry. 2010; 167(7):748–51.
First MB, Spitzer RL, Gibbon M, Williams JB. Structured clinical interview for DSM-IV-TR Axis I disorders, research version, non-patient edition. (Technical report, SCID-I/NP). New York: New York State Psychiatric Institute Biometrics Research; 2002.
Wang Y, Nie J, Yap PT, Li G, Shi F, Geng X, Guo L, Shen D. Knowledge-guided robust MRI brain extraction for diverse large-scales neuroimaging studies on humans and non-human primates. PLoS ONE. 2014; 9(1):1–23. doi:10.1371/journal.pone.0077810.
Leung KK, Barnes J, Modat M, Ridgway GR, Bartlett JW, Fox NC, Ourselin S. Brain MAPS: an automated, accurate and robust brain extraction technique using a template library. NeuroImage. 2011; 55(3):1091–1108.
R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing; 2008. ISBN 3-900051-07-0. http://www.R-project.org.
We would like to thank Dr. Simon Fristed Eskildsen for help with the installation and optimization of the BEaST method. We would also like to acknowledge Qingyang Li for creating the BEaST guide, as well as the Bash script that we based our script on. Lastly, we would like to thank all of those involved in the participation, data collection, and data sharing initiative of the Enhanced Rockland Sample.
This work was supported by R01MH101555 from the National Institute of Mental Health to RCC.
RCC designed the Neurofeedback study and Skull-stripped repository; BP and EST performed manual correction and validation of results; BP performed the validation analyses; BP, RCC, JSP, and JPP wrote the data note. All authors read and approved of the final version.
The authors declare that they have no competing interests.
Consent for publication
All participants consented to have their data shared.
Ethics approval and consent to participate
All experimental procedures were performed with approval of the Nathan S. Kline Institute for Psychiatric Research institutional review board and only after informed consent was obtained.
Authors and Affiliations
Computational Neuroimaging Lab, Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, Orangeburg, 10962, NY, USA
Benjamin Puccio, Elise C. Taverna & R. Cameron Craddock
Center for the Developing Brain, Child Mind Institute, 445 Park Ave, New York, 10022, NY, USA
James P. Pooley, John S. Pellman & R. Cameron Craddock
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.