Analysis and validation of automated skull stripping tools: A validation study based on 296 MR images from the Honolulu Asia aging study

S. W. Hartley, A. I. Scher*, E. S.C. Korf, L. R. White, L. J. Launer

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

39 Scopus citations


As population-based epidemiologic studies may acquire images from thousands of subjects, automated image post-processing is needed. However, error in these methods may be biased and related to subject characteristics relevant to the research question. Here, we compare two automated methods of brain extraction against manually segmented images and evaluate whether method accuracy is associated with subject demographic and health characteristics. MRI data (n = 296) are from the Honolulu Asia Aging Study, a population-based study of elderly Japanese-American men. The intracranial space was manually outlined on the axial proton density sequence by a single operator. The brain was extracted automatically using BET (Brain Extraction Tool) and BSE (Brain Surface Extractor) on axial proton density images. Total intracranial volume was calculated for the manually segmented images (ticvM), the BET segmented images (ticvBET) and the BSE segmented images (ticvBSE). Mean ticvBSE was closer to mean ticvM, but ticvBET was more highly correlated with ticvM than ticvBSE was. BSE both significantly overestimated (positive error) and underestimated (negative error) ticv, but its net error was relatively low. BET had large positive error and very low negative error. Method accuracy, measured in percent positive and negative error, varied slightly with age, head circumference, presence of the apolipoprotein E ε4 polymorphism, subcortical and cortical infarcts and enlarged ventricles. This epidemiologic approach to the assessment of potential bias in image post-processing tasks shows that both skull-stripping programs performed well in this large image dataset when compared to manually segmented images. Although method accuracy was statistically associated with some subject characteristics, the extent of the misclassification (in terms of percent of brain volume) was small.
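
The distinction the abstract draws between positive, negative and net error can be illustrated with a minimal sketch. It is not the authors' code: the function name `percent_errors`, the flat 0/1 voxel lists and the choice of the manual mask volume as the denominator are all assumptions made for illustration. Positive error counts voxels an automated mask includes that the manual mask does not, negative error counts voxels it misses, and net error is their difference.

```python
def percent_errors(manual, auto):
    """Return (positive %, negative %, net %) error of an automated mask.

    Both masks are flat sequences of 0/1 voxel labels of equal length.
    Assumption: errors are expressed as a percent of the manual mask volume.
    """
    if len(manual) != len(auto):
        raise ValueError("masks must have the same number of voxels")
    manual_count = sum(1 for m in manual if m)
    if manual_count == 0:
        raise ValueError("manual mask is empty")
    # Voxels the automated method kept that the operator excluded.
    false_pos = sum(1 for m, a in zip(manual, auto) if a and not m)
    # Voxels the operator kept that the automated method removed.
    false_neg = sum(1 for m, a in zip(manual, auto) if m and not a)
    pos = 100.0 * false_pos / manual_count
    neg = 100.0 * false_neg / manual_count
    return pos, neg, pos - neg


# Toy 8-voxel masks (hypothetical data, not from the study).
manual = [0, 1, 1, 1, 1, 0, 0, 0]
over_only = [1, 1, 1, 1, 1, 1, 0, 0]  # two extra voxels, none missed
balanced = [0, 0, 1, 1, 1, 1, 0, 0]   # one extra voxel, one missed

print(percent_errors(manual, over_only))  # (50.0, 0.0, 50.0)
print(percent_errors(manual, balanced))   # (25.0, 25.0, 0.0)
```

The second toy mask has the same total volume as the manual mask, so its net error is zero even though a quarter of the voxels are misplaced in each direction. This is why a method can match the mean manual volume closely while still showing substantial positive and negative error.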

Original language: English
Pages (from-to): 1179-1186
Number of pages: 8
Issue number: 4
State: Published - 1 May 2006
Externally published: Yes

