Assessment of Machine Learning Detection of Environmental Enteropathy and Celiac Disease in Children

Sana Syed, Mohammad Al-Boni, Marium N. Khan, Kamran Sadiq, Najeeha T. Iqbal, Christopher A. Moskaluk, Paul Kelly, Beatrice Amadi, S. Asad Ali, Sean R. Moore, Donald E. Brown

Research output: Contribution to journal › Article › peer-review

29 Scopus citations


Importance: Duodenal biopsies from children with enteropathies associated with undernutrition, such as environmental enteropathy (EE) and celiac disease (CD), display significant histopathological overlap.
Objective: To develop a convolutional neural network (CNN) to enhance the detection of pathologic morphological features in diseased vs healthy duodenal tissue.
Design, Setting, and Participants: In this prospective diagnostic study, a CNN consisting of 4 convolutions, 1 fully connected layer, and 1 softmax layer was trained on duodenal biopsy images. Data were provided by 3 sites: Aga Khan University Hospital, Karachi, Pakistan; University Teaching Hospital, Lusaka, Zambia; and University of Virginia, Charlottesville. Duodenal biopsy slides from 102 children (10 with EE from Aga Khan University Hospital, 16 with EE from University Teaching Hospital, 34 with CD from University of Virginia, and 42 with no disease from University of Virginia) were converted into 3118 images. The CNN was designed and analyzed at the University of Virginia. The data were collected, prepared, and analyzed between November 2017 and February 2018.
Main Outcomes and Measures: Classification accuracy of the CNN per image and per case, and the incorrect classification rate identified by aggregated 10-fold cross-validation confusion/error matrices of CNN models.
Results: Overall, 102 children participated in this study, with a median (interquartile range) age of 31.0 (20.3-75.5) months and a roughly equal sex distribution, with 53 boys (51.9%). The model demonstrated 93.4% case-detection accuracy and had a false-negative rate of 2.4%. Confusion metrics indicated that most incorrect classifications were between patients with CD and healthy patients. Visualization of feature map activations showed that the model had learned distinctive patterns, including microlevel features in duodenal tissues, such as alterations in secretory cell populations.
Conclusions and Relevance: A machine learning-based histopathological analysis model demonstrating 93.4% classification accuracy was developed for identifying and differentiating between duodenal biopsies from children with EE and CD. The combination of the CNN with a deconvolutional network enabled feature recognition and highlighted secretory cells' role in the model's ability to differentiate between these histologically similar diseases.
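
The outcome measures above (accuracy per image and per case, read from confusion matrices aggregated over cross-validation folds) can be illustrated with a small sketch. This is not the authors' code: the counts are made-up toy values, the function names are hypothetical, and the majority-vote rule for turning per-image predictions into a per-case label is an assumption, since the abstract does not state how case-level labels were derived.

```python
from collections import Counter, defaultdict


def aggregate_confusion(folds):
    """Sum per-fold confusion counts keyed by (true_label, predicted_label)."""
    total = Counter()
    for fold in folds:
        total.update(fold)
    return total


def accuracy(confusion):
    """Fraction of items on the diagonal (true_label == predicted_label)."""
    correct = sum(n for (true, pred), n in confusion.items() if true == pred)
    total = sum(confusion.values())
    return correct / total if total else 0.0


def case_predictions(image_predictions):
    """Assumed rule: each case takes the most frequent label among its images.

    Ties go to the label seen first for that case.
    """
    votes = defaultdict(Counter)
    for case_id, label in image_predictions:
        votes[case_id][label] += 1
    return {case: counts.most_common(1)[0][0] for case, counts in votes.items()}


# Toy per-fold confusion counts (illustrative only, not study data).
toy_folds = [
    Counter({("EE", "EE"): 5, ("CD", "CD"): 4, ("CD", "healthy"): 1}),
    Counter({("healthy", "healthy"): 6, ("healthy", "CD"): 1, ("EE", "EE"): 3}),
]
toy_total = aggregate_confusion(toy_folds)
toy_accuracy = accuracy(toy_total)

# Toy per-image predictions as (case_id, predicted_label) pairs.
toy_cases = case_predictions(
    [("a", "CD"), ("a", "CD"), ("a", "healthy"), ("b", "EE")]
)
```

Summing the fold matrices before computing accuracy weights every held-out image equally, which is one common reading of "aggregated" cross-validation results; averaging per-fold accuracies would be a different choice.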

Original language: English
Pages (from-to): e195822
Journal: JAMA Network Open
Issue number: 6
State: Published - 5 Jun 2019
Externally published: Yes

