A prognostic system for epithelial ovarian carcinomas using machine learning

Philip M. Grimley, Zhenqiu Liu, Kathleen M. Darcy, Matthew T. Hueman, Huan Wang, Li Sheng, Donald E. Henson, Dechang Chen*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

9 Scopus citations


Introduction: Integrating additional factors into the International Federation of Gynecology and Obstetrics (FIGO) staging system is needed for accurate patient classification and survival prediction. In this study, we tested machine learning as a novel tool for incorporating additional prognostic parameters into the conventional FIGO staging system for stratifying patients with epithelial ovarian carcinomas and evaluating their survival.
Material and methods: Cancer-specific survival data for epithelial ovarian carcinomas were extracted from the Surveillance, Epidemiology, and End Results (SEER) program. Two datasets were constructed based upon the year of diagnosis. Dataset 1 (39 514 cases) was limited to primary tumor (T), regional lymph nodes (N) and distant metastasis (M). Dataset 2 (25 291 cases) included additional parameters of age at diagnosis (A) and histologic type and grade (H). The Ensemble Algorithm for Clustering Cancer Data (EACCD) was applied to generate prognostic groups with depiction in dendrograms. C-indices provided dendrogram cutoffs and comparisons of prediction accuracy.
Results: Dataset 1 was stratified into nine epithelial ovarian carcinoma prognostic groups, contrasting with 10 groups from FIGO methodology. The EACCD grouping had a slightly higher accuracy in survival prediction than FIGO staging (C-index = 0.7391 vs 0.7371, increase in C-index = 0.0020, 95% confidence interval [CI] 0.0012–0.0027, p = 1.8 × 10⁻⁷). Nevertheless, there remained a strong inter-system association between EACCD and FIGO (rank correlation = 0.9480, p = 6.1 × 10⁻¹⁵). Analysis of Dataset 2 demonstrated that A and H could be smoothly integrated with the T, N and M criteria. Survival data were stratified into nine prognostic groups with an even higher prediction accuracy (C-index = 0.7605) than when using only T, N and M.
Conclusions: EACCD was successfully applied to integrate A and H with T, N and M for stratification and survival prediction of epithelial ovarian carcinoma patients. Additional factors could be advantageously incorporated to test the prognostic impact of emerging diagnostic or therapeutic advances.
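Illustrative note: the abstract compares grouping systems by their C-index. The sketch below shows one common way a concordance index can be computed when the "risk score" is simply an ordered prognostic group number. It is a simplified Harrell-style calculation written for illustration, not the authors' implementation; the function name `c_index`, the toy survival times, event flags and group labels are all hypothetical, and pairs with tied survival times are skipped for simplicity.

```python
from itertools import combinations


def c_index(times, events, risk):
    """Simplified Harrell-style concordance index.

    times  : observed survival or censoring times
    events : 1 if the event (death from cancer) was observed, 0 if censored
    risk   : risk score per patient; higher means worse expected survival
             (here, an ordered prognostic group number)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter time ended in an event.
        # Tied times are skipped in this simplified version.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0   # higher risk died earlier: concordant
        elif risk[a] == risk[b]:
            concordant += 0.5   # same group: counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: six patients assigned to three prognostic groups.
times = [5, 12, 20, 30, 42, 60]
events = [1, 1, 0, 1, 0, 0]
groups = [3, 3, 2, 2, 1, 1]  # 3 = worst prognosis, 1 = best

print(c_index(times, events, groups))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. Because patients in the same group share a score, coarser groupings produce more tied pairs, which is one reason the number of groups matters when comparing staging systems by C-index.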

Original language: English
Pages (from-to): 1511-1519
Number of pages: 9
Journal: Acta Obstetricia et Gynecologica Scandinavica
Issue number: 8
State: Published - Aug 2021
Externally published: Yes


  • C-index
  • dendrogram
  • machine learning
  • ovarian carcinoma
  • staging
  • survival

