Robust and accurate estimation of cellular fraction from tissue omics data via ensemble deconvolution

Manqi Cai, Molin Yue, Tianmeng Chen, Jinling Liu, Erick Forno, Xinghua Lu, Timothy Billiar, Juan Celedón, Chris Mckennan, Wei Chen*, Jiebiao Wang*

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

10 Scopus citations

Abstract

Motivation: Tissue-level omics data such as transcriptomics and epigenomics are an average across diverse cell types. To extract cell-type-specific (CTS) signals, dozens of cellular deconvolution methods have been proposed to infer cell-type fractions from tissue-level data. However, these methods produce vastly different results under various real data settings. Simulation-based benchmarking studies showed no universally best deconvolution approaches. There have been attempts of ensemble methods, but they only aggregate multiple single-cell references or reference-free deconvolution methods. Results: To achieve a robust estimation of cellular fractions, we proposed EnsDeconv (Ensemble Deconvolution), which adopts CTS robust regression to synthesize the results from 11 single deconvolution methods, 10 reference datasets, 5 marker gene selection procedures, 5 data normalizations and 2 transformations. Unlike most benchmarking studies based on simulations, we compiled four large real datasets of 4937 tissue samples in total with measured cellular fractions and bulk gene expression from different tissues. Comprehensive evaluations demonstrated that EnsDeconv yields more stable, robust and accurate fractions than existing methods. We illustrated that EnsDeconv estimated cellular fractions enable various CTS downstream analyses such as differential fractions associated with clinical variables. We further extended EnsDeconv to analyze bulk DNA methylation data.

Original languageEnglish
Pages (from-to)3004-3010
Number of pages7
JournalBioinformatics
Volume38
Issue number11
DOIs
StatePublished - 1 Jun 2022
Externally publishedYes

Fingerprint

Dive into the research topics of 'Robust and accurate estimation of cellular fraction from tissue omics data via ensemble deconvolution'. Together they form a unique fingerprint.

Cite this