TY - JOUR
T1 - Unraveling the complex relationship between mRNA and protein abundances
T2 - a machine learning-based approach for imputing protein levels from RNA-seq data
AU - Prabahar, Archana
AU - Zamora, Ruben
AU - Barclay, Derek
AU - Yin, Jinling
AU - Ramamoorthy, Mahesh
AU - Bagheri, Atefeh
AU - Johnson, Scott A.
AU - Badylak, Stephen
AU - Vodovotz, Yoram
AU - Jiang, Peng
N1 - Publisher Copyright: © The Author(s) 2024. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
PY - 2024/3/1
Y1 - 2024/3/1
AB - The correlation between messenger RNA (mRNA) and protein abundances has long been debated. RNA sequencing (RNA-seq), a high-throughput, commonly used method for analyzing transcriptional dynamics, leaves questions about whether we can translate RNA-seq-identified gene signatures directly to protein changes. In this study, we utilized a set of 17 widely assessed immune and wound healing mediators in the context of canine volumetric muscle loss to investigate the correlation of mRNA and protein abundances. Our data reveal an overall agreement between mRNA and protein levels on these 17 mediators when examining samples from the same experimental condition (e.g. the same biopsy). However, we observed a lack of correlation between mRNA and protein levels for individual genes under different conditions, underscoring the challenges in converting transcriptional changes into protein changes. To address this discrepancy, we developed a machine learning model to predict protein abundances from RNA-seq data, achieving high accuracy. Our approach also effectively corrected multiple extreme outliers measured by antibody-based protein assays. Additionally, this model has the potential to detect post-translational modification events, as shown by accurately estimating activated transforming growth factor β1 levels. This study presents a promising approach for converting RNA-seq data into protein abundance and its biological significance.
UR - http://www.scopus.com/inward/record.url?scp=85184926243&partnerID=8YFLogxK
DO - 10.1093/nargab/lqae019
M3 - Article
AN - SCOPUS:85184926243
SN - 2631-9268
VL - 6
JO - NAR Genomics and Bioinformatics
JF - NAR Genomics and Bioinformatics
IS - 1
M1 - lqae019
ER -