Revealing facts and avoiding biases: A review of several common problems in statistical analyses of epidemiological data

Lihan Yan*, Yongmin Sun, Michael R. Boivin, Paul O. Kwon, Yuanzhang Li

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

8 Scopus citations

Abstract

This paper reviews, for epidemiologists, several common challenges encountered in statistical analyses of epidemiological data. We focus on the application of linear regression, multivariate logistic regression, and log-linear modeling to epidemiological data. Specific topics include: (a) deletion of outliers, (b) heteroscedasticity in linear regression, (c) limitations of principal component analysis in dimension reduction, (d) hazard ratio vs. odds ratio in a rate comparison analysis, (e) log-linear models with multiple response data, and (f) ordinal logistic vs. multinomial logistic models. As a general rule, a thorough examination of a model's assumptions against both current data and prior research should precede its use in estimating effects.
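As an illustration of why the choice of ratio measure in topic (d) matters, the following minimal sketch compares the relative risk and the odds ratio computed from a 2x2 table. It is not taken from the paper: the function name `two_by_two_measures` and all counts are hypothetical, chosen only to show that the two measures nearly agree when the outcome is rare and diverge when it is common.

```python
def two_by_two_measures(a, b, c, d):
    """Return (relative_risk, odds_ratio) for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All counts here are hypothetical illustration values.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    relative_risk = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)
    return relative_risk, odds_ratio


# Rare outcome (1-2% risk): the odds ratio stays close to the relative risk.
rare = two_by_two_measures(20, 980, 10, 990)

# Common outcome (20-40% risk): the odds ratio overstates the relative risk.
common = two_by_two_measures(400, 600, 200, 800)

print(f"rare outcome:   RR={rare[0]:.2f}, OR={rare[1]:.2f}")
print(f"common outcome: RR={common[0]:.2f}, OR={common[1]:.2f}")
```

In both hypothetical tables the relative risk is 2, yet the odds ratio rises from about 2.02 to about 2.67 as the outcome becomes more common, so reading an odds ratio as if it were a risk or rate ratio is only safe when the outcome is rare.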

Original language: English
Article number: 207
Journal: Frontiers in Public Health
Volume: 4
Issue number: OCT
State: Published - 7 Oct 2016
Externally published: Yes

Keywords

  • Epidemiology
  • Hazard ratio
  • Log-linear
  • Logistic
  • Odds ratio
  • Principal component analysis
  • Regression
  • Relative risk
