Breast cancer relative hazard estimates from case-control and cohort designs with missing data on mammographic density

Jinbo Chen*, Rajeev Ayyagari, Nilanjan Chatterjee, David Y. Pee, Catherine Schairer, Celia Byrne, Jacques Benichou, Mitchell H. Gail

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


We analyzed data from the Breast Cancer Detection Demonstration Project (BCDDP) to obtain multivariate relative hazard models for breast cancer that included mammographic density (MD) in addition to standard risk factors. The BCDDP data came from a stratified case-control study in the screening phase (1973-1980) and from follow-up of three subcohorts in the follow-up phase (1980-1995). In both phases, MD measurements were available for only about half of the women who developed breast cancer (cases) and for a small fraction of noncases. For the stratified case-control study, we used a logistic regression model and developed a general pseudo-likelihood approach to accommodate the missing covariate data (MD) by adapting the methods of Scott and Wild and of Breslow and Holubkov. We showed that this method was substantially more efficient than a previously proposed weighted-likelihood method. For each subcohort, we assumed a piecewise exponential model, with the distribution of the missing covariate (MD) conditional on the observed information modeled by polytomous logistic regression. We developed an EM algorithm for estimation that allowed for time-varying covariates, incomplete follow-up, and left truncation. We analyzed the three follow-up subcohorts separately and then combined the relative hazard models from the case-control and cohort data. The final model included main effects for MD, weight, age at first live birth, number of previous breast biopsies, and number of first-degree relatives (mother or sisters) with breast cancer. It was more discriminating (higher concordance) than the original model of Gail et al., which included standard risk factors but not MD. In separate work, we combined this relative hazard model with other data to project absolute breast cancer risk.
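As an illustration of the cohort model named in the abstract, the following minimal sketch evaluates a piecewise exponential log-likelihood with left truncation (delayed entry) for fully observed covariates. It is not the authors' implementation: the function names, the tuple layout of `subjects`, and the restriction to time-fixed covariates are assumptions made for illustration, and the EM step for missing MD is not shown.

```python
import math
from bisect import bisect_right


def exposure_by_interval(entry, exit_, cuts):
    """Time at risk in each interval [cuts[k], cuts[k+1]) between entry and exit.

    Counting exposure only from `entry` onward is what handles left truncation.
    """
    return [max(0.0, min(exit_, hi) - max(entry, lo))
            for lo, hi in zip(cuts[:-1], cuts[1:])]


def interval_index(t, cuts):
    """Index of the interval containing t (last interval closed on the right)."""
    return min(bisect_right(cuts, t) - 1, len(cuts) - 2)


def log_likelihood(subjects, log_baseline, beta, cuts):
    """Piecewise exponential log-likelihood (hypothetical helper).

    subjects: iterable of (entry, exit, event, x), x a list of time-fixed
              covariates (an assumption; the paper allows time-varying ones).
    log_baseline: log baseline hazard, one value per interval.
    beta: log relative hazards, one value per covariate.
    """
    total = 0.0
    for entry, exit_, event, x in subjects:
        lin = sum(b * xi for b, xi in zip(beta, x))
        exposure = exposure_by_interval(entry, exit_, cuts)
        # Cumulative hazard: hazard in each interval times time at risk in it.
        total -= sum(math.exp(a + lin) * e
                     for a, e in zip(log_baseline, exposure))
        if event:
            # An event contributes the log hazard at the event time.
            total += log_baseline[interval_index(exit_, cuts)] + lin
    return total
```

For example, with `cuts = [0, 5, 10]`, a woman who enters follow-up at time 2 and is diagnosed at time 7 contributes 3 units of exposure to the first interval and 2 to the second. Maximizing this function over `log_baseline` and `beta` would give relative hazard estimates only under the stated simplifications.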

Original language: English
Pages (from-to): 976-988
Number of pages: 13
Journal: Journal of the American Statistical Association
Issue number: 483
State: Published - Sep 2008
Externally published: Yes


  • Missing at random
  • Piecewise exponential model
  • Pseudo-likelihood
  • Two-phase stratified case-control study

