TY - JOUR
T1 - Breast cancer relative hazard estimates from case-control and cohort designs with missing data on mammographic density
AU - Chen, Jinbo
AU - Ayyagari, Rajeev
AU - Chatterjee, Nilanjan
AU - Pee, David Y.
AU - Schairer, Catherine
AU - Byrne, Celia
AU - Benichou, Jacques
AU - Gail, Mitchell H.
PY - 2008/9
Y1 - 2008/9
N2 - We analyzed data from the Breast Cancer Detection Demonstration Project (BCDDP) to obtain multivariate relative hazard models for breast cancer that included mammographic density (MD) in addition to standard risk factors. Data from the BCDDP were collected from a stratified case-control study in the screening phase (1973-1980) and from follow-up of three subcohorts in the follow-up phase (1980-1995). For both phases, MD measurements were only available for about half the women who developed breast cancer (cases) and a small fraction of noncases. We used a logistic regression model for the stratified case-control study and developed a general pseudo-likelihood approach to accommodate missing covariate data (MD) by adapting the method of Scott and Wild and Breslow and Holubkov. We showed that this method was substantially more efficient than a previously proposed weighted-likelihood method. We assumed piecewise exponential models for the analysis of each subcohort, with the missing covariate (MD) distribution conditional on the observed information modeled with polytomous logistic regression. We developed an EM algorithm for estimation, which allowed for time-varying covariates, incomplete follow-up, and left truncation. We analyzed the three follow-up subcohorts separately and then combined the relative hazard models from the case-control and cohort data. The final model included main effects for MD, weight, age at first live birth, number of previous breast biopsies, and number of sisters or mother with breast cancer and was more discriminating (higher concordance) than the original model of Gail et al., which included standard risk factors but not MD. In a separate work, we combined this relative hazard model with other data to project absolute breast cancer risk.
KW - Missing at random
KW - Piecewise exponential model
KW - Pseudo-likelihood
KW - Two-phase stratified case-control study
UR - http://www.scopus.com/inward/record.url?scp=54949156455&partnerID=8YFLogxK
U2 - 10.1198/016214508000000120
DO - 10.1198/016214508000000120
M3 - Article
AN - SCOPUS:54949156455
SN - 0162-1459
VL - 103
SP - 976
EP - 988
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 483
ER -