TY - JOUR
T1 - Real-World Pitfalls of Analyzing Real-World Data
T2 - A Cautionary Note and Path Forward
AU - Cooper, John D.
AU - Shou, Karen
AU - Sunderland, Kevin
AU - Pham, Kevin
AU - Thornton, Jennifer A.
AU - DeStefano, Christin B.
PY - 2023
Y1 - 2023/9/1
AB - PURPOSE: Real-world data (RWD) are pervasive in oncology research and offer insights into clinical trends and patient outcomes. However, RWD have shortcomings, making them prone to pitfalls during survival analyses. The American Society of Clinical Oncology (ASCO) CancerLinQ Discovery (CLQD) multiple myeloma (MM) data set was used to demonstrate some common pitfalls when analyzing survival from RWD: using incorrect surrogate markers for missing data and/or classification errors, ignoring deaths at time zero, and failing to account for guarantee-time bias. METHODS: The ASCO CLQD MM data set (July 19, 2021, release) was used to compare overall survival (OS) in patients with a known versus presumed date of MM diagnosis, in patients with secondary AML (sAML) with early deaths (ie, 0 months) included versus dropped, and in patients with second primary malignancies (SPMs) matched versus unmatched to control for time-related confounding factors (ie, guarantee-time bias). Analyses were conducted using STATA Version 17.0 (College Station, TX). RESULTS: In the CLQD MM data set, 28% of patients were missing a diagnosis date. Attempts to use the presumed diagnosis date (ie, first bortezomib or lenalidomide administration) as a surrogate marker for missing diagnosis dates were not successful as median OS was significantly different in patients with a recorded versus presumed diagnosis date (107 v 40 months, hazard ratio [HR], 2.5; 95% CI, 2.39 to 2.64; P < .001). Dropping deaths within 1 month of sAML diagnosis resulted in an exaggerated median OS (46 v 39 months). OS in patients with MM with SPMs differed substantially before and after incorporation of matching methods to account for guarantee-time bias (HR, 0.73; 95% CI, 0.67 to 0.78; P < .001 before matching, HR, 1.30; 95% CI, 1.18 to 1.43; P < .001 after matching). CONCLUSION: To fully maximize the benefits of RWD in oncology research, clinicians must be aware of analytic methods that can overcome pitfalls in survival analyses.
UR - http://www.scopus.com/inward/record.url?scp=85171809455&partnerID=8YFLogxK
U2 - 10.1200/CCI.23.00097
DO - 10.1200/CCI.23.00097
M3 - Article
C2 - 37729597
AN - SCOPUS:85171809455
SN - 2473-4276
VL - 7
SP - e2300097
JO - JCO Clinical Cancer Informatics
JF - JCO Clinical Cancer Informatics
ER -