Survival prediction in advanced NSCLC (aNSCLC) amid evolving standards of care (SOC): Digital twin modeling incorporating LLM-extracted clinical context

Published

November 2025

Citation

Estevez M, Griffith S, Williams T, et al. Survival prediction in advanced NSCLC (aNSCLC) amid evolving standards of care (SOC): Digital twin modeling incorporating LLM-extracted clinical context. ESMO AI & Digital Oncology. 2025.

Overview

The integration of digital twin models with real-world data (RWD) is revolutionizing personalized healthcare and accelerating drug development. This study aims to predict overall survival (OS) for patients with advanced non-small cell lung cancer (aNSCLC) receiving first-line standard of care (SOC). In addition, the study explores the reasons behind the choice of first-line treatment, particularly the omission of immunotherapy (IO) in favor of chemotherapy alone, to improve the clinical relevance of the predictive model.

Using the Flatiron Health Research Database, researchers trained a random survival forest (RSF) model to predict OS for patients receiving chemotherapy plus IO (chemo+IO) or chemotherapy alone, incorporating clinical variables extracted by large language models (LLMs). The study analyzed 9,050 patients and identified factors such as PD-L1 status and autoimmune comorbidities as key reasons for IO omission. The researchers found that the RSF, trained on RWD from the electronic health record (EHR), produced well-calibrated and discriminative OS predictions.
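
To make "discriminative" concrete: one common way to measure how well a survival model ranks patients is Harrell's concordance index (C-index). The summary above does not state which metrics the study used, so the following is only an illustrative sketch under that assumption. The function name `concordance_index` and the toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data (illustrative).

    A pair (i, j) is comparable when patient i had an observed event and
    a strictly shorter follow-up time than patient j. The pair is
    concordant when the model assigns patient i the higher risk score.
    Tied risk scores count as half. Tied times are ignored for simplicity.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot anchor a comparable pair.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, event flag (True = death
# observed, False = censored), and a model-assigned risk score.
toy_times = [5.0, 10.0, 15.0, 20.0]
toy_events = [True, True, False, True]
toy_risk = [0.9, 0.7, 0.5, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 1.0 means the model ranks every comparable pair correctly, 0.5 is no better than chance, and 0.0 means every pair is ranked in the wrong order. Calibration is a separate property (whether predicted survival probabilities match observed frequencies) and is not captured by this index.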

Why this matters

This research demonstrates the ability to accurately predict patient outcomes and identify reasons for treatment choices, enhancing the clinical relevance and transparency of predictive models. The approach supports the development of digital twin models that could enable more personalized and informed treatment strategies, ultimately improving patient care in rapidly evolving treatment landscapes.
