Our summary
The digitization of healthcare has increased the availability of real-world data (RWD), which is important for generating real-world evidence (RWE) to support comparative effectiveness research, better understand patient populations and clinical outcomes, and inform clinical trial planning, including improving clinical trial diversity and representativeness. However, an estimated 80% of RWD is trapped within unstructured documents such as clinician notes or scanned lab reports, making relevant data difficult to extract. Traditional methods such as manual chart review by clinical experts are time-consuming and resource-intensive, limiting the scale and scope of research.
Natural language processing (NLP) with machine learning (ML) techniques (i.e., ML-extraction) is increasingly being applied to electronic health records (EHRs) to extract clinically relevant information and enable research and RWE generation at scale.
While there is growing attention to and guidance around RWD, a gap remains in evaluating the quality and performance of RWD curated through ML extraction. Flatiron Health researchers previously developed a research-centric evaluation framework to assess ML-extracted RWD and provide insights on model performance, strengths, limitations, and fitness-for-use. However, that framework primarily focused on evaluating a single ML-extracted variable. Replication analyses using datasets containing several ML-extracted variables are needed to understand whether analytic results and scientific conclusions are reproducible between ML extraction and traditional processing approaches (i.e., expert abstraction).
In this study, researchers designed example oncology retrospective studies for common archetypes describing how EHR-derived data are used in observational research, such as defining baseline characteristics, describing the natural history of disease, and measuring comparative treatment effectiveness. For each archetype, they evaluated whether using ML-extracted data leads to the same conclusions as using expert-abstracted data.
Why this matters
This study evaluates the quality and performance of ML-extracted RWD and its use for research and RWE generation at scale. Such evaluation has become crucial for clinical and health outcomes research as researchers seek to understand smaller patient populations and evolving standards of care.