Ensuring reliability of curated EHR-derived data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework

Published

June 2025

Citation

Estevez M, Singh N, Dyson L, et al. Ensuring Reliability of Curated EHR-Derived Data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework. arXiv. 2025. https://www.arxiv.org/abs/2506.08231

Overview

Artificial intelligence (AI) and large language models (LLMs) have greatly expanded the ability to extract clinical data from electronic health records (EHRs) at scale. However, the very features that make LLMs so powerful also introduce new complexities and risks. This publication introduces the first comprehensive framework for evaluating the quality of LLM-extracted real-world oncology data, offering a rigorous yet practical way to ensure the accuracy and reliability of real-world data for research.

The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) framework is built on three key pillars:

Variable-Level Performance Metrics that benchmark LLM performance against expert human abstraction
Automated Verification Checks that systematically identify internal inconsistencies and implausibilities in the data
Replication & Benchmark Analyses that compare LLM-derived results with established clinical findings from human-abstracted or external data sets
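
As a rough illustration of the first two pillars, the sketch below computes precision, recall, and F1 for a single variable against human abstraction, and flags records with an implausible date ordering. It is a minimal sketch under stated assumptions, not the framework's implementation: the function names, the record fields (`patient_id`, `diagnosis_date`, `death_date`), and the death-before-diagnosis rule are all hypothetical and chosen only for illustration.

```python
from datetime import date


def variable_level_metrics(llm_values, human_values, positive):
    """Precision, recall, and F1 for one variable, treating human
    abstraction as the reference standard (an assumption of this sketch)."""
    if len(llm_values) != len(human_values):
        raise ValueError("LLM and human value lists must be the same length")

    pairs = list(zip(llm_values, human_values))
    tp = sum(1 for llm, human in pairs if llm == positive and human == positive)
    fp = sum(1 for llm, human in pairs if llm == positive and human != positive)
    fn = sum(1 for llm, human in pairs if llm != positive and human == positive)

    # Guard against division by zero when a class is absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def find_implausible_records(records):
    """Return patient ids whose death date precedes their diagnosis date.

    This single rule is a hypothetical example of an internal
    consistency check; a real pipeline would apply many such rules.
    """
    flagged = []
    for record in records:
        death = record.get("death_date")
        if death is not None and death < record["diagnosis_date"]:
            flagged.append(record["patient_id"])
    return flagged


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    llm = ["pos", "pos", "neg", "neg", "pos"]
    human = ["pos", "neg", "neg", "pos", "pos"]
    print(variable_level_metrics(llm, human, positive="pos"))

    records = [
        {"patient_id": "A", "diagnosis_date": date(2020, 1, 1), "death_date": date(2021, 1, 1)},
        {"patient_id": "B", "diagnosis_date": date(2020, 5, 1), "death_date": date(2019, 5, 1)},
        {"patient_id": "C", "diagnosis_date": date(2021, 3, 1), "death_date": None},
    ]
    print(find_implausible_records(records))
```

Keeping the metric function per variable mirrors the idea of variable-level benchmarking, so each extracted field can be assessed and improved separately; the consistency check needs no human labels at all, which is why it can run over the full extracted data set.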

Why this matters

As LLMs are used more widely in real-world oncology research, it is essential to ensure that efficiency gains do not compromise data quality or patient representation. The VALID framework provides actionable tools for rigorously evaluating LLM-extracted data and for generating insights that continuously improve it. By not only quantifying accuracy but also identifying biases, inconsistencies, and areas for model improvement, this approach supports responsible AI adoption, helping to maintain scientific integrity and foster equitable, trustworthy evidence generation in cancer research.

More publications

ESMO AI & Digital Oncology

November 2025

A framework for evaluating performance of LLM-based extraction from the electronic health record across different healthcare systems

Seidl-Rathkopf K, Schwarz A, Viani N, et al.

ESMO AI & Digital Oncology

November 2025

Survival prediction in advanced NSCLC (aNSCLC) amid evolving standards of care (SOC): Digital twin modeling incorporating LLM-extracted clinical context

Estevez M, Griffith S, Williams T, et al.

ESMO AI & Digital Oncology

November 2025

A pan-tumor and pan-country approach to LLM-based extraction of systemic therapies from the electronic health record

Viani N, Groizard L, Harrison K, et al.