
Ensuring reliability of curated EHR-derived data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework

Published

June 2025

Citation

Estevez M, Singh N, Dyson L, et al. Ensuring Reliability of Curated EHR-Derived Data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework. arXiv. 2025. https://www.arxiv.org/abs/2506.08231

Overview

Artificial intelligence (AI) and large language models (LLMs) have revolutionized the ability to extract clinical data from electronic health records at scale. However, the very features that make LLMs so powerful also introduce new complexities and risks. This publication introduces the first comprehensive framework for evaluating the quality of LLM-extracted real-world oncology data, offering a rigorous yet practical way to ensure the accuracy and reliability of real-world data for research.

The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) framework is built on three key pillars:

  • Variable-Level Performance Metrics that benchmark LLM performance against expert human abstraction
  • Automated Verification Checks that systematically identify internal inconsistencies and implausibilities in the data
  • Replication & Benchmark Analyses that compare LLM-derived results with established clinical findings from human-abstracted or external data sets
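
As a rough illustration of the first two pillars, the sketch below scores one LLM-extracted variable against human abstraction and runs a single plausibility check. It is a minimal example under stated assumptions, not the implementation described in the publication: the function names, the record fields (`patient_id`, `diagnosis_date`, `death_date`), the choice of precision, recall, and F1 as metrics, and the sample data are all hypothetical.

```python
from datetime import date


def precision_recall_f1(llm_labels, human_labels, positive):
    """Score LLM-extracted labels for one variable against human abstraction.

    Human-abstracted labels are treated as the reference standard here,
    which is an assumption of this sketch.
    """
    pairs = list(zip(llm_labels, human_labels))
    tp = sum(1 for llm, ref in pairs if llm == positive and ref == positive)
    fp = sum(1 for llm, ref in pairs if llm == positive and ref != positive)
    fn = sum(1 for llm, ref in pairs if llm != positive and ref == positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


def find_implausible_dates(records):
    """Return IDs of records whose diagnosis date falls after the death date.

    One hypothetical example of an automated internal-consistency check.
    """
    return [
        r["patient_id"]
        for r in records
        if r.get("death_date") is not None
        and r["diagnosis_date"] > r["death_date"]
    ]


# Hypothetical labels for a single variable (e.g. a biomarker status).
llm = ["positive", "negative", "positive", "positive", "negative"]
human = ["positive", "negative", "negative", "positive", "positive"]
precision, recall, f1 = precision_recall_f1(llm, human, positive="positive")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# Hypothetical extracted records; the second one is internally inconsistent.
records = [
    {"patient_id": "A", "diagnosis_date": date(2020, 1, 5),
     "death_date": date(2022, 3, 1)},
    {"patient_id": "B", "diagnosis_date": date(2021, 6, 1),
     "death_date": date(2020, 12, 31)},
    {"patient_id": "C", "diagnosis_date": date(2019, 9, 9),
     "death_date": None},
]
print(find_implausible_dates(records))
```

In practice such metrics would be computed per variable and, where bias is a concern, per patient subgroup, with flagged records routed for review; those design choices are likewise assumptions of this sketch.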

Why this matters

As LLMs are increasingly used in real-world oncology research, it is essential to ensure that efficiency gains do not compromise data quality or patient representation. The VALID framework provides actionable tools to rigorously evaluate LLM-extracted data and to generate insights for continuously improving it. By not only quantifying accuracy but also identifying biases, inconsistencies, and areas for model improvement, this approach supports responsible AI adoption, helping maintain scientific integrity and fostering equitable, trustworthy evidence generation in cancer research.
