Assessing quality of an LLM-derived prostate cancer (PC) real-world dataset: an application of the validation of accuracy for LLM/ML-extracted information and data (VALID) framework

Published

April 2026

Citation

Ward PJ, Qian Y, Hankinson EA, Dolor A, Estevez M, et al. Assessing quality of a large language model (LLM)-derived prostate cancer (PC) real-world dataset: an application of the validation of accuracy for LLM/ML-extracted information and data (VALID) framework. ISPOR. 2026. https://www.ispor.org/heor-resources/presentations-database/presentation-cti/ispor-2026/poster-session-2-4/assessing-quality-of-a-large-language-model-llm-derived-prostate-cancer-pc-real-world-dataset-an-application-of-the-validation-of-accuracy-for-llm-ml-extracted-information-and-data-valid-framework

Overview

Large language models (LLMs) are increasingly used to extract clinical information from electronic health records, offering a scalable alternative to manual data abstraction. However, ensuring the accuracy and reliability of LLM-derived data is essential before it can be used in research.

In this study, researchers applied the VALID framework—a comprehensive approach to evaluating data quality—to a large LLM-derived prostate cancer dataset. They compared LLM-extracted variables to human-abstracted data, conducted consistency checks, and replicated key clinical outcomes. The results showed that LLM performance was very similar to manual abstraction, with only small differences in accuracy and highly consistent survival estimates across datasets.
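As a rough illustration of the first step described above, comparing LLM-extracted variables to human-abstracted data, the following Python sketch scores one binary variable against a human reference. It is not the study's code: the function name `compare_to_reference`, the `Agreement` container, and the example values are all hypothetical, and the choice of precision, recall, F1, and accuracy as metrics is an assumption about what such a comparison might report.

```python
from dataclasses import dataclass


@dataclass
class Agreement:
    """Agreement metrics for one variable (hypothetical container)."""
    precision: float
    recall: float
    f1: float
    accuracy: float


def compare_to_reference(llm_values, human_values, positive=True):
    """Score LLM-extracted values against human-abstracted reference values.

    Both inputs are equal-length sequences, one entry per patient, in the
    same patient order. `positive` is the value treated as the positive class.
    """
    if len(llm_values) != len(human_values):
        raise ValueError("inputs must have the same length")
    if not llm_values:
        raise ValueError("inputs must not be empty")

    tp = fp = fn = tn = 0
    for llm, human in zip(llm_values, human_values):
        if llm == positive and human == positive:
            tp += 1
        elif llm == positive:
            fp += 1
        elif human == positive:
            fn += 1
        else:
            tn += 1

    # Fall back to 0.0 when a denominator is zero (no positives present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(llm_values)
    return Agreement(precision, recall, f1, accuracy)


# Made-up example: whether a given finding was recorded for five patients.
llm = [True, True, False, False, True]
human = [True, False, False, True, True]
result = compare_to_reference(llm, human)
print(result)
```

In practice the same comparison would be repeated per variable, and the human-abstracted values are themselves imperfect, so a fair benchmark would also consider how closely two human abstractors agree with each other.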

Why this matters

This study demonstrates that LLMs can generate high-quality real-world data suitable for research when rigorously validated. By enabling scalable data extraction without sacrificing accuracy, LLMs can help expand the scope and speed of real-world evidence generation in oncology.
