
Japanese language EHR LLM extraction of longitudinal unstructured ECOG performance status

Published

March 2026

Citation

Adamson B, Zhang Y, Chamby A, et al. Japanese language EHR LLM extraction of longitudinal unstructured ECOG performance status. JSMO Annual Meeting. 2026.

Overview

Accurately capturing patient performance status is essential for oncology research and clinical decision-making. In Japan, however, Eastern Cooperative Oncology Group (ECOG) performance status is typically recorded only in unstructured clinical notes, which makes data collection challenging and resource-intensive.

This study evaluated whether large language models (LLMs) could automatically extract ECOG performance status from Japanese electronic health records (EHRs), potentially enabling scalable real-world data curation. Compared with manual human abstraction, the LLM achieved 100% sensitivity and 100% precision in identifying performance status values. Notably, among records where human reviewers missed the information, the model identified a performance status in approximately 50% of cases. The computational cost of the automated approach was estimated at less than 5% of the cost of traditional manual data abstraction.
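For readers less familiar with these metrics, the sketch below shows one common way to compute sensitivity and precision when LLM output is compared against human abstraction. It is illustrative only: the function name, the scoring convention, and the toy values are assumptions, not the study's method or data. In particular, the summary does not say how records were scored when the model found a value that human reviewers missed; this sketch counts them as false positives unless the reference is updated after adjudication.

```python
from typing import Optional, Sequence, Tuple


def sensitivity_and_precision(
    reference: Sequence[Optional[int]],
    predicted: Sequence[Optional[int]],
) -> Tuple[float, float]:
    """Compare LLM-extracted ECOG values with human-abstracted ones.

    Each element is an ECOG value (0-4) or None when no value was recorded.
    Hypothetical convention: a prediction is a true positive only if it
    exactly matches a non-missing reference value.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    tp = fp = fn = 0
    for ref, pred in zip(reference, predicted):
        if pred is not None and pred == ref:
            tp += 1  # model value agrees with the human value
        else:
            if pred is not None:
                fp += 1  # model value is wrong or has no reference value
            if ref is not None:
                fn += 1  # human value was not recovered by the model

    # Return NaN when a metric is undefined (no positives to evaluate).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, precision


# Toy data, invented purely for illustration (not from the study).
human = [1, 0, 2, None]
model = [1, 0, None, None]
sens, prec = sensitivity_and_precision(human, model)
```

With these toy inputs the model recovers two of the three human-recorded values and makes no incorrect extractions, so sensitivity is 2/3 and precision is 1.0.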

Why this matters

This research demonstrates that LLM-based extraction of ECOG performance status from Japanese clinical notes is both highly accurate and cost-effective. By automating this process, researchers can scale real-world data collection across Japan more efficiently, enabling richer longitudinal datasets for cancer research. Future advances in this approach in Japan will build on Flatiron’s focused efforts to validate LLM-extracted real-world data globally, including publications like the VALID Framework, the industry's first comprehensive approach to evaluating AI-extracted real-world data. Finally, this work supports international research collaboration by creating consistent, harmonized clinical data across countries, ultimately improving our ability to conduct meaningful real-world evidence studies that inform treatment decisions for cancer patients globally.
