Overview
Large language models (LLMs) are a promising tool for curating real-world data from electronic health records (EHRs), with the potential to accurately and efficiently extract nuanced clinical details from the chart, such as biomarker results. Extracting PD-L1 testing details is particularly important given their relevance across cancer types and their role in treatment decisions, but it is also challenging. The way results are reported varies by cancer type (e.g., tumor proportion score vs. combined positive score), and documentation formats have changed over time, affecting how results are recorded in clinical charts.
This study evaluated whether pre-trained LLMs could accurately extract PD-L1 biomarker testing details from the EHR and whether fine-tuning them on high-quality labeled data would improve extraction accuracy. Researchers found that fine-tuned LLMs consistently outperformed the zero-shot approach (no fine-tuning), achieving high accuracy across ten cancer types and multiple biomarker details, including test and result dates, cell type, and percent staining. Notably, fine-tuned LLMs outperformed a deep-learning model trained on over 10,000 examples, despite using far fewer labeled examples (500–1,500). This finding suggests that fine-tuning with a relatively small, high-quality dataset can significantly enhance LLM performance for extracting clinical details from unstructured EHR documents.
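To make the idea of accuracy "across multiple biomarker details" concrete, the sketch below scores model extractions against human-labeled records one field at a time. It is an illustration only, not the study's evaluation code: the `PDL1Record` class, the `per_field_accuracy` function, the exact-match scoring rule, and the toy records are all assumptions introduced here, and the values are made up rather than taken from the study.

```python
from dataclasses import dataclass, fields
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PDL1Record:
    """Hypothetical container for the PD-L1 details named in the text."""
    test_date: Optional[str]
    result_date: Optional[str]
    cell_type: Optional[str]
    percent_staining: Optional[str]


def per_field_accuracy(
    predicted: List[PDL1Record], labeled: List[PDL1Record]
) -> Dict[str, float]:
    """Return the share of records where each field exactly matches the label."""
    if len(predicted) != len(labeled):
        raise ValueError("predicted and labeled must have the same length")
    if not labeled:
        raise ValueError("at least one labeled record is required")

    scores: Dict[str, float] = {}
    for field in fields(PDL1Record):
        matches = sum(
            getattr(pred, field.name) == getattr(gold, field.name)
            for pred, gold in zip(predicted, labeled)
        )
        scores[field.name] = matches / len(labeled)
    return scores


# Toy, made-up records used only to exercise the function.
labeled = [
    PDL1Record("2020-01-05", "2020-01-09", "tumor", "50"),
    PDL1Record("2021-03-02", "2021-03-06", "immune", "1"),
]
predicted = [
    PDL1Record("2020-01-05", "2020-01-09", "tumor", "50"),
    PDL1Record("2021-03-02", None, "tumor", "1"),
]

scores = per_field_accuracy(predicted, labeled)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Running the same function on the outputs of a zero-shot model and a fine-tuned model, against one shared labeled set, would give a like-for-like comparison per field. Exact string matching is a deliberately strict choice here; a real evaluation might normalize dates or staining ranges before comparing.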
Why this matters
To fully realize the promise of precision medicine and improve outcomes for patients with cancer, it is critical to accurately extract clinical details, such as PD-L1 results, from unstructured real-world data (RWD). This study demonstrates the potential of LLMs to curate complex biomarker results efficiently and at scale. It also underscores the importance of access to high-quality labeled data, both to fine-tune LLMs for optimal accuracy and to transparently evaluate their performance, particularly given their propensity to hallucinate. Using LLMs to extract data with relevant clinical details at scale will be crucial as we continue to generate evidence to drive more impactful research, improve clinical decision making, and ultimately enable more personalized oncology care for patients.