Structuring GDPR-compliant private networks to enable LLM-Extracted oncology data on pseudonymized patient EHR data in Europe

Published

November 2025

Citation

Ellsworth L, Groizard L, Stefan F, et al. Structuring GDPR-compliant private networks to enable LLM-Extracted oncology data on pseudonymized patient EHR data in Europe. ESMO AI & Digital Oncology. 2025.

Overview

The growing use of real-world data in global oncology requires scalable, high-quality curation of electronic health records (EHRs). Traditional manual data extraction is resource-intensive, while fully automated methods may lack precision.

This study introduces a GDPR-compliant hybrid abstraction platform that combines large language models (LLMs) with expert human review. Under a secure "lock box" approach, patient data is pseudonymized and processed within private networks to comply with privacy regulations. LLMs extract clinical variables from unstructured documents, and human experts validate the outputs, all within a closed, secure environment.
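The flow described above (pseudonymize, extract, then validate by a human) can be illustrated with a minimal Python sketch. It is not the platform's implementation: the summary does not say how pseudonymization or extraction is done, so the keyed-hash (HMAC-SHA256) pseudonym, the regex standing in for the LLM call, and every name here (`pseudonymize`, `extract_stage`, `process`, `review`, `Extraction`) are assumptions made for illustration only.

```python
import hashlib
import hmac
import re
from dataclasses import dataclass
from typing import Optional


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Replace a patient identifier with a keyed hash (assumed scheme)."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


@dataclass
class Extraction:
    pseudonym: str
    variable: str
    value: Optional[str]
    # Every model output starts unvalidated and waits for a human reviewer.
    status: str = "pending_review"


def extract_stage(note: str) -> Optional[str]:
    """Hypothetical stand-in for the LLM call: find a cancer stage by regex."""
    match = re.search(r"\bstage\s+(IV|III|II|I)\b", note, flags=re.IGNORECASE)
    return match.group(1).upper() if match else None


def process(patient_id: str, note: str, secret_key: bytes) -> Extraction:
    """Pseudonymize first, then extract; the raw ID is never stored."""
    return Extraction(
        pseudonym=pseudonymize(patient_id, secret_key),
        variable="stage",
        value=extract_stage(note),
    )


def review(item: Extraction, approved: bool,
           corrected_value: Optional[str] = None) -> Extraction:
    """Human validation step: approve the value or record a correction."""
    if approved:
        return Extraction(item.pseudonym, item.variable, item.value, "validated")
    return Extraction(item.pseudonym, item.variable, corrected_value, "corrected")


if __name__ == "__main__":
    key = b"demo-key-kept-inside-the-private-network"  # illustrative only
    item = process("MRN-0001", "Diagnosed with stage III colon cancer.", key)
    print(item)
    print(review(item, approved=True))
```

In this sketch the key never leaves the closed environment, reflecting the general GDPR point that pseudonymized data is still personal data and that the information needed to re-identify a patient must be kept separately.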

Why this matters

This approach addresses two challenges in multinational oncology research: data privacy and extraction accuracy. By pairing LLM extraction with human oversight, the platform aims for high-quality data extraction while maintaining stringent privacy standards, which makes real-world data more reliable and supports its use in regulatory and research contexts. Because processing stays inside a secure, private network, researchers can use cutting-edge LLM tools without compromising patient confidentiality. This paves the way for robust, compliant multinational oncology research, with the aim of improving data-driven insights and, ultimately, patient care worldwide.