Unlocking the patient journey: Comparing machine learning (ML) based natural language processing (NLP) and expert abstraction in understanding treatment patterns

Published

August 2023

Citation

Kalesinskas L, Benedum C, Fidyk E, Nemeth S, Cohen A, Estevez M. Unlocking the patient journey: Comparing machine learning (ML) based natural language processing (NLP) and expert abstraction in understanding treatment patterns. Poster presented at: ICPE 2023; August 23-27, 2023; Halifax, Nova Scotia, Canada. Accessed August 10, 2023. https://cdmcd.co/vanqGB

Summary

Understanding the utilization and effectiveness of oral cancer therapies as recorded in electronic health records is crucial for identifying unmet needs and advancing drug development. However, manual abstraction of this information from unstructured data by clinical experts is time-consuming and resource-intensive.

In this study, researchers investigated how two approaches to data curation, expert abstraction and machine learning (ML) based NLP (ML-extraction), affect the ability to accurately measure patient characteristics and capture real-world treatment patterns.
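As a rough illustration of what comparing the two curation approaches can look like for a single yes/no variable (for example, whether a patient received a given oral therapy), the sketch below scores ML-extracted values against expert-abstracted values used as the reference. The summary does not say which metrics the poster reports; sensitivity and positive predictive value (PPV) are assumed here, and the function name, class name, and toy data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Agreement:
    sensitivity: float  # share of abstracted positives that ML-extraction also found
    ppv: float          # share of ML-extracted positives that abstraction confirms


def compare_to_abstraction(abstracted: list[bool], extracted: list[bool]) -> Agreement:
    """Score ML-extracted flags against expert-abstracted flags (assumed reference)."""
    if len(abstracted) != len(extracted):
        raise ValueError("both lists must cover the same patients in the same order")

    pairs = list(zip(abstracted, extracted))
    true_pos = sum(1 for a, e in pairs if a and e)
    false_neg = sum(1 for a, e in pairs if a and not e)
    false_pos = sum(1 for a, e in pairs if e and not a)

    # Return NaN when a denominator is zero rather than raising.
    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    return Agreement(sensitivity=sensitivity, ppv=ppv)


# Toy data for six hypothetical patients (not from the study).
abstracted_flags = [True, True, True, False, False, True]
extracted_flags = [True, True, False, False, True, True]

result = compare_to_abstraction(abstracted_flags, extracted_flags)
print(result)
```

Variable-level agreement of this kind is only one part of such a comparison; the study's stated interest is whether downstream analyses of treatment patterns give comparable results under either curation approach.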

Why this matters

This study aimed to demonstrate that employing ML-extracted variables on expert-abstracted oncology data can yield results in downstream analysis comparable to those obtained by using manually abstracted variables directly. If the outcomes are similar, this approach has the potential to enable a comprehensive understanding of drug utilization and patient treatment profiles on a larger scale.

More publications

AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning

July 2025

Using large language models for scalable extraction of real-world progression events across multiple cancer types

Cohen A, Krismer K, Magee K, et al.

arXiv

June 2025

Ensuring reliability of curated EHR-derived data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework

Estevez M, Singh N, Dyson L, et al.

ASCO Annual Meeting

May 2025

Concordance of response-based clinical trial and machine learning–generated real-world end points

Zhang Q, Krismer K, Lu Y, et al.