From data to decisions: What “regulatory-grade” real-world evidence actually requires

Published

March 2026

By

Kate Estep, Chief Product Officer

At the Precision Medicine World Conference (PMWC) earlier this month, I joined colleagues from academia and industry to discuss a question that’s becoming increasingly central to oncology research: what does it take to translate real-world data into evidence and intelligence that can meaningfully inform decisions?

During the conversation, we reflected on how often the term “regulatory-grade real-world evidence” gets tossed around in our industry, and how it is typically positioned as a static label permanently attached to a dataset. In practice, real-world evidence supports a wide range of use cases across the product lifecycle, from understanding disease patterns and treatment pathways before launch to supporting post-launch safety monitoring and effectiveness studies, not just high-profile applications like external control arms.

In reality, it doesn’t work that way. A dataset that meets the “regulatory-grade” definition for one question may not meet it for another. It is a fit-for-purpose standard that must be earned for each individual research question.

So, what does it actually take to earn that grade?

Relevance and Reliability

We discussed how that "grade" is built on two foundational pillars:

  • Relevance: A dataset is only as good as its ability to reflect the current clinical reality, including the specific biomarkers, the latest treatment pathways, and the nuances of the patient journey. From a regulatory perspective, this means the data source must capture key variables (both their presence and their absence) and ensure a sufficient number of patients with adequate follow-up time. If you don't have the runway to see how a patient actually fared, the data can't tell the full story.

  • Reliability: This looks at variable completeness and accuracy alongside traceability and provenance. If a regulator asks why a patient was classified a certain way, we need to be able to show our work, tracing that specific conclusion all the way back to the original source.

Extraction Is a Forensic Investigation at Scale

Establishing these pillars is only half the battle. The real differentiator, and the biggest hurdle in our industry, is how we actually get that data out of the system.

Often, we use the term "data extraction." To me, that phrase implies a technical cleanup exercise. In reality, this work is more like a clinical and forensic investigation. An EHR isn't a research database; it’s a tool built for doctors to treat patients. The way care is documented differs wildly across care settings, cities, and even where a physician went to medical school. Harmonizing these distinct "clinical dialects," or creating standards for pathology endpoints that don’t yet exist, is the hard, unglamorous work of turning data into decision-grade evidence.

As researchers often say, absence of evidence is not evidence of absence. If a scan result doesn’t appear in the EHR, it doesn’t necessarily mean it never happened; it may have occurred at another facility. Addressing these gaps requires "data empathy" and a deep understanding of the real environment where the data was first captured.

This is where technology becomes the true differentiator. Today, our curation methods are so advanced that the question is no longer whether we can extract something from the chart, but whether the signal exists in the data source in the first place. We have moved past manual digging. By applying data empathy through advanced technology and AI-enabled curation in the hands of clinical and data experts, we can now triangulate signals across the record and identify corroborating evidence at a scale that was previously impossible. We aren't just extracting the past; we are building a high-fidelity map of the patient experience. This is how we can ensure that the lived experience of patients meaningfully shapes the future of research and care.

Building Relationships to Evolve the Field

Even the best tech and the cleanest data in the world won’t move the needle if they exist in a vacuum. Building trust in real-world evidence requires radical transparency with regulators, researchers, and clinicians. It requires being open about our methods, our limitations, and our study designs. True engagement means we aren't just handing over a report; we’re participating in a continuously learning ecosystem. It's human collaboration between those who generate the data and those who use the evidence that turns a dataset into a decision-informing asset.

Looking Ahead: Closing the Gap

As I step back into my daily work after PMWC, I’m left with one central reflection: we must close the gap between the point of care and research.

Today, we spend enormous energy looking backward, retroactively trying to structure information recorded months or years ago. But imagine a future where research-ready data is the natural byproduct of great care. Imagine an EHR that acts as a clinical co-pilot, where data captured during a routine visit feeds back insights to the physician in real time, without adding burden, while simultaneously helping the next researcher answer a life-saving question.

Ultimately, that is the goal: transforming data into the insights that enable faster, smarter decisions for the people who matter most: the patients.
