It's remarkable to see how far we’ve come in cancer research and oncology practice over the past 25 years. Historically, physicians managed cancer with a one-size-fits-all approach based largely on the tumor's organ of origin, histology and disease stage. But one size does not fit all and that approach fails to benefit many patients. Thanks to the introduction of molecular profiling at the turn of the 21st century, we’ve made tremendous progress in our understanding of the complex biology of cancer and witnessed a revolution in patient care.1 Studies have shown that for many tumor types, precision oncology — the customization of a patient’s anti-cancer therapy based on the molecular characteristics of their cancer — generally results in superior outcomes compared to non-matched therapy.2
Nevertheless, we still have a great deal to learn about the complex and dynamic processes driving tumor progression and treatment resistance. This understanding is essential to meet the lofty promises of precision medicine: target the right drug with the right dose to the right patient at the right time.3 Many questions remain, such as what new targets should be pursued in therapeutic development; which treatment a given patient should receive and at what dose; how therapies should be sequenced; and when a potentially toxic therapy should be replaced by disease monitoring.
Data from basic lab research, pre-clinical studies and clinical trials are undoubtedly critical to our ongoing fight against cancer and to advancing our understanding of the relationship between tumor biology and outcomes. However, high-quality data from the real-world setting (i.e., outside of the clinical trial setting) are also essential in this fight, both to ensure that different types of patients, biologies and treatment contexts are fairly represented in oncology research and to uncover novel insights into the biology behind clinical outcomes among large and diverse study populations.
Let’s review how recent advances in biomarker testing are boosting the type and quality of profiling data available in real-world datasets, and how researchers are using those rich datasets to address previously unaddressable research questions.
Biomarker testing technology
It’s impossible to think of precision medicine without considering the technological breakthrough that was next-generation sequencing (NGS) in the early 2000s and its refinements over the past twenty years.4 By dramatically reducing the time and cost involved in the sequencing process, NGS has made it possible, for example, to execute whole-genome sequencing to detect genetic variants responsible for a particular disease, conduct whole transcriptome sequencing to detect changes in the sequence or expression level of RNA associated with a diagnosis, and, in combination with methods such as chromatin immunoprecipitation, map DNA-protein interactions to study disease progression over time in a research setting.
Today, physicians can order and analyze complete genetic and molecular profiles for their patients via comprehensive “omic” approaches (e.g., whole genome, exome, transcriptome, proteome, and metabolome). Matched normal-tumor sequencing can be used to distinguish germline (inherited) from somatic (acquired) mutations.5 Liquid biopsies, which use blood samples instead of tumor tissue samples, make it possible to measure levels of circulating tumor DNA in order to track response and resistance to therapy over time, monitor clonal evolution during cancer metastasis, and quickly recognize signs of recurrence or residual disease, all with a much less invasive method of sample collection. Looking ahead, the increased integration of methods such as ultrarapid nanopore genome sequencing into clinical care will allow for the routine identification of pathogenic variants in a matter of hours, greatly enhancing clinical management at critical times in the progression of the disease.6
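To make the germline-versus-somatic distinction concrete, here is a deliberately simplified sketch of the idea behind matched normal-tumor comparison: a variant seen in both the tumor and the matched normal sample is treated as germline, while one seen only in the tumor is treated as somatic. The Variant type, the classify_variants function, and the coordinates are all hypothetical and for illustration only; real pipelines also account for allele fractions, sequencing depth, tumor purity, and contamination.

```python
from typing import Dict, Iterable, NamedTuple


class Variant(NamedTuple):
    """A hypothetical, minimal representation of a called variant."""
    chrom: str
    pos: int
    ref: str
    alt: str


def classify_variants(
    tumor: Iterable[Variant], normal: Iterable[Variant]
) -> Dict[Variant, str]:
    """Label each tumor variant by whether the matched normal also carries it."""
    normal_set = set(normal)
    return {
        v: ("germline" if v in normal_set else "somatic")
        for v in tumor
    }


# Made-up coordinates, not real variants.
tumor_calls = [
    Variant("chr1", 1000, "C", "T"),
    Variant("chr2", 2000, "G", "A"),
]
normal_calls = [Variant("chr2", 2000, "G", "A")]

labels = classify_variants(tumor_calls, normal_calls)
for variant, label in labels.items():
    print(variant.chrom, variant.pos, label)
```

In this toy example the chr1 variant appears only in the tumor and is labeled somatic, while the chr2 variant is shared with the normal sample and is labeled germline.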
Gaps in knowledge and biomarker testing utilization
Despite those advances, and even though most patients with recurrent, relapsed, refractory, metastatic, or advanced cancer today can expect their health insurance plan or Medicare to cover the expense, the comprehensive molecular characterization of patients’ tumors is still not universally implemented in routine clinical practice. For instance, studies have shown that the majority of patients with advanced non-small cell lung cancer treated in the community setting do not receive all biomarker testing recommended by national practice guidelines.7 Barriers may include reimbursement and financial concerns, lack of education or awareness about test utility and approvals, lack of access to decision support tools, insufficient tissue, or long turnaround time from the ordering of a test to receiving results.8 Persistent racial and ethnic inequities have also been well documented at the practice and physician levels.9 Nevertheless, there is a powerful and rapidly growing body of tumor molecular profiling data coming from the real-world setting that will help to fuel advances in precision oncology.
While many of the more common actionable targets in oncology, or ‘low-hanging fruit,’ have already been explored in drug development and contributed to enormous progress in the treatment landscape, there are still large unmet needs and millions of patients who remain unresponsive to currently available therapeutic options.
There is an urgent need to identify additional actionable targets and novel therapeutic approaches that require thinking outside the box and leveraging novel research approaches. One of those novel approaches is the use of linked real-world clinico-multiomic datasets.
Considerations for research involving real-world clinico-multiomic datasets
When designing research involving real-world molecular profiling data, will any real-world dataset do? Unfortunately, many existing real-world, clinico-multiomic datasets are riddled with issues that limit their power to address research objectives with sufficient scientific rigor. For example, they may have missing or incomplete longitudinal clinical data (required to truly understand the clinical context around tumor biology), questionable data quality, or small sample sizes of patients with complete clinical and molecular data; they may fail to represent the broader real-world patient population; or they may be non-contemporaneous.
Importantly, a real-world clinico-multiomics dataset should have sufficient numbers of patients with all the required molecular data (e.g., whole transcriptome sequencing) linked to complete and high-quality clinical data, with coverage across all the key prognostic factors and validated clinical endpoints necessary to address the research question at hand. Adequate data recency is also critical to explore the relationship between tumor biology and the current standard of care treatment. Follow-up time should be taken into account in any analysis of clinical outcomes. Additionally, the dataset should be representative to avoid biased analyses and yield insights that closely align with the population of interest. Ideally, a real-world dataset can be audited, pressure-tested, and refined via access to source materials (e.g., unstructured data).
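As a rough illustration of how the criteria above might be screened before committing to an analysis, the following sketch counts the patients who have both the required molecular data and complete clinical data, were diagnosed recently enough, and have enough follow-up. Everything here is an assumption made for the example: the PatientRecord fields, the assess_fitness function, and the thresholds are hypothetical and do not reflect the schema of any particular dataset.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class PatientRecord:
    """Hypothetical per-patient summary; field names are illustrative."""
    patient_id: str
    has_required_molecular: bool   # e.g., whole transcriptome sequencing
    has_complete_clinical: bool    # key prognostic factors and endpoints
    diagnosis_date: date
    followup_months: float
    practice_type: str             # "community" or "academic"


def assess_fitness(
    records: List[PatientRecord],
    min_patients: int,
    earliest_diagnosis: date,
    min_followup_months: float,
) -> Dict[str, object]:
    """Summarize how many records satisfy all screening criteria."""
    eligible = [
        r for r in records
        if r.has_required_molecular
        and r.has_complete_clinical
        and r.diagnosis_date >= earliest_diagnosis      # data recency
        and r.followup_months >= min_followup_months    # follow-up time
    ]
    n = len(eligible)
    community = sum(r.practice_type == "community" for r in eligible)
    return {
        "eligible": n,
        "enough_patients": n >= min_patients,
        # Crude representativeness signal: share of community-treated patients.
        "community_share": community / n if n else 0.0,
    }


records = [
    PatientRecord("P1", True, True, date(2021, 3, 1), 18, "community"),
    PatientRecord("P2", True, False, date(2022, 6, 1), 12, "community"),
    PatientRecord("P3", True, True, date(2015, 1, 1), 40, "academic"),
    PatientRecord("P4", True, True, date(2020, 9, 15), 7, "academic"),
]

report = assess_fitness(
    records,
    min_patients=2,
    earliest_diagnosis=date(2019, 1, 1),
    min_followup_months=6,
)
print(report)
```

In this toy cohort, P2 is dropped for incomplete clinical data and P3 for being diagnosed before the recency cutoff, leaving two eligible patients. A real assessment would compare the eligible cohort against the target population on many more dimensions than practice type.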
The Flatiron-Caris Clinical-Molecular Database (CMDB)
The Clinical-Molecular Database (CMDB), a joint project between Flatiron Health and Caris Life Sciences®, was built to address the shortcomings of existing real-world datasets and unlock research use cases that previously could not be investigated. It’s a large, modern patient-level dataset that combines the depth, quality, and completeness of Flatiron’s clinical data with detailed molecular data from Caris for tens of thousands of patients. It’s representative of real-life practice (77% community oncology vs. 17% academic medical centers10) and growing at a rate of a thousand new patients every month.
The Flatiron side of the dataset includes critical patient demographics and clinical characteristics, treatment information, and predictive and prognostic factors over the course of the patient journey. It includes validated, published, and peer-reviewed real-world outcomes based on longitudinal patient records, and data that are traceable to the point of care. The Caris side of the dataset includes whole exome sequencing and a 592-gene panel, whole transcriptome sequencing, and digital pathology imaging data (e.g., H&E and IHC).
Researchers can use the CMDB for a wide variety of research objectives, such as identifying novel targets; profiling target prevalence across indications and uncovering associated clinical and biological features; target product profile refinement; uncovering novel mechanisms of response or resistance to a therapy; exploring the biology associated with certain adverse events; or identifying potential new predictive or prognostic biomarkers. The data can be mined using artificial intelligence-based approaches, such as machine learning, to improve prediction of outcomes. And it can be used not just for comprehensive hypothesis generation but also to validate hypotheses already being evaluated in preclinical and clinical research.
What sets the CMDB apart from other datasets is that it lets researchers explore the relationships among clinical endpoints that are often challenging to obtain, the mutational landscape, patterns of gene expression, and protein expression, all in the context of the wider population of patients who are managed in the real world or who would benefit from a new therapeutic asset.
By leveraging the CMDB at every stage of the drug development life cycle, researchers can fuel and accelerate the development of innovative, practice-changing therapeutic approaches.
A new era in precision oncology research
It’s time to capitalize on recent progress in biomarker testing and enhanced data collection capabilities at the point of care to inspire the development of new precision medicine therapies and meet the needs of today’s cancer patients in all their diversity.
That can only be accomplished with a best-in-class real-world dataset that represents the population of interest, covers the whole patient journey, contains recent data, and is large enough to identify rare sub-cohorts and provide adequate statistical power.
To learn more about Flatiron or the Clinical-Molecular Database, please contact us.
Join us for our upcoming panel discussion on October 10th, 2024: Accelerating Precision Oncology Research With Linked Clinical and Multiomic Real World Data, featuring experts from Flatiron Health, Caris Life Sciences, and Penn Medicine.
1. Doroshow DB, Doroshow JH. Genomics and the History of Precision Oncology. Surg Oncol Clin N Am. 2020 Jan;29(1):35-49
2. Tsimberidou AM, Fountzilas E, Nikanjam M, Kurzrock R. Review of precision cancer medicine: Evolution of the treatment paradigm. Cancer Treat Rev. 2020 Jun;86:102019
3. FDA - Precision Medicine
4. Akintunde O, Tucker T, Carabetta VJ. The evolution of next-generation sequencing technologies. ArXiv [Preprint]. 2023 May 15
5. Mandelker D, Ceyhan-Birsoy O. Evolving Significance of Tumor-Normal Sequencing in Cancer Care. Trends Cancer. 2020 Jan
6. Gorzynski JE et al. Ultrarapid Nanopore Genome Sequencing in a Critical Care Setting. N Engl J Med. 2022;386(7)
7. Mason C et al. Patterns of Biomarker Testing Rates and Appropriate Use of Targeted Therapy in the First-Line, Metastatic Non-Small Cell Lung Cancer Treatment Setting. J Clin Pathw. 2018 Jan-Feb
8. Adekunle AD, Coombs S, Fritz CDL. Opportunities for Improving System-Level Barriers to Biomarker Testing for Metastatic Colorectal Cancer. JAMA Netw Open. 2024
9. Vidal GA et al. Racial and ethnic inequities at the practice and physician levels in timely next-generation sequencing for patients with advanced non–small-cell lung cancer treated in the US community setting. JCO Oncol Pract. 2024 Jan 9;20(3)
10. Remaining 6% of patients are from both community oncology and academic medical centers