Tracking early lung cancer metastatic dissemination in TRACERx using ctDNA

Apr 30, 2023

Nature volume 616, pages 553–562 (2023)

Circulating tumour DNA (ctDNA) can be used to detect and profile residual tumour cells persisting after curative-intent therapy¹. The study of large patient cohorts incorporating longitudinal plasma sampling and extended follow-up is required to determine the role of ctDNA as a phylogenetic biomarker of relapse in early-stage non-small-cell lung cancer (NSCLC). Here we developed ctDNA methods tracking a median of 200 mutations identified in resected NSCLC tissue across 1,069 plasma samples collected from 197 patients enrolled in the TRACERx study². A lack of preoperative ctDNA detection distinguished biologically indolent lung adenocarcinoma with good clinical outcome. Postoperative plasma analyses were interpreted within the context of standard-of-care radiological surveillance and administration of cytotoxic adjuvant therapy. Landmark analyses of plasma samples collected within 120 days after surgery revealed ctDNA detection in 25% of patients, including 49% of all patients who experienced clinical relapse; ctDNA surveillance every 3 to 6 months identified impending disease relapse in an additional 20% of landmark-negative patients. We developed a bioinformatic tool (ECLIPSE) for non-invasive tracking of subclonal architecture at low ctDNA levels. ECLIPSE identified patients with polyclonal metastatic dissemination, which was associated with a poor clinical outcome. By measuring subclone cancer cell fractions in preoperative plasma, we found that subclones seeding future metastases were significantly more expanded than non-metastatic subclones. Our findings will support (neo)adjuvant trial advances and provide insights into the process of metastatic dissemination using low-ctDNA-level liquid biopsy.
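To make the landmark idea in the abstract concrete, the following is a minimal Python sketch of how a patient might be classified using plasma samples collected within 120 days after surgery. It is an illustration only, not the study's pipeline: the function name `landmark_status`, the data layout, the rule that any positive sample in the window makes a patient landmark-positive, and the three example patients are all assumptions introduced here.

```python
from datetime import date

LANDMARK_DAYS = 120  # window length quoted in the abstract


def landmark_status(surgery_date, samples, window_days=LANDMARK_DAYS):
    """Classify one patient at the postoperative landmark (hypothetical helper).

    samples: list of (collection_date, ctdna_detected) tuples (assumed layout).
    Returns True or False for ctDNA status within the window, or None when
    no sample was collected inside it.
    """
    in_window = [
        detected
        for collected, detected in samples
        if 0 < (collected - surgery_date).days <= window_days
    ]
    if not in_window:
        return None  # not evaluable at the landmark
    # Assumption: a single positive sample in the window is enough.
    return any(in_window)


# Invented patients, for illustration only.
patients = {
    "P1": (date(2020, 1, 10), [(date(2020, 2, 1), False), (date(2020, 4, 20), True)]),
    "P2": (date(2020, 3, 5), [(date(2020, 4, 1), False), (date(2020, 9, 1), True)]),
    "P3": (date(2020, 5, 1), [(date(2020, 12, 1), False)]),
}

status = {
    pid: landmark_status(surgery, samples)
    for pid, (surgery, samples) in patients.items()
}
```

In this toy data, P2 is landmark-negative but has a positive sample later on, which is the kind of case that longitudinal surveillance, rather than the landmark window, would pick up.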

The cfDNA sequencing files, RNA-seq data and multiregion tumour exome sequencing data (in each case from the TRACERx study) used or analysed during this study have been deposited at the European Genome–phenome Archive (EGA), hosted by The European Bioinformatics Institute (EBI) and the Centre for Genomic Regulation (CRG), under accession codes EGAS00001006494, EGAS00001006517 and EGAS00001006494. The data are under controlled access owing to the nature of the data and commercial partnership arrangements. Details on how to apply for access are available on the linked page.

ECLIPSE is available as an R package that can be installed from GitHub (https://github.com/amf71/ECLIPSE); it is available only for academic, non-commercial research purposes. Code used to produce the figures in this paper is available on request.

Moding, E. J., Nabet, B. Y., Alizadeh, A. A. & Diehn, M. Detecting liquid remnants of solid tumors: circulating tumor DNA minimal residual disease. Cancer Discov. 11, 2968–2986 (2021).

Jamal-Hanjani, M. et al. Tracking the evolution of non-small-cell lung cancer. N. Engl. J. Med. 376, 2109–2121 (2017).

Chabon, J. J. et al. Integrating genomic features for non-invasive early lung cancer detection. Nature 580, 245–251 (2020).

Peng, M. et al. Circulating tumor DNA as a prognostic biomarker in localized non-small cell lung cancer. Front. Oncol. 10, 561598 (2020).

Xia, L. et al. Perioperative ctDNA-based molecular residual disease detection for non-small cell lung cancer: a prospective multicenter cohort study (LUNGCA-1). Clin. Cancer Res. 28, 3308–3317 (2021).

Chaudhuri, A. A. et al. Early detection of molecular residual disease in localized lung cancer by circulating tumor DNA profiling. Cancer Discov. 7, 1394–1403 (2017).

Abbosh, C. et al. Phylogenetic ctDNA analysis depicts early-stage lung cancer evolution. Nature 545, 446–451 (2017).

Gale, D. et al. Residual ctDNA after treatment predicts early relapse in patients with early-stage non-small cell lung cancer. Ann. Oncol. 33, 500–510 (2022).

Zhang, J.-T. et al. Longitudinal undetectable molecular residual disease defines potentially cured population in localized non-small cell lung cancer. Cancer Discov. 12, 1690–1701 (2022).

Powles, T. et al. ctDNA guiding adjuvant immunotherapy in urothelial carcinoma. Nature 595, 432–437 (2021).

Tie, J. et al. Circulating tumor DNA analysis guiding adjuvant therapy in stage II colon cancer. N. Engl. J. Med. 386, 2261–2272 (2022).

Parikh, A. R. et al. Liquid versus tissue biopsy for detecting acquired resistance and tumor heterogeneity in gastrointestinal cancers. Nat. Med. 25, 1415–1421 (2019).

Murtaza, M. et al. Multifocal clonal evolution characterized using circulating tumour DNA in a case of metastatic breast cancer. Nat. Commun. 6, 8760 (2015).

Herberts, C. et al. Deep whole-genome ctDNA chronology of treatment-resistant prostate cancer. Nature 608, 199–208 (2022).

Lung Cancer: Diagnosis and Management NICE Guideline NG122 (NICE, 2019).

Zheng, Z. et al. Anchored multiplex PCR for targeted next-generation sequencing. Nat. Med. 20, 1479–1484 (2014).

Abbosh, C., Birkbak, N. J. & Swanton, C. Early stage NSCLC—challenges to implementing ctDNA-based screening and MRD detection. Nat. Rev. Clin. Oncol. 15, 577–586 (2018).

Newman, A. M. et al. An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat. Med. 20, 548–554 (2014).

Hänzelmann, S., Castelo, R. & Guinney, J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 14, 7 (2013).

Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015).

Biswas, D. et al. A clonal expression biomarker associates with lung cancer mortality. Nat. Med. 25, 1540–1548 (2019).

Burrell, R. A. et al. Replication stress links structural and numerical cancer chromosomal instability. Nature 494, 492–496 (2013).

Wang, Z. C. et al. Profiles of genomic instability in high-grade serous ovarian cancer predict treatment outcome. Clin. Cancer Res. 18, 5806–5815 (2012).

Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, R41 (2011).

Shih, D. J. H. et al. Genomic characterization of human brain metastases identifies drivers of metastatic lung adenocarcinoma. Nat. Genet. 52, 371–377 (2020).

Tate, J. G. et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 47, D941–D947 (2019).

Garcia-Murillas, I. et al. Assessment of molecular relapse detection in early-stage breast cancer. JAMA Oncol. 5, 1473–1478 (2019).

Frankell, A. M. et al. The evolution of lung cancer and impact of subclonal selection in TRACERx. Nature https://doi.org/10.1038/s41586-023-05783-5 (2023).

Litchfield, K. et al. Meta-analysis of tumor- and T cell-intrinsic mechanisms of sensitization to checkpoint inhibition. Cell 184, 596–614 (2021).

McGranahan, N. et al. Clonal neoantigens elicit T cell immunoreactivity and sensitivity to immune checkpoint blockade. Science 351, 1463–1469 (2016).

Al Bakir, M. et al. The evolution of non-small cell lung cancer metastases in TRACERx. Nature https://doi.org/10.1038/s41586-023-05729-x (2023).

Martínez-Ruiz, C. et al. Genomic–transcriptomic evolution in lung cancer and metastasis. Nature https://doi.org/10.1038/s41586-023-05706-4 (2023).

Moding, E. J. et al. Circulating tumor DNA dynamics predict benefit from consolidation immunotherapy in locally advanced non-small cell lung cancer. Nat. Cancer 1, 176–183 (2020).

Chen, K. et al. Perioperative dynamic changes in circulating tumor DNA in patients with lung cancer (DYNAMIC). Clin. Cancer Res. 25, 7058–7067 (2019).

Li, N. et al. Perioperative circulating tumor DNA as a potential prognostic marker for operable stage I to IIIA non–small cell lung cancer. Cancer 128, 708–718 (2021).

Kurtz, D. M. et al. Enhanced detection of minimal residual disease by targeted sequencing of phased variants in circulating tumor DNA. Nat. Biotechnol. 39, 1537–1547 (2021).

Cohen, J. D. et al. Detection of low-frequency DNA variants by targeted sequencing of the Watson and Crick strands. Nat. Biotechnol. 39, 1220–1227 (2021).

Gydush, G. et al. Massively parallel enrichment of low-frequency alleles enables duplex sequencing at low depth. Nat. Biomed. Eng. 6, 257–266 (2022).

Van Loo, P. et al. Allele-specific copy number analysis of tumors. Proc. Natl Acad. Sci. USA 107, 16910–16915 (2010).

Nik-Zainal, S. et al. The life history of 21 breast cancers. Cell 149, 994–1007 (2012).

Roth, A. et al. PyClone: statistical inference of clonal population structure in cancer. Nat. Methods 11, 396–398 (2014).

Miller, C. A. et al. Visualizing tumor evolution with the fishplot package for R. BMC Genom. 17, 880 (2016).

Frankell, A. M., Colliver, E., McGranahan, N. & Swanton, C. cloneMap: a R package to visualise clonal heterogeneity. Preprint at bioRxiv https://doi.org/10.1101/2022.07.26.501523 (2022).

Birkbak, N. J. & McGranahan, N. Cancer genome evolutionary trajectories in metastasis. Cancer Cell 37, 8–19 (2020).

Rosenthal, R. et al. Neoantigen-directed immune escape in lung cancer evolution. Nature 567, 479–485 (2019).

Lai, Z. et al. VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research. Nucleic Acids Res. 44, e108 (2016).

Signorell, A., Aho, K., Alfons, A., Anderegg, N. & Aragon, T. DescTools: tools for descriptive statistics. R package version 0.99 (2023).

Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).

Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).

Yu, G. & He, Q.-Y. ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization. Mol. Biosyst. 12, 477–479 (2016).

Kassambara, A. rstatix: pipe-friendly framework for basic statistical tests. R package version 0.7.1 (2022).

Kuznetsova, A., Brockhoff, P. B. & Christensen, R. H. B. lmerTest package: tests in linear mixed effects models. J. Stat. Softw. 82, 1–26 (2017).

Sanchez-Vega, F. et al. Oncogenic signaling pathways in The Cancer Genome Atlas. Cell 173, 321–337 (2018).

Bielski, C. M. et al. Genome doubling shapes the evolution and prognosis of advanced cancers. Nat. Genet. 50, 1189–1195 (2018).

Chung, N. C., Miasojedow, B., Startek, M. & Gambin, A. Jaccard/Tanimoto similarity test and estimation methods for biological presence-absence data. BMC Bioinform. 20, 644 (2019).

Larsson, J. eulerr: area-proportional Euler and Venn diagrams with ellipses. R package version 7.0.0 (2022).

Yu, G. ggplotify: convert plot to ‘grob’ or ‘ggplot’ object. R package version 0.1.0 (2021).

Therneau, T. M. survival: a package for survival analysis in R. R package version v.3.2-13 https://CRAN.R-project.org/package=survival (2021).

Wiesweg, M. survivalAnalysis: high-level interface for survival analysis and associated plots. R package version 0.3.0 https://CRAN.R-project.org/package=survivalAnalysis (2022).

Kassambara, A., Kosinski, M. & Biecek, P. survminer: drawing survival curves using ‘ggplot2’. R package version 0.4.9 https://CRAN.R-project.org/package=survminer (2021).

R Core Team. R: A Language and Environment for Statistical Computing https://www.R-project.org/ (R Foundation for Statistical Computing, 2021).

Wickham, H. et al. Welcome to the tidyverse. J. Open Source Softw. 4, 1686 (2019).

Dowle, M. et al. data.table: extension of ‘data.frame’. R package version 1.14.6 https://CRAN.R-project.org/package=data.table (2022).

Wickham, H. et al. readxl: read excel files. R package version 1.4.1 https://CRAN.R-project.org/package=readxl (2022).

Klik, M. fst: lightning fast serialization of data frames. R package version 0.9.8 https://CRAN.R-project.org/package=fst (2022).

Yaari, G., Bolen, C. R., Thakar, J. & Kleinstein, S. H. Quantitative set analysis for gene expression: a method to quantify gene set differential expression including gene-gene correlations. Nucleic Acids Res. 41, e170 (2013).

Turner, J. A., Bolen, C. R. & Blankenship, D. M. Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates. BMC Bioinform. 16, 272 (2015).

Meng, H., Yaari, G., Bolen, C. R., Avey, S. & Kleinstein, S. H. Gene set meta-analysis with quantitative set analysis for gene expression (QuSAGE). PLoS Comput. Biol. 15, e1006899 (2019).

Gu, Z., Eils, R. & Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849 (2016).

Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2016).

Kassambara, A. ggpubr: ‘ggplot2’ based publication ready plots. R package version 3.3.5 https://CRAN.R-project.org/package=ggpubr (2020).

Slowikowski, K. ggrepel: automatically position non-overlapping text labels with ‘ggplot2’. R package version 0.9.2 https://CRAN.R-project.org/package=ggrepel (2022).

Clarke, E. ggbeeswarm: categorical scatter (violin point) plots. R package version 0.7.1 https://CRAN.R-project.org/package=ggbeeswarm (2022).

Wickham, H. et al. scales: scale functions for visualization. R package version 1.2.1 https://CRAN.R-project.org/package=scales (2022).

Pedersen, T. L. ggforce: accelerating ‘ggplot2’. R package version 0.4.1 https://CRAN.R-project.org/package=ggforce (2022).

Wilke, C. O. cowplot: streamlined plot theme and plot annotations for ‘ggplot2’. R package version 1.1.1 https://CRAN.R-project.org/package=cowplot (2020).

Lakatos, E. et al. LiquidCNA: tracking subclonal evolution from longitudinal liquid biopsies using somatic copy number alterations. iScience 24, 102889 (2021).

The TRACERx study (ClinicalTrials.gov: NCT01888601) is sponsored by University College London (UCL/12/0279) and has been approved by an independent Research Ethics Committee (13/LO/1546). TRACERx is funded by Cancer Research UK (C11496/A17786) and is coordinated through the Cancer Research UK and UCL Cancer Trials Centre, which has a core grant from CRUK (C444/A15953). We thank the patients and relatives who participated in the TRACERx study, and all site personnel, investigators, funders and industry partners who supported the generation of the data within this study. In particular, we acknowledge the support of staff at the Scientific Computing, the Advanced Sequencing Facility and Experimental Histopathology departments at the Francis Crick Institute. We also thank J. Brock for help. This work was supported by the Cancer Research UK Lung Cancer Centre of Excellence and the CRUK City of London Centre Award (C7893/A26233). M.A.B. is supported by Cancer Research UK, the Rosetrees Trust and the Francis Crick Institute. N.J.B. is a fellow of the Lundbeck Foundation (R272-2017-4040) and acknowledges funding from Aarhus University Research Foundation (AUFF-E-2018-7-14) and the Novo Nordisk Foundation (NNF21OC0071483). A. Huebner is supported by Cancer Research UK. D.A.M. is supported by the Cancer Research UK Lung Cancer Centre of Excellence (C11496/A30025). T.B.K.W. is supported by the Francis Crick Institute, as well as the Marie Curie ITN Project PLOIDYNET (FP7-PEOPLE-2013, 607722), the Breast Cancer Research Foundation (BCRF), Royal Society Research Professorships Enhancement Award (RP/EA/180007) and the Foulkes Foundation. T.K. is supported by the JSPS Overseas Research Fellowships Program (202060447). C.M.-R. is supported by the Rosetrees (M630) and Wellcome trusts. E.L.L. receives funding from NovoNordisk Foundation (16584). C.T.H. has received funding from NIHR University College London Hospitals Biomedical Research Centre. M.J.-H. is a CRUK Career Establishment Awardee and has received funding from CRUK, the NIH National Cancer Institute, the IASLC International Lung Cancer Foundation, the Lung Cancer Research Foundation, the Rosetrees Trust, UKI NETs, NIHR and the NIHR UCLH Biomedical Research Centre. N.M. is a Sir Henry Dale Fellow, jointly funded by the Wellcome Trust and the Royal Society (grant no. 211179/Z/18/Z) and also receives funding from Cancer Research UK, Rosetrees and the NIHR BRC at University College London Hospitals and the CRUK University College London Experimental Cancer Medicine Centre. T.L.C. acknowledges funding support from the Howard Hughes Medical Institute, and the Radiation Oncology Institute; G.I.E. from Cancer Research UK (A29210) and the European Research Council Advanced Investigator Award (294851); H.J.W.L.A. from the NIH (NIH-USA U24CA194354, NIH-USA U01CA190234, NIH-USA U01CA209414 and NIH-USA R35CA22052) and the European Union–European Research Council (grant agreement no. 866504); and K.L. from the UK Medical Research Council (MR/V033077/1), the Rosetrees Trust and Cotswold Trust (A2437), the Royal Marsden Cancer Charity and Melanoma Research Alliance. BioRender was used in the generation of Fig. 4 and Extended Data Fig. 7. C.S. is a Royal Society Napier Research Professor (RSRP\R\210001); and is supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (CC2041), the UK Medical Research Council (CC2041) and the Wellcome Trust (CC2041). For the purpose of Open Access, the author has applied a CC BY public copyright licence to any author accepted manuscript version arising from this submission. C.S. is funded by Cancer Research UK (TRACERx (C11496/A17786), PEACE (C416/A21999) and CRUK Cancer Immunotherapy Catalyst Network); Cancer Research UK Lung Cancer Centre of Excellence (C11496/A30025); the Rosetrees Trust, Butterfield and Stoneygate Trusts; the NovoNordisk Foundation (ID16584); the Royal Society Professorship Enhancement Award (RP/EA/180007); the National Institute for Health Research (NIHR) University College London Hospitals Biomedical Research Centre; the Cancer Research UK–University College London Centre; the Experimental Cancer Medicine Centre; the Breast Cancer Research Foundation (US) BCRF-22-157; Cancer Research UK Early Detection and Diagnosis Primer Award (Grant EDDPMA-Nov21/100034); and The Mark Foundation for Cancer Research Aspire Award (Grant 21-029-ASP). This work was supported by a Stand Up to Cancer-LUNGevity-American Lung Association Lung Cancer Interception Dream Team Translational Research grant (grant no. SU2C-AACR-DT23-17 to S. M. Dubinett and A. E. Spira). Stand Up To Cancer is a division of the Entertainment Industry Foundation. Research grants are administered by the American Association for Cancer Research, the Scientific Partner of SU2C. C.S. is in receipt of an ERC Advanced Grant (PROTEUS) from the European Research Council under the European Union's Horizon 2020 research and innovation programme (grant agreement no. 835297).

These authors contributed equally: Christopher Abbosh, Alexander M. Frankell, Thomas Harrison, Judit Kisistok, Aaron Garnett

These authors jointly supervised this work: Nicolai J. Birkbak, Nicholas McGranahan, Charles Swanton

Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK

Christopher Abbosh, Alexander M. Frankell, Selvaraju Veeriah, Sophia Ward, Kristiana Grigoriadis, Kevin Litchfield, Clare Puttick, Dhruva Biswas, Takahiro Karasaki, James R. M. Black, Carlos Martínez-Ruiz, Maise Al Bakir, Emilia L. Lim, Ariana Huebner, David A. Moore, Elizabeth Manzano, Crispin T. Hiley, Mariam Jamal-Hanjani, Abigail Bunkum, Antonia Toncheva, Corentin Richard, Cristina Naceur-Lombardelli, Foteini Athanasopoulou, Francisco Gimeno-Valiente, Haoran Zhai, Jie Min Lam, Kerstin Thol, Krupa Thakkar, Mariana Werner Sunderland, Michelle Dietzen, Michelle Leung, Monica Sivakumar, Nnennaya Kanu, Olivia Lucas, Othman Al-Sawaf, Paulina Prymas, Robert Bentham, Sadegh Saghafinia, Sergio A. Quezada, Sharon Vanloo, Simone Zaccaria, Sonya Hessey, Wing Kin Liu, Martin D. Forster, Siow Ming Lee, Nicolai J. Birkbak, Nicholas McGranahan & Charles Swanton

Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK

Alexander M. Frankell, Sophia Ward, Kristiana Grigoriadis, Clare Puttick, Dhruva Biswas, Takahiro Karasaki, Maise Al Bakir, Oriol Pich, Thomas B. K. Watkins, Emilia L. Lim, Ariana Huebner, David A. Moore, Crispin T. Hiley, Daniel E. Cook, Gareth A. Wilson, Rachel Rosenthal, Andrew Rowan, Brittany B. Campbell, Chris Bailey, Claudia Lee, Emma Colliver, Foteini Athanasopoulou, Haoran Zhai, Jayant K. Rane, Katey S. S. Enfield, Mark S. Hill, Michelle Dietzen, Michelle Leung, Mihaela Angelova, Olivia Lucas, Othman Al-Sawaf, Nicolai J. Birkbak & Charles Swanton

Invitae, San Francisco, CA, USA

Thomas Harrison, Aaron Garnett, Laura Johnson, Mike Moreau, Adrian Chesh, Morgan R. Schroeder, Aamir Shahpurwalla, Aaron Odell, Paula Roberts, Robert D. Daber, Abel Licon & Josh Stahl

Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark

Judit Kisistok, Mateo Sokac & Nicolai J. Birkbak

Department of Clinical Medicine, Aarhus University, Aarhus, Denmark

Judit Kisistok, Mateo Sokac & Nicolai J. Birkbak

Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark

Judit Kisistok, Mateo Sokac & Nicolai J. Birkbak

Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA

Tafadzwa L. Chaunzwa, Jakob Weiss & Hugo J. W. L. Aerts

Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA

Tafadzwa L. Chaunzwa, Jakob Weiss & Hugo J. W. L. Aerts

Department of Radiology, Freiburg University Hospital, Freiburg, Germany

Jakob Weiss

Advanced Sequencing Facility, The Francis Crick Institute, London, UK

Sophia Ward, Foteini Athanasopoulou & Jerome Nicod

Cancer Genome Evolution Research Group, Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK

Kristiana Grigoriadis, Clare Puttick, James R. M. Black, Carlos Martínez-Ruiz, Ariana Huebner, Kerstin Thol, Michelle Dietzen, Michelle Leung, Robert Bentham & Nicholas McGranahan

Tumour Immunogenomics and Immunosurveillance Laboratory, University College London Cancer Institute, London, UK

Kevin Litchfield

Bill Lyons Informatics Centre, University College London Cancer Institute, London, UK

Dhruva Biswas & Javier Herrero

Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK

Takahiro Karasaki, Mariam Jamal-Hanjani, Abigail Bunkum, Jie Min Lam, Othman Al-Sawaf, Sonya Hessey & Wing Kin Liu

Department of Cellular Pathology, University College London Hospitals, London, UK

David A. Moore, Teresa Marafioti, Elaine Borg, Mary Falzon & Reena Khiroya

AstraZeneca, Cambridge, UK

Nadia Godin-Heymann, Anne L’Hernault, Hannah Bye & Darren Hodgson

The Christie NHS Foundation Trust, Manchester, UK

Fabio Gomes, Kate Brown, Mathew Carter, Anshuman Chaturvedi, Lynsey Priest & Pedro Oliveira

University Hospital Birmingham NHS Foundation Trust, Birmingham, UK

Akshay J. Patel, Aya Osman, Christer Lacson, Gerald Langman, Helen Shackleford, Madava Djearaman, Salma Kadiri & Gary Middleton

Cancer Research Centre, University of Leicester, Leicester, UK

Nicolas Carey, Joan Riley, Jacqui A. Shaw, Gurdeep Matharu & Lindsay Primrose

AstraZeneca, Waltham, MA, USA

Daniel Stetson & J. Carl Barrett

Department of Biochemistry, University of Cambridge, Cambridge, UK

Roderik M. Kortlever & Gerard I. Evan

Cancer Research UK & UCL Cancer Trials Centre, London, UK

Allan Hackshaw, Abigail Sharp, Sean Smith, Nicole Gower, Harjot Kaur Dhanda, Kitty Chan, Camilla Pilotti & Rachel Leslie

Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Maastricht, The Netherlands

Hugo J. W. L. Aerts

Department of Oncology, University College London Hospitals, London, UK

Mariam Jamal-Hanjani, Sarah Benafif, Jie Min Lam, Olivia Lucas, Martin D. Forster, Siow Ming Lee, Dionysis Papadatos-Pastos, James Wilson, Tanya Ahmad & Charles Swanton

Singleton Hospital, Swansea Bay University Health Board, Swansea, UK

Jason F. Lester

University Hospitals of Leicester NHS Trust, Leicester, UK

Amrita Bajaj, Apostolos Nakas, Azmina Sodha-Ramdeen, Keng Ang, Mohamad Tufail, Mohammed Fiyaz Chowdhry, Molly Scotland, Rebecca Boyles, Sridhar Rathinam & Dean A. Fennell

University of Leicester, Leicester, UK

Claire Wilson, Domenic Marrone, Sean Dulloo & Dean A. Fennell

Royal Free Hospital, Royal Free London NHS Foundation Trust, London, UK

Ekaterini Boleti

Aberdeen Royal Infirmary NHS Grampian, Aberdeen, UK

Heather Cheyne, Mohammed Khalil, Shirley Richardson & Tracey Cruickshank

Department of Medical Oncology, Aberdeen Royal Infirmary NHS Grampian, Aberdeen, UK

Gillian Price

University of Aberdeen, Aberdeen, UK

Gillian Price & Keith M. Kerr

Department of Pathology, Aberdeen Royal Infirmary NHS Grampian, Aberdeen, UK

Keith M. Kerr

The Whittington Hospital NHS Trust, London, UK

Kayleigh Gilbert

Birmingham Acute Care Research Group, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK

Babu Naidu

Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK

Gary Middleton

Manchester Cancer Research Centre Biobank, Manchester, UK

Angela Leek, Jack Davies Hodgkinson & Nicola Totten

Wythenshawe Hospital, Manchester University NHS Foundation Trust, Wythenshawe, UK

Angeles Montero, Elaine Smith, Eustace Fontaine, Felice Granato, Helen Doran, Juliette Novasio, Kendadai Rammohan, Leena Joseph, Paul Bishop, Rajesh Shah, Stuart Moss, Vijay Joshi & Philip Crosbie

Division of Infection, Immunity and Respiratory Medicine, University of Manchester, Manchester, UK

Philip Crosbie

Cancer Research UK Lung Cancer Centre of Excellence, University of Manchester, Manchester, UK

Philip Crosbie, Anshuman Chaturvedi, Lynsey Priest, Pedro Oliveira, Alexandra Clipson, Jonathan Tugwood, Alastair Kerr, Dominic G. Rothwell, Elaine Kilgour & Caroline Dive

Division of Cancer Sciences, The University of Manchester and The Christie NHS Foundation Trust, Manchester, UK

Colin R. Lindsay, Fiona H. Blackhall, Matthew G. Krebs & Yvonne Summers

Cancer Research UK Manchester Institute Cancer Biomarker Centre, University of Manchester, Manchester, UK

Alexandra Clipson, Jonathan Tugwood, Alastair Kerr, Dominic G. Rothwell, Elaine Kilgour & Caroline Dive

Institute for Computational Cancer Biology, Center for Integrated Oncology (CIO), Cancer Research Center Cologne Essen (CCCE), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany

Roland F. Schwarz

Berlin Institute for the Foundations of Learning and Data (BIFOLD), Berlin, Germany

Roland F. Schwarz & Tom L. Kaufmann

Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany

Tom L. Kaufmann

Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA

Peter Van Loo

Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA

Peter Van Loo

Cancer Genomics Laboratory, The Francis Crick Institute, London, UK

Peter Van Loo, Jonas Demeulemeester, Carla Castignani & Elizabeth Larose Cadieux

Danish Cancer Society Research Center, Copenhagen, Denmark

Zoltan Szallasi & Miklos Diossy

Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA

Zoltan Szallasi & Miklos Diossy

Department of Bioinformatics, Semmelweis University, Budapest, Hungary

Zoltan Szallasi

Department of Pathology, ZAS Hospitals, Antwerp, Belgium

Roberto Salgado

Division of Research, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia

Roberto Salgado

Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary

Miklos Diossy

Integrative Cancer Genomics Laboratory, Department of Oncology, KU Leuven, Leuven, Belgium

Jonas Demeulemeester

VIB–KU Leuven Center for Cancer Biology, Leuven, Belgium

Jonas Demeulemeester

Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK

Abigail Bunkum, Olivia Lucas, Simone Zaccaria & Sonya Hessey

The Francis Crick Institute, London, UK

Aengus Stewart, Alastair Magness, Clare E. Weeden, Dina Levi, Eva Grönroos, Jacki Goldman, Mickael Escudero, Philip Hobson, Roberto Vendramin, Stefan Boeing, Tamara Denner, Vittorio Barbè, Wei-Ting Lu, William Hill, Yutaka Naito & Zoe Ramsden

University College London Cancer Institute, London, UK

Angeliki Karamani, Benny Chain, David R. Pearce, Despoina Karagianni, Elena Hoxha, Felip Gálvez-Cancino, Georgia Stavrou, Gerasimos Mastrokalos, Helen L. Lowe, Ignacio Matos, James L. Reading, Jayant K. Rane, John A. Hartley, Kayalvizhi Selvaraju, Kezhong Chen, Leah Ensell, Mansi Shah, Marcos Vasquez, Maria Litovchenko, Olga Chervova, Piotr Pawlik, Robert E. Hynds, Saioa López, Samuel Gamble, Seng Kuong Anakin Ung, Supreet Kaur Bola, Thanos P. Mourikis, Victoria Spanswick & Yin Wu

Medical Genomics, University College London Cancer Institute, London, UK

Carla Castignani, Elizabeth Larose Cadieux, Miljana Tanić & Stephan Beck

Experimental Histopathology, The Francis Crick Institute, London, UK

Emma Nye & Richard Kevin Stone

Retroviral Immunology Group, The Francis Crick Institute, London, UK

George Kassiotis & Kevin W. Ng

Department of Infectious Disease, Faculty of Medicine, Imperial College London, London, UK

George Kassiotis

Department of Haematology, University College London Hospitals, London, UK

Karl S. Peggs

Cancer Immunology Unit, Research Department of Haematology, University College London Cancer Institute, London, UK

Karl S. Peggs

Department of Molecular Oncology and Immunology, The Netherlands Cancer Institute, Amsterdam, The Netherlands

Krijn Dijkstra

Oncode Institute, Utrecht, The Netherlands

Krijn Dijkstra

Experimental Oncology, Institute for Oncology and Radiology of Serbia, Belgrade, Serbia

Miljana Tanić

Immune Regulation and Tumour Immunotherapy Group, Cancer Immunology Unit, Research Department of Haematology, University College London Cancer Institute, London, UK

Sergio A. Quezada

Centre for Medical Image Computing, Department of Medical Physics and Biomedical Engineering, University College London, London, UK

Catarina Veiga

Department of Medical Physics and Bioengineering, University College London Cancer Institute, London, UK

Gary Royle

Department of Medical Physics and Biomedical Engineering, University College London, London, UK

Charles-Antoine Collins-Fekete

Institute of Nuclear Medicine, Division of Medicine, University College London, London, UK

Francesco Fraioli

Institute of Structural and Molecular Biology, University College London, London, UK

Paul Ashford

University College London, London, UK

Tristan Clark

Department of Radiology, University College London Hospitals, London, UK

Alexander James Procter, Asia Ahmed, Magali N. Taylor & Arjun Nair

UCL Respiratory, Department of Medicine, University College London, London, UK

Arjun Nair

Department of Thoracic Surgery, University College London Hospital NHS Trust, London, UK

David Lawrence & Davide Patrini

Lungs for Living Research Centre, UCL Respiratory, University College London, London, UK

Neal Navani, Ricky M. Thakrar & Sam M. Janes

Department of Thoracic Medicine, University College London Hospitals, London, UK

Neal Navani & Ricky M. Thakrar

University College London Hospitals, London, UK

Emilie Martinoni Hoogenboom, Fleur Monk, James W. Holding, Junaid Choudhary, Kunal Bhakhri, Marco Scarci, Martin Hayward, Nikolaos Panagiotopoulos, Pat Gorman, Robert CM. Stephens, Yien Ning Sophia Wong & Steve Bandula

The Institute of Cancer Research, London, UK

Anca Grapa, Hanyun Zhang, Khalid AbdulJabbar & Xiaoxi Pan

The University of Texas MD Anderson Cancer Center, Houston, TX, USA

Yinyin Yuan

Independent Cancer Patients’ Voice, London, UK

David Chuter & Mairead MacKenzie

University Hospital Southampton NHS Foundation Trust, Southampton, UK

Serena Chee, Aiman Alzetani, Lydia Scarlett, Jennifer Richards, Papawadee Ingram & Silvia Austin

Department of Oncology, University Hospital Southampton NHS Foundation Trust, Southampton, UK

Judith Cave

Academic Division of Thoracic Surgery, Imperial College London, London, UK

Eric Lim

Royal Brompton and Harefield Hospitals, Guy's and St Thomas’ NHS Foundation Trust, London, UK

Eric Lim, Paulo De Sousa, Simon Jordan, Alexandra Rice, Hilgardt Raubenheimer, Harshil Bhayani, Lyn Ambrose, Anand Devaraj, Hema Chavan, Sofina Begum, Silviu I. Buderi, Daniel Kaniu, Mpho Malima, Sarah Booth, Nadia Fernandes, Pratibha Shah & Chiara Proli

Department of Histopathology, Royal Brompton and Harefield Hospitals, Guy's and St Thomas’ NHS Foundation Trust, London, UK

Andrew G. Nicholson

National Heart and Lung Institute, Imperial College London, London, UK

Andrew G. Nicholson

Royal Surrey Hospital, Royal Surrey Hospitals NHS Foundation Trust, Guildford, UK

Madeleine Hewish

University of Surrey, Guildford, UK

Madeleine Hewish

Sheffield Teaching Hospitals NHS Foundation Trust, Sheffield, UK

Sarah Danson

Liverpool Heart and Chest Hospital, Liverpool, UK

Michael J. Shackcloth

Princess Alexandra Hospital, The Princess Alexandra Hospital NHS Trust, Harlow, UK

Lily Robinson & Peter Russell

School of Cancer Sciences, University of Glasgow, Glasgow, UK

Kevin G. Blyth & John Le Quesne

Cancer Research UK Beatson Institute, Glasgow, UK

Kevin G. Blyth & John Le Quesne

Queen Elizabeth University Hospital, Glasgow, UK

Kevin G. Blyth

NHS Greater Glasgow and Clyde, Glasgow, UK

Craig Dick

Pathology Department, Queen Elizabeth University Hospital, NHS Greater Glasgow and Clyde, Glasgow, UK

John Le Quesne

Golden Jubilee National Hospital, Clydebank, UK

Alan Kirk, Mo Asif, Rocco Bilancia, Nikos Kostoulas & Mathew Thomas

C.A., A.M.F., N.J.B., N.M. and C.S. co-wrote the manuscript. C.A., A.M.F., N.J.B., J.S. and C.S. conceived the study design. C.A., A.M.F., J.K., K.G., C.P., D.B., T.L.C., J.W., C.M.-R., M.A.B., O.P., T.B.K.W., E.L.L., A. Huebner, D.A.M., R. Salgado, F.G., A.J.P., E.M., D.E.C., C.T.H., M.J.-H. and N.J.B. integrated clinicopathological data, transcriptomic data, exome data and ctDNA data. C.A., T.H., A.G., A.L., J.S., M.R.S., K.L., L.J., C.P. and C.S. worked to develop and validate the MRD calling algorithm used in this manuscript. A.M.F. developed ECLIPSE and performed analyses of clonal composition used in this manuscript. A.G., M.M., A.C., L.J., P.R. and R.D.D. conducted AMP NGS experimental work for ctDNA data. K.G. performed GISTIC copy-number analysis. S.V., S.W., N.C., J.R., R.D.D., M.M., A.C. and J.A.S. provided oversight of TRACERx patient sample storage and/or DNA extraction and/or sequencing of patient samples. T.L.C., J.W. and H.J.W.L.A. performed radiomic analyses of baseline CT scans. T.H., M.R.S., A.G., A.S., A.O. and A.L. conducted ArcherDx variant selection, PSP design and informatic processing of AMP data. A.M.F., K.G., M.A.B., O.P., T.B.K.W., E.L.L., A. Huebner, D.E.C. and N.M. conducted multiregion sequencing and phylogenetic tree analyses and identified TRACERx variants for PSP design. D.A.M. conducted the pathological review. A. L’Hernault, A.G., L.H., P.R., H.B. and N.G.-H. designed and conducted analytical validation experiments of the AMP MRD assay. C.A. and T.H. designed and conducted in silico specificity experiments for the AMP assay. D.B. and N.J.B. conducted ORACLE analyses. C.A. and T.K. conducted reviews of radiological imaging reports. R.M.K., D.H., D.S., G.I.E. and J.C.B. gave advice on analyses performed in this paper. M.J.-H., J.A.S. and C.S. designed the study protocols. A. Hackshaw gave statistical advice. C.A., N.M., M.J.-H., N.J.B. and C.S. provided overall study oversight.
All of the authors approved the final version of the manuscript.

Correspondence to Christopher Abbosh, Nicholas McGranahan or Charles Swanton.

C.A. has received speaking honoraria or expenses from AstraZeneca and Bristol-Myers Squibb and reports employment at AstraZeneca. C.A. and C.S. are listed as inventors on a European patent application relating to assay technology to detect tumour recurrence (PCT/GB2017/053289). This patent has been licensed to commercial entities and, under their terms of employment, C.A. and C.S. are due a revenue share of any revenue generated from such license(s). C.A. and C.S. declare a patent application (PCT/US2017/028013) for methods to detect lung cancer. A.M.F., C.A. and C.S. are named inventors on a patent application to determine methods and systems for tumour monitoring (PCT/EP2022/077987). C.A., C.S., K.L., C.P., T.H., L.J., M.R.S., A.G. and A. Licon are named inventors on a provisional patent protection related to a ctDNA detection algorithm. S.V. is listed as a co-inventor on a patent of methods for detecting molecules in a sample (US patent, 10,578,620). T.H., A.G., M.M., A.C., A.S., A.O., L.J., P.R., M.R.S., R.D.D., A.L. and J.S. are former or current employees of Invitae or ArcherDx and report stock ownership. D.B. reports personal fees from NanoString and AstraZeneca and has a patent application (PCT/GB2020/050221) on methods for cancer prognostication. M.A.B. has consulted for Achilles Therapeutics. D.A.M. reports speaker fees from AstraZeneca, Eli Lilly and Takeda; consultancy fees from AstraZeneca, Thermo Fisher Scientific, Takeda, Amgen, Janssen, MIM Software, Bristol-Myers Squibb and Eli Lilly; and has received educational support from Takeda and Amgen. N.G.-H., A. L’Hernault, H.B., D.H., D.S. and J.C.B. report stock ownership and employment at AstraZeneca. A. Hackshaw has received fees for being a member of independent data monitoring committees for Roche-sponsored clinical trials, and academic projects co-ordinated by Roche. C.T.H. has received speaker fees from AstraZeneca. M.J.-H. has consulted for, and is a member of, the Achilles Therapeutics scientific advisory board and steering committee; has received speaker honoraria from Pfizer, Astex Pharmaceuticals, Oslo Cancer Cluster; and is listed as a co-inventor on a European patent application relating to methods to detect lung cancer (PCT/US2017/028013). This patent has been licensed to commercial entities and, under terms of employment, M.J.-H. is due a share of any revenue generated from such license(s). N.J.B. is listed as a co-inventor on a patent to identify responders to cancer treatment (PCT/GB2018/051912), has a patent application (PCT/GB2020/050221) on methods for cancer prognostication and a patent on methods for predicting anti-cancer response (US14/466,208). H.J.W.L.A. has received personal fees and stock from Onc.AI, Sphera and Love Health, and speaking honoraria from Bristol-Myers Squibb. K.L. has a patent (CA3068366A) on indel burden and CPI response pending and speaker fees from Roche tissue diagnostics and Ellipses Pharmaceuticals, research funding from CRUK TDL/Ono/LifeArc alliance, Genesis Therapeutics and consulting roles with Monopteros Therapeutics and Kynos Therapeutics (all outside of this work). N.M. has received consultancy fees and has stock options in Achilles Therapeutics; and holds European patents relating to targeting neoantigens (PCT/EP2016/059401), identifying patient response to immune checkpoint blockade (PCT/EP2016/071471), determining HLA LOH (PCT/GB2018/052004) and predicting survival rates of patients with cancer (PCT/GB2020/050221). C.S. acknowledges grant support from AstraZeneca, Boehringer-Ingelheim, Bristol-Myers Squibb, Pfizer, Roche-Ventana, Invitae (previously Archer Dx, collaboration in minimal residual disease sequencing technologies), Ono Pharmaceutical and Personalis; he is an AstraZeneca advisory board member and chief investigator for the AZ MeRmaiD 1 and 2 clinical trials and is also co-chief investigator of the NHS Galleri trial funded by GRAIL and a paid member of GRAIL's scientific advisory board. He receives consultant fees from Achilles Therapeutics (also a member of the scientific advisory board), Bicycle Therapeutics (also a member of the scientific advisory board), Genentech, Medicxi, Roche Innovation Centre–Shanghai, Metabomed (until July 2022) and the Sarah Cannon Research Institute; has received honoraria from Amgen, AstraZeneca, Pfizer, Novartis, GlaxoSmithKline, MSD, Bristol-Myers Squibb, Illumina and Roche-Ventana; had stock options in Apogen Biotechnologies and GRAIL until June 2021, and currently has stock options in Epic Bioscience, Bicycle Therapeutics, and has stock options and is co-founder of Achilles Therapeutics; and holds additional patent applications related to targeting neoantigens (PCT/EP2016/059401), identifying patient response to immune checkpoint blockade (PCT/EP2016/071471), determining HLA LOH (PCT/GB2018/052004), predicting survival rates of patients with cancer (PCT/GB2020/050221), identifying patients who respond to cancer treatment (PCT/GB2018/051912) and both a European and US patent application related to identifying insertion/deletion mutation targets (PCT/GB2018/051892).

Nature thanks Aadel Chaudhuri and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A. Stacked bar plot of patient-specific panels (PSPs) designed from primary tumour sequencing data showing the number of clonal (dark red) and subclonal (light red) variants per panel. Variants lacking clonality information are displayed in grey (median of 3 variants per patient [1-20]; these mutations are either no longer called by TRACERx or called by ArcherDx but not TRACERx, see methods). A median of 126 clonal variants (range 21 to 195) and 64 subclonal variants (range 0 to 174) were tracked by the PSPs. Clonality was determined by PyClone analyses of multi-region exome data derived from primary resections of NSCLC (methods); in the absence of PyClone data, variants present in all multi-region sequenced tumour samples were called clonal. B. Violin plot demonstrating the % of subclonal clusters derived from multi-region tumour exome data tracked by PSPs on a per-patient basis. A median of 88% of the subclonal mutation clusters present in each patient's multi-region exome derived phylogenetic tree were tracked [range 0-100]. 184 tumours with phylogenetic trees were included. C. Distribution of cfDNA input values for the cohort, median input of 23 ng, n = 1069 samples. Capping at 60 ng input was performed for some of the cohort, explaining the peak at this value; for the remainder of the cohort, all cfDNA extracted was input into the assay (colours represent different cfDNA input categories as indicated). D. Histogram demonstrating the distribution of per-variant unique sequencing depth values across the cohort; unique depth refers to error-controlled depth achieved across a position targeted by a PSP (at least 5 unique molecular identifier (UMI) matched reads required to create a consensus error-controlled read, see methods). The median unique depth per variant tracked by a PSP was 2226x (range 0 to 53789x, n = 201910). E. Correlation between cfDNA input (ng, Y axis) into the assay and the median UMI-corrected depth achieved across a PSP across 1069 plasma timepoints (X axis). Spearman's R value = 0.63 and two-sided P value < 2.2e-16. F. Association between median deduplication ratio achieved in a sample (Y axis) and cfDNA input into the assay (ng, X axis); duplication ratio refers to the median number of duplicate UMI-supported reads within a read family. Resequencing of samples where the median duplication ratio was less than 10 was performed where possible to maximize recoverable information from cfDNA samples, given that 5 UMI-supported reads are required to make a UMI family. 17 of 1069 evaluated cfDNA samples exhibited a final median deduplication ratio less than 5 (corresponds to the horizontal line on the plot). Colours correspond to different cfDNA input categories and match panel C. G-H. Boxplots demonstrating the error rates (%, Y axis) per each of 96 mutation trinucleotide contexts (X axis, 192 mutation trinucleotide contexts [TNCs] simplified to 96 reverse-complement identical mutation types), plots divided by transition event (G) and transversion event (H). Background position data from n = 1069 cell-free DNA libraries were used to generate the plots; variants predicted to exhibit low background error rates from pilot data analyses were prioritized for PSP design. Hinges correspond to first and third quartiles, whiskers extend to the largest/smallest value no further than 1.5x the interquartile range. Centre lines represent medians.
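
The simplification of 192 strand-specific trinucleotide contexts to 96 reverse-complement identical mutation types mentioned for panels G-H can be sketched as follows. This is an illustrative sketch rather than the pipeline used in the study; the function name `canonical_context` and the convention of reporting every substitution with a pyrimidine (C or T) reference base are assumptions.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def canonical_context(trinucleotide, alt):
    """Map a substitution at the middle base of `trinucleotide` to a
    strand-independent form (pyrimidine reference base; assumed convention)."""
    ref = trinucleotide[1]
    if ref in "CT":
        return trinucleotide, alt
    # Purine reference: describe the same event on the opposite strand.
    rev_comp = "".join(COMPLEMENT[b] for b in reversed(trinucleotide))
    return rev_comp, COMPLEMENT[alt]

# Every strand-specific context: 4 x 4 x 4 trinucleotides x 3 alternate bases.
all_contexts = [
    (five + ref + three, alt)
    for five, ref, three, alt in product("ACGT", repeat=4)
    if alt != ref
]
collapsed = {canonical_context(tri, alt) for tri, alt in all_contexts}

print(len(all_contexts), len(collapsed))
```

For example, a G>A change in the context AGT is counted together with a C>T change in the context ACT, since both describe the same event read from opposite strands.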

A-D. Pre- and postoperative MRD caller P values (Y axis: MRD caller P value; one-sided Poisson test, see Methods) observed in the pilot phase of the project. X axis displays clonal ctDNA levels. A. Postoperative samples from n = 5 patients who did not have recurrence of their NSCLC; all n = 55 patient samples had caller P values above the 0.1 threshold, meaning that they were deemed negative for ctDNA. B. Postoperative caller P values observed in n = 5 patients who had relapse of their NSCLC. 1 of 13 calls was made between caller P values of 0.1 and 0.01; the remaining 12 calls were made at a caller P value less than 0.01. C. Preoperative ctDNA calls from pilot cohort; 7 patients had positive ctDNA in plasma prior to surgery, all calls were made at caller P values < 0.01. D. In-silico simulation analysis to assess MRD caller specificity. 3157 mock MRD panels were generated within the evaluable pilot patient libraries and MRD caller P values were assessed. At a caller P value < 0.1 threshold, 121/3157 simulated mock panels were ctDNA positive (in-silico specificity of 96.2%); at a caller P value threshold < 0.01, 22/3157 simulated mock panels were ctDNA positive (in-silico specificity of 99.3%). E-F. Analytical validation of 50-variant MRD detection panels. E. Fragmented DNA with a known single nucleotide polymorphism (SNP) profile was spiked into a second background of fragmented DNA with a different SNP profile and a patient-specific panel targeted 50 alternate positions present in spiked-in DNA. 559 data points were generated across the different DNA input quantities indicated, to establish the limit of detection plots. The Y axis and centre of the error bars demonstrate sensitivity (defined as the proportion of all repeats that resulted in MRD detection using a caller P value of 0.01). The confidence intervals on the plot are Clopper-Pearson confidence intervals (95% CIs). The X axis shows the quantity of variant germline DNA that was spiked into each repeat expressed as a percentage of total DNA in that sample. F. Circulating tumour DNA samples with high variant allele fractions were spiked into a different cell-free DNA background. Variant positions in ctDNA were targeted with a 50-variant panel; 100 data points were generated across the DNA input quantities indicated. Axes and error bars are the same as (E). G. Data from analyses of 48 blank samples donated by 24 healthy participants; caller P values are displayed. H. Barplots demonstrating the intended allele frequencies and the measured allele frequencies in the different spike-ins presented in parts (E) and (F); only data from variant DNA positive samples are presented. The colours of the barplot represent different DNA input masses as shown by the legend. The error bars on the plot represent the mean value of all positive spike-in samples +/− standard deviation of the values. Where the error bar is absent, this is because at this spike-in level and DNA input mass, only one positive sample was observed. Where the error bar led to an observed mean AF less than 0, the error bar was stopped at 0 for visualization purposes (the 0.05% spike-in, 2 ng input mass case). The horizontal dashed lines correspond to 0.1%, 0.05%, and 0.01% spike-in categories. Each data point is represented on the plots by a circle. n = 369 variant DNA positive samples displayed in LOD1 barchart, n = 93 variant DNA positive samples displayed in LOD2 barchart. I. Comparison between the content of cell-free DNA input into ddPCR reactions (yellow) and AMP PCR reactions (blue). Hinges correspond to first and third quartiles, whiskers extend to the largest/smallest value no further than 1.5x the interquartile range. Centre lines represent medians. Each dot on the plot represents a data point, lines connect paired samples from the same patient. Significantly more cell-free DNA was input into ddPCR reactions (paired two-sided Wilcoxon-test P = 0.01366). J. Orthogonal comparison between ctDNA detection based on AMP panels used in TRACERx and ddPCR against a single clonal variant. ddPCR ctDNA positive call threshold was two mutant droplets (bottom table) and one mutant droplet (top table). Percentage positive agreement (PPA) and percentage negative agreement (NPA) using ddPCR as the comparator are displayed in the table. Two-sided Fisher's test P values are shown under the cross tables. K. A 300-mutation patient-specific panel was designed and applied to 10 ng DNA samples containing spike-in variant levels from 0% to 0.1%. In silico sub-sampling of the 300 mutations was performed (3 x 200 mutation in silico panels, 3 x 100 mutation in silico panels and 3 x 50 mutation in silico panels, see methods) and sensitivities are categorized by the number of mutations targeted by the panel.
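
The in-silico specificity figures in panel D and the Clopper-Pearson intervals in panels E-F can be reproduced in outline with the sketch below. It is a minimal standard-library illustration, not the code used in the study; the function names are hypothetical, and the exact interval is found here by bisection on the binomial distribution rather than by any particular statistics package.

```python
import math

def specificity(false_positives, total_negatives):
    """Fraction of known-negative panels that were correctly called negative."""
    return 1 - false_positives / total_negatives

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)

def solve_p(k, n, target):
    """Bisection for the p at which binom_cdf(k, n, p) equals target
    (the CDF decreases as p increases)."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(successes, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    lower = 0.0 if successes == 0 else solve_p(successes - 1, n, 1 - alpha / 2)
    upper = 1.0 if successes == n else solve_p(successes, n, alpha / 2)
    return lower, upper

print(f"{100 * specificity(121, 3157):.1f}%")  # mock panels positive at P < 0.1
print(f"{100 * specificity(22, 3157):.1f}%")   # mock panels positive at P < 0.01
```

Calling `clopper_pearson(detected_repeats, total_repeats)` on the number of repeats with MRD detected at a given spike-in level would give an interval of the kind drawn as error bars in the limit of detection plots.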

A. Flow diagram demonstrating different cohorts analysed in this manuscript; the top part of the flow diagram shows the total number of plasma samples that were intended to be analysed (n = 1095 from 197 patients), which was reduced to 1069 samples due to single nucleotide polymorphism mismatches between cfDNA and tissue exome data in 26 cases, suggesting sample swap. These samples were analysed in 3 main cohorts, the pilot cohort (left), the preoperative cohort (middle), and the postoperative cohort (right). The postoperative cohort was divided into different categories based on landmark evaluability (relating to samples donated within 120 days of surgery to enable a landmark ctDNA analysis). B. Heatmap demonstrating individual tumour-specific clonal ctDNA fractions in patients with synchronous primaries diagnosed at baseline. The annotation rows of the heatmap show the ctDNA call present in that sample across all variants interrogated by the MRD caller, the highest pathological TNM stage, the individual histology, and individual tumour volumes of the two synchronous tumours present at baseline (for this category, grey represents absent data or volume unevaluable). C. Boxplot demonstrating the difference in pack-year history across 187 preoperative ctDNA positive NSCLC patients and preoperative ctDNA negative NSCLC patients. Hinges correspond to first and third quartiles, whiskers extend to the largest/smallest value no further than 1.5x the interquartile range. Centre lines represent medians. P value is from a Wilcoxon rank sum test. D. Kaplan-Meier curves demonstrating freedom from recurrence outcomes in ctDNA high (dark red), ctDNA low (blue), and ctDNA negative (grey) single primary adenocarcinoma patients (left) and single primary non-adenocarcinoma patients (right). ctDNA high and low were categorized based on median clonal ctDNA levels across ctDNA positive cases and relate to above and below 0.16%. Log-rank P values are displayed on each plot. E. Multivariable Cox regression analyses of Overall Survival (OS) and Freedom From Recurrence (FFR, defined as recurrence only) in patients with single (non-synchronous) NSCLC; evaluating ctDNA detection status, pTNM stage (Tumour Node Metastasis pathological stage version 7, categories I, II or III), whether adjuvant therapy was administered, age, and log10-transformed unique sequencing depth as predictors in adenocarcinomas and non-adenocarcinomas separately. Unique sequencing depth was included to adjust for under-sequenced samples, representing potential false negatives. n = 88 adenocarcinoma patients and n = 81 non-adenocarcinoma patients were analysed for FFR and OS. On the forest plots, the diamond represents the multivariable Hazard Ratio (HR) with error bars corresponding to 95% confidence intervals (CI). Multivariable P values (p) are displayed on the plot alongside the number of patients in each category (N). Reference categories were ctDNA positive patients, pTNM stage I patients and patients given adjuvant therapy. The exact Cox regression P value for the Outcome: ctDNA -ve category in the FFR adenocarcinoma plot = 0.00022. F. Heatmap showing the site of relapse in recurrent adenocarcinoma cases divided by whether preoperative ctDNA was detected (dark red, right) or undetected (grey, left). Intrathoracic (mediastinum, locoregional, ipsilateral lung, distant lung – green colours) or extrathoracic (bone, brain, liver, adrenal, extrathoracic lymph nodes or other extrathoracic site – red colours) sites of relapse are shown (sites shown are metastatic sites diagnosed within 180 days of clinical relapse). Heatmap is annotated by Tumour Node Metastasis pathological version 7 stage. G. Kaplan-Meier curve demonstrating post-relapse survival in recurrent adenocarcinoma patients (n = 38) stratified by preoperative ctDNA positive (red) or preoperative ctDNA negative (grey). Log-rank P value is displayed on the plot.
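
For readers unfamiliar with how curves such as those in panels D and G are built, the sketch below implements the standard Kaplan-Meier product-limit estimator. It is a generic illustration with made-up follow-up times, not study data or the authors' analysis code; the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the event-free probability.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. recurrence) was observed, 0 if censored
    Returns a list of (time, probability) steps at each observed event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    prob = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            prob *= 1 - n_events / n_at_risk
            curve.append((t, prob))
        # Patients with an event or censored at t leave the risk set afterwards.
        n_at_risk -= n_leaving
    return curve

# Hypothetical follow-up (days) for four patients; the second is censored.
print(kaplan_meier([100, 200, 300, 400], [1, 0, 1, 1]))
```

Censored patients contribute to the number at risk up to their last follow-up but never cause a step down, which is why the estimator suits cohorts with unequal follow-up.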

A. Flow chart demonstrating patients available for volumetric analyses and reasons for exclusion. B. Histogram showing the number of NSCLC cases by volume, with ctDNA positive samples shown as red bars, and ctDNA negative samples shown as grey bars. n = 150 volume evaluable cases. C. Volume versus log10-transformed clonal ctDNA level correlation plot with each individual TRACERx case that was ctDNA positive as a point and coloured by adenocarcinoma status (dark red) and squamous or other histology (dark blue). Fitted line represents a linear model line categorized by tumour histology. Below the correlation plot is a table describing a linear multivariable model based on these data to predict log10-transformed clonal ctDNA levels based on tumour volume and histology (adenocarcinoma and squamous and other categories). P values represent linear model adjusted P values, n = 96 ctDNA positive, volume evaluable NSCLCs analysed. D. Based on a multivariable linear regression model fitted to the data in (C), we categorized ctDNA negative adenocarcinomas as biological low-shedders or technical non-shedders (see methods). If a particular tumour volume resulted in an estimated clonal mutation ctDNA level above the clonal ctDNA level a library could detect (95% lower confidence interval for estimated clonal ctDNA level based on tumour volume is above detectable clonal ctDNA level in the preoperative cfDNA library from that patient), then the case was classed as a probable biological low-shedder (red on histogram); otherwise, the case was classed as a probable technical non-shedder (turquoise on histogram). Y axis represents the lower 95% confidence estimate for clonal mutation ctDNA level divided by the minimally detectable clonal mutation ctDNA level (MDCL) for that patient's panel. The X axis is each individual patient analysed. Data from n = 47 ctDNA negative adenocarcinomas presented. E. Violin box-plots comparing tumour purity in ctDNA low-shedder adenocarcinomas (blue, n = 79 tumour regions from 28 patients) and ctDNA positive adenocarcinomas (red, n = 166 tumour and lymph node regions from 35 patients). Pairwise comparisons are performed using linear mixed-effects models, P values are two-sided. Boxplot hinges correspond to first and third quartiles, whiskers extend to the largest/smallest value no further than 1.5x the interquartile range and centre lines represent medians. Violins represent the distribution of the underlying data. F. Barplots showing gene-level driver alterations between ctDNA positive adenocarcinomas (n = 39 patients) and ctDNA negative low-shedder adenocarcinomas (n = 31 patients). Colours denote ctDNA detection status. Y axis shows the top 14 most frequently altered genes, X axis shows the percentage of patients carrying an alteration in the gene per detection category. NS: Not significant (two-sided Fisher's exact test with FDR P value adjustment). G. Pathway-level driver mutations between ctDNA positive adenocarcinomas (n = 39 patients) and ctDNA negative low-shedder adenocarcinomas (n = 31 patients). X axis shows patient IDs, Y axis shows pathways following the Sanchez-Vega definition. Top bar denotes ctDNA detection status (dark red represents ctDNA positives, blue represents biological low-shedders). Heatmap colours display mutations; blue denotes clonal mutations and red denotes subclonal mutations. No pathway showed significant enrichment in either ctDNA shedder or non-shedder adenocarcinomas (NS: Not significant, using two-sided Fisher's exact test with FDR P value adjustment). H. Whole genome doubling status per tumour comparing ctDNA positive adenocarcinomas to ctDNA negative low-shedder adenocarcinomas, using two-tailed Fisher's exact test. Yellow represents the number of tumours subjected to whole genome doubling in at least one region, turquoise represents tumours without any whole genome doublings. I. Volume by ctDNA shedding status. Biological non-shedders in red represent the smallest quartile samples. After removal of these from the analysis, no significant difference in tumour volume was found between ctDNA positives and ctDNA low-shedders. Pairwise comparisons are made with two-sided Wilcoxon rank sum tests. J. Venn diagram showing the overlap between significantly differentially expressed genes between ctDNA positive and ctDNA low-shedder adenocarcinomas obtained from the full dataset, relative to the volume-adjusted dataset. Comparisons are made by computing the Jaccard similarity index and the corresponding two-sided P value using the exact method. K. Venn diagram showing the overlap between significantly altered cytobands as called by GISTIC, comparing ctDNA positive to ctDNA low-shedder adenocarcinomas obtained from the full dataset, relative to the volume-adjusted dataset. Statistical testing follows (J).
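
The decision rule described for panel D can be written down compactly. The sketch below assumes that the lower 95% confidence estimate of the clonal ctDNA level (from the volume and histology model) and the minimally detectable clonal ctDNA level (MDCL) have already been computed for a patient; the function name and the example values are hypothetical and are not taken from the study.

```python
def classify_ctdna_negative(lower_ci_predicted_level, mdcl):
    """Classify a ctDNA negative adenocarcinoma following the panel D rule.

    lower_ci_predicted_level : lower 95% confidence estimate of the clonal
                               ctDNA level predicted from tumour volume
    mdcl                     : minimally detectable clonal ctDNA level for
                               that patient's panel (same units)
    Returns (label, ratio), where ratio is the quantity plotted on the Y axis.
    """
    if mdcl <= 0:
        raise ValueError("MDCL must be positive")
    ratio = lower_ci_predicted_level / mdcl
    if ratio > 1:
        # The library was sensitive enough to see the predicted level, so
        # non-detection points to a tumour that sheds less than expected.
        return "probable biological low-shedder", ratio
    # The predicted level sits at or below what the library could detect.
    return "probable technical non-shedder", ratio

# Hypothetical values, expressed as clonal ctDNA fractions.
print(classify_ctdna_negative(0.0004, 0.0001))
print(classify_ctdna_negative(0.00005, 0.0001))
```

Using the lower confidence bound rather than the point estimate makes the low-shedder call conservative: a tumour is only labelled a biological low-shedder when even a pessimistic prediction should have been detectable.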

A. Table demonstrating details of unexpected ctDNA positive results in patients who did not have disease recurrence. B. CRUK0498 false positive analysis: Dot-plots represent confidently detected variants at illustrated cfDNA sampling timepoints (left panel), variants confidently detected in normal tissue, control DNA, and peripheral-blood mononuclear cell (PBMC, buffy-coat) DNA based on application of CRUK0498's patient specific panel to these respective samples (middle panel) and the mutant allele frequencies of selected variants in tumour tissue exome data (right panel). The four variants in the legend (variants in genes ATP2C1, DDIT4L, EYS, and TUSC3) represent variants confidently called at 50% or more of the timepoints across the cfDNA samples (note that confidently called means an individual variant Poisson one-sided P value of <0.01 [generated by MRD caller, see methods]). C. A haematoxylin and eosin image from patient CRUK0498's tumour where exome analysis detected the variants in genes ATP2C1, DDIT4L, EYS and TUSC3 at high variant allele-frequencies. This image shows a dense lymphocyte aggregate in this tumour region. Scale bar below image. A single image was analysed. D. A further 19 preoperative PBMC samples were analysed from TRACERx patients; no confident panel-wide variant DNA calls were made in these patients’ PBMC samples using the MRD calling algorithm. E. Variant-level analyses of the preoperative PBMC samples analysed in panel (D) highlighted that 12 of 3621 variants interrogated by the panels were detected (variant level one-sided Poisson P value < 0.01). 8 of 12 detected variants were removed from the MRD caller algorithm in cell-free DNA analyses (cfDNA) due to triggering filters highlighted in the heatmap annotation. Only 2 of the 4 remaining variants carried deep alternate reads in the respective patients’ preoperative cfDNA sample (red arrows). 
The heatmap shows the cfDNA variant allele frequency and the WBC variant allele frequency of the detected variants (grey colour represents no detection of the variant). Two mistargeted germline variants are highlighted by black arrows for patient CRUK0296; these variants were targeted in error by the industry panel design pipeline but not by the TRACERx exome pipeline (methods), and were filtered from the MRD calling algorithm due to triggering the outlier filter (dao imbalance filter, dark red).

A. Analysis of 13 patients who experienced intracranial relapse who were positive for ctDNA in a postoperative blood sample. The X axis shows the clonal ctDNA level at the point of postoperative ctDNA detection and the Y axis shows the day of postoperative ctDNA detection. Points are coloured based on whether the intracranial relapse was solitary (green), accompanied by another extracranial site (red), or unconfirmed solitary (blue, no extracranial imaging performed) and are shaped by landmark ctDNA status. B. Heatmap of clonal mutation ctDNA level data at first postoperative ctDNA detection. The annotation rows show the landmark ctDNA status of the patient (landmark positive, ctDNA detected within 120 days postoperatively; landmark negative, ctDNA negative within 120 days postoperatively; unevaluable, landmark status cannot be established), the day ctDNA was detected postoperatively, the histology of the primary tumour, and lead time (days from ctDNA detection to clinical relapse). Where lead time was not applicable (for example incompletely resected disease, ctDNA detected post-relapse, see methods) lead time is coloured grey. The next two rows (bar charts) demonstrate the number of clonal or subclonal mutations tracked by an AMP patient-specific panel (PSP); if the bar is blue, it represents confident detection of an individual variant (based on an individual variant P value of <0.01 [one sided Poisson test based on MRD caller output, see methods]), if the bar is black, it represents absence of confident calling of a variant, if the bar is red, it represents that a variant was filtered by the MRD calling algorithm. The final row represents the mean clonal ctDNA level at the first ctDNA detection time point for a patient. This is on a log-10 scale as displayed in the heatmap legend. For patient CRUK0296, ctDNA detection occurred but clonal ctDNA levels were 0% (grey bar) as the mutation driving ctDNA detection postoperatively did not have a clonal status. 
C. Longitudinal per-patient plots in 12 patients who were ctDNA positive prior to adjuvant therapy. Plots are annotated with lead time (L-t), scans performed, and treatment administered (see legend). The Y axis represents clonal ctDNA levels and each circle on the plot represents a blood sampling time point. If the circle is red, it indicates that the blood sample was positive for ctDNA using the MRD caller. The X axis displays days post-surgery. D-E. Kaplan-Meier curves in the landmark evaluable population (patients who donated blood within 120 days post-surgery before treatment or clinical recurrence, n = 102/108 landmark evaluable patients were evaluable for survival analysis, see methods for exclusions) showing overall survival (OS, D) or freedom from recurrence (FFR, E) outcomes for landmark positive (dark red) versus landmark negative (grey) patients. Log-rank P values displayed on curves. F. Boxplots showing the distribution of lead times (times from ctDNA detection to clinical recurrence) categorized by patient landmark ctDNA status. Hinges correspond to first and third quartiles, whiskers extend to the largest/smallest value no further than 1.5x the interquartile range. Centre lines represent medians. Kruskal-Wallis test P = 0.0057, unadjusted pairwise Wilcoxon tests compare individual categories, n = 63 patients analysed. G. Pie charts demonstrate the number of occurrences of specified ctDNA detection statuses (red – ctDNA negative, green – ctDNA positive, blue – no ctDNA status established), preceding a scan showing no new changes (left) or new equivocal extracranial changes (middle). The ctDNA positive and negative categories are then broken down further into a patient-level analysis showing the outcomes of patients who experienced the occurrence of the specified imaging and ctDNA status event(s). H. 
Barchart showing the count of specific equivocal anatomical sites noted on scans showing new equivocal changes; equivocal lung lesions and lymph nodes were the most common abnormal equivocal findings on NSCLC surveillance imaging. Multiple equivocal sites can be observed on one scan. I. Barplot of eventual site of relapse and ctDNA status in 33 patients with ctDNA status established prior to surveillance imaging, showing new equivocal lymph node enlargement. The X axis shows the patient ctDNA detection status preceding surveillance scans. The Y axis shows the patient count. Patient CRUK0090 exhibited occurrences of both negative and positive ctDNA statuses prior to separate equivocal lymphadenopathy scans, so is present in both ctDNA positive and negative categories. Other patients are only included once. Patient CRUK0234 was diagnosed with an unresected lymph node, was ctDNA negative postoperatively and included in the analysis. The barcharts are filled with recurrence status of patients in these categories. Recurred with LN refers to lymph node involvement at relapse (dark red colour). Recurred with no LN refers to recurrence with no lymph node involvement (green colour).

A. A conceptual overview of the ECLIPSE method and data input types. CCF, cancer cell fraction; VAF, variant allele fraction. The schematic was created using BioRender. B. Equation to calculate tumour purity (the % of cells from which the DNA was derived which are tumour cells, see Supplementary Note 1, also termed ‘cellularity’ or ‘aberrant cell fraction’) using clonal mutations. C. Equation to calculate cancer cell fraction (CCF). Multiplicity = the number of mutated DNA copies in each mutated cell, CNt = total copy number in the tumour, CNn = total copy number in normal (non-tumour) cells, VAF = variant allele fraction, P = tumour purity (the % of cells from which the DNA was derived which are tumour cells, see Supplementary Note 1). D. Percentage change in mean multiplicity of clonal mutations comparing measurements in surgically excised tissue samples to tissue samples taken at relapse (46 patients with paired primary and recurrence tissue samples plotted). E. A comparison between mean clonal VAF of mutations and ctDNA tumour purity as calculated by ECLIPSE where data points (plasma samples) are coloured by the average copy number of tracked clonal mutations (measured using tissue sequencing). Multi-tumour patients and samples with evidence of copy number instability at relapse are excluded. A total of 322 samples from 134 patients are plotted.
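The equations in panels B and C are images and do not survive in this text, so the sketch below does not reproduce them. It uses the widely used relationship between VAF, purity, multiplicity and copy number, written with the variable names defined in the legend; whether ECLIPSE uses exactly this form is an assumption. The function names and the default CNn = 2 are illustrative.

```python
def tumour_purity(vaf, multiplicity, cn_t, cn_n=2):
    """Purity P from a clonal mutation (CCF assumed to be 1).

    Assumed relationship: VAF = P * m / (P * CNt + (1 - P) * CNn),
    solved for P. Not confirmed to be the exact ECLIPSE equation.
    """
    denom = multiplicity - vaf * (cn_t - cn_n)
    if denom <= 0:
        raise ValueError("VAF too high for the given multiplicity and copy number")
    return vaf * cn_n / denom


def cancer_cell_fraction(vaf, purity, multiplicity, cn_t, cn_n=2):
    """CCF from VAF under the same assumed relationship:
    CCF = VAF * (P * CNt + (1 - P) * CNn) / (P * m).
    """
    if purity <= 0 or multiplicity <= 0:
        raise ValueError("purity and multiplicity must be positive")
    return vaf * (purity * cn_t + (1 - purity) * cn_n) / (purity * multiplicity)
```

As a worked example under these assumptions: a clonal mutation at VAF 0.005 in a diploid region (multiplicity 1, CNt = 2) implies a purity of 0.01, and a subclonal mutation at VAF 0.0025 in the same sample then has a CCF of about 0.5.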

A. Minimally detectable CCF for each ctDNA positive sample compared to clonal ctDNA levels for each sample. All ctDNA positive samples included (N = 354). Minimally detectable CCF was calculated using the minimum number of required reads for a positive (P < 0.01) clone detection call (methods). B. Minimally detectable CCF over time for each patient with a horizontal line indicating the threshold for high subclone sensitivity samples (20% CCF). All ctDNA positive samples included (N = 354). 61% of preoperative MRD positive samples and 66% of postoperative samples were considered high subclone sensitivity (overall 64% of samples). C. A histogram of clonal ctDNA levels for all ctDNA positive samples (N = 354) with vertical lines indicating thresholds for ECLIPSE evaluability and for traditional clonal deconvolution evaluability used for TRACERx tissue samples28 and previous clonal deconvolution approaches in ctDNA14,77. D. A histogram of maximum clonal ctDNA levels observed in postoperative samples for each patient with vertical lines indicating thresholds for ECLIPSE evaluability and for traditional clonal deconvolution evaluability (see C). This is shown for 66 patients who relapsed with ctDNA positive postoperative plasma. E. Validation of ECLIPSE detection rates across varying subclonal mutation number, clonal ctDNA level, subclone cancer cell fraction and DNA input amount into the assay. Subclones were constructed using ground truth in vitro spike-in experiments with 10-12 technical replicates for each input mass-allele fraction combination. These ground truth mutant allele fractions were then mixed in silico to construct 76,263 subclones varying across these parameters. Data from these experimentally derived subclones were then run through ECLIPSE and subclone detection rates across each of these parameters depicted.
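The idea behind panel A, a minimum read count needed for a P < 0.01 call translated into a minimally detectable CCF, can be sketched as follows. This is an illustration under stated assumptions, not the procedure from the paper's methods: the background is modelled as a single Poisson rate (expected_background_reads, hypothetical), depth is the total read depth over the subclone's tracked positions (hypothetical), and the conversion to CCF simply divides by the clonal ctDNA level, ignoring copy number.

```python
from math import exp


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed from the cumulative sum."""
    if k <= 0:
        return 1.0
    term = exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i   # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def min_reads_for_call(expected_background_reads, alpha=0.01):
    """Smallest mutant read count whose one-sided Poisson P value is below alpha."""
    k = 1
    while poisson_sf(k, expected_background_reads) >= alpha:
        k += 1
    return k


def min_detectable_ccf(expected_background_reads, depth, clonal_level, alpha=0.01):
    """Approximate minimally detectable CCF (capped at 1.0).

    Assumption: CCF is approximated as subclone VAF / clonal VAF,
    which ignores copy number and multiplicity.
    """
    min_vaf = min_reads_for_call(expected_background_reads, alpha) / depth
    return min(1.0, min_vaf / clonal_level)
```

Under these assumptions, an expected background of 0.5 reads requires 4 mutant reads for a call; at a total depth of 40,000 and a clonal ctDNA level of 0.1%, that corresponds to a minimally detectable CCF of about 10%, which would fall below the 20% CCF threshold drawn in panel B.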

A. Correlation between cancer cell fractions (CCFs) as measured in preoperative plasma samples with phylogenetic data, >0.1% clonal ctDNA level & >=10 ng DNA input (high subclone sensitivity samples) with ECLIPSE and those measured with multi-region tissue sequencing (M-seq) at surgery (N = 71 patients and 684 subclones included). B. Copy number unaware CCFs calculated only using VAFs (methods) compared to tissue CCF from M-seq. All preoperative samples with phylogenetic data, >0.1% clonal ctDNA level & >=10 ng DNA input (high subclone sensitivity samples) were included (N = 71 patients and 684 subclones included). C. A scatter plot demonstrating the relationship between clonal ctDNA level and the proportion of multi-region tumour exome (M-seq) defined subclones detected by ECLIPSE based on varying subclonal cancer cell fractions as indicated, loess lines are fitted to the plots, n = 117 ctDNA positive preoperative samples. D. A comparison of preoperative plasma CCFs and the average CCFs across all tissue regions sampled at surgery for clones that were unique to one tumour tissue region and for clones that were distributed across more than two tumour tissue regions. N = 71 patients and 684 subclones included. A Wilcoxon-test was used to compare groups. E. A comparison of preoperative plasma CCFs and the average CCFs across all tissue regions sampled at surgery for clones that were unique to one tumour tissue region separated between small (<20 cm3), medium (>20 cm3 & <100 cm3), and large (>100 cm3) tumours as measured on preoperative PET/CT scans. N = 71 patients and 684 subclones included. A Wilcoxon-test was used to compare groups. F. A comparison of detection rates in preoperative plasma for 20% CCF subclones across a range of clonal ctDNA levels split by whether the subclones were spread across multiple primary tumour tissue regions or were limited to only a single primary tumour tissue region. 1924 subclones were assessed in 197 preoperative plasma samples. G. 
A map of tumour clones with areas of multi-regional tissue sampling indicated and clones which are over- and undersampled highlighted. Most of the undersampled clones are in fact not in the sampled areas, creating a bias towards oversampling in the clones which we are able to detect, an effect also called the ‘winner's curse’. H. A ROC curve describing the sensitivity and specificity of detecting clonal illusion mutations using plasma-based CCFs with 95% confidence intervals generated using bootstrapping across 500-fold cross-validation (N = 71 tumours).

A. An overview of clonal structure evaluability at relapse for TRACERx patients in our cohort (N = 75 tumours) using either cell-free DNA and ECLIPSE or relapse tissue and WES/PyClone. B. ctDNA detection status post-operatively of subclones split by detection status in metastatic tissue. Untracked subclones (those without any mutations included in the PSP panels) were excluded (N = 26 tumours). P value indicates the result from Fisher's exact test. C. Clonal (estimated as present in 100% of tumour cells) vs subclonal (estimated as present in <100% of cells) status at relapse of primary tumour subclones by whether they were detected in cfDNA and metastatic tissue or cfDNA alone (N = 26 tumours). P value indicates the result from a Fisher's exact test. D. Metastatic dissemination class determined by tissue and by cfDNA in 22 cases with a metastatic biopsy, a postoperative high subclone sensitivity plasma sample, and a phylogenetic tree constructed. E. Overall survival Kaplan-Meier plot demonstrating time from the first MRD positive timepoint to death stratified by ECLIPSE metastatic dissemination class at relapse (monoclonal: light blue, polyclonal polyphyletic: purple, and polyclonal monophyletic: green). HR: Hazard ratio, CI: confidence interval. 44 patients were included in this analysis. The P value indicates the result of a log-rank test. F. A multivariable Cox proportional hazards model to predict overall survival from the time of first MRD detection including the clonality of metastatic dissemination at relapse, stage, maximum postoperative clonal ctDNA level, average DNA assay input, histology, and whether the first plasma sample after surgery was ctDNA positive, including only relapse patients. 44 patients were included in this analysis. Error bars indicate 95% confidence intervals. G. 
The frequency of high confidence subclonal to clonal bottlenecks (methods) at the latest possible plasma sample time point with sufficient clonal ctDNA level (high sensitivity subclone samples, N = 44 tumours) and which of these subclones harbour subclonal neoantigens (NAGs) which therefore become clonal at relapse. H. In cases of clonal bottlenecking at relapse, the percentage increase in the number of clonal mutations is shown as a box and whisker plot with the absolute number of new clonal mutations (N = 18 tumours). I. In cases of clonal bottlenecking at relapse, the percentage increase in the number of clonal NAGs is shown as a box and whisker plot with the absolute number of new clonal NAGs (N = 18 tumours). NAG = Neoantigen.

Longitudinal subclonal analyses across all relapsing patients with available phylogenetic trees and at least one postoperative time point with high subclone sensitivity (n = 44 patients).

Supplementary Tables 1–21.

Legends for Supplementary Tables 1-21.

Abbosh, C., Frankell, A.M., Harrison, T. et al. Tracking early lung cancer metastatic dissemination in TRACERx using ctDNA. Nature 616, 553–562 (2023). https://doi.org/10.1038/s41586-023-05776-4

Received: 06 April 2022

Accepted: 30 January 2023

Published: 13 April 2023

Issue Date: 20 April 2023

DOI: https://doi.org/10.1038/s41586-023-05776-4
