Webinars

MAQC Society 2022 Spring Webinar Series

  • Webinar replays will shortly be made available to MAQC Society members
  • Webinars focused on current NGS scientific articles from SEQC2
  • Webinars ran from March 22 – May 3, 2022 (each Tuesday at 11am ET except for April 26)
March 22 2022, 11AM eastern: X-CNV: Genome-wide prediction of the pathogenicity of copy number variations

Biography: Dr. Zhichao Liu is the technical leader for the Artificial Intelligence Research Force (AIRForce) at the Division of Bioinformatics & Biostatistics, FDA/NCTR. Dr. Liu’s background spans the fields of chemistry, biology, and computer science. Over the past decade he has led many cutting-edge projects, designing, implementing, and deploying AI/machine learning solutions for advanced regulatory sciences. Specifically, Dr. Liu has applied AI/machine learning solutions to improve the pathogenicity prediction of complex genetic variants, promote predictive toxicology, and facilitate precision medicine applications. His accomplishments are reflected in 5 FDA-wide awards, 9 NCTR-level awards, 2 scientific community-level awards, and more than 80 peer-reviewed publications.

Abstract: Gene copy number variations (CNVs) contribute to genetic diversity and disease prevalence across populations. Substantial efforts have been made to decipher the relationship between CNVs and pathogenesis but with limited success. We have developed a novel computational framework, X-CNV (www.unimd.org/XCNV), to predict the pathogenicity of CNVs by integrating more than 30 informative features such as allele frequency (AF), CNV length, CNV type, and some deleterious scores. Notably, over 14 million CNVs across various ethnic groups, covering nearly 93% of the human genome, were unified to calculate the AF. X-CNV, which yielded area under curve (AUC) values of 0.96 and 0.94 in training and validation sets, was demonstrated to outperform other available tools in terms of CNV pathogenicity prediction. A meta-voting prediction (MVP) score, based on the probabilistic value generated by the XGBoost algorithm, was developed to quantitatively measure the pathogenic effect. The proposed MVP score demonstrated a high discriminative power in determining pathogenic CNVs for inherited traits/diseases in different ethnic groups. The ability of the X-CNV framework to quantitatively prioritize functional, deleterious, and disease-causing CNVs on a genome-wide basis outperformed current CNV-annotation tools and will have broad utility in population genetics, disease-association studies, and diagnostic screening.
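For readers unfamiliar with the AUC metric quoted in the abstract, the following is a minimal sketch of how an AUC can be computed from classifier scores using the rank-based (Mann-Whitney) formulation. The function name, labels, and scores are hypothetical toy values chosen for illustration; they are not X-CNV output and this is not the X-CNV implementation.

```python
def auc_from_scores(labels, scores):
    """Return the AUC: the probability that a randomly chosen positive
    (label 1) receives a higher score than a randomly chosen negative
    (label 0). Ties count as half a win."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (assumed): 1 = pathogenic CNV, 0 = benign CNV.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.92, 0.80, 0.35, 0.40, 0.10, 0.05]
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 1.0 means every pathogenic example outscores every benign one, while 0.5 corresponds to random ranking.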

Dr. Zhichao Liu, Ph. D.

Artificial Intelligence Research Force (AIRForce) Technical Leader

Dr. James C Willey, Ph.D.

George Isaac Chair for Cancer Research

April 5 2022, 11am eastern: Advancing quality-control for NGS measurement of actionable mutations in circulating tumor DNA

Biography: I am currently Professor of Medicine and Pathology, and George Isaac Chair for Cancer Research, with more than 25 years of continuous funding from the National Institutes of Health in translational research. We develop methods to optimize quality control in next generation sequencing analysis of germline and somatic variants in clinical specimens, including circulating tumor DNA. Since 2005, my laboratory has participated in United States Food and Drug Administration (FDA) MAQC collaborative projects to improve quality control in genetic testing. Through these FDA consortia, as of 2021 I have co-authored twelve articles reporting results from these collaborative projects, including nine in Nature Biotechnology, three in Genome Biology, and one in Cell Reports Methods.

Abstract: The primary objective of the FDA-led Sequencing and Quality Control Phase 2 (SEQC2) project is to develop standard analysis protocols and quality control metrics for use in DNA testing to enhance scientific research and precision medicine. This study reports a targeted next generation sequencing (NGS) method that enables more accurate detection of actionable mutations in circulating tumor DNA (ctDNA) clinical specimens. This advancement was enabled by designing a synthetic internal standard spike-in for each actionable mutation target, suitable for use in NGS following hybrid-capture enrichment and unique molecular index (UMI) or non-UMI library preparation. When mixed with contrived ctDNA reference samples, internal standards enabled calculation of technical error rate, limit of blank, and limit of detection for each variant at each nucleotide position, in each sample.  True positive mutations with variant allele fraction too low for detection by current practice were detected with this method, thereby increasing sensitivity.

April 12 2022, 11am eastern: Deep oncopanel sequencing reveals within block position-dependent quality degradation in FFPE processed samples

Biography: Dr. Xu is the Branch Chief for Research-to-Review (R2R) at the Division of Bioinformatics and Biostatistics of FDA’s National Center for Toxicological Research (NCTR). He specializes in genomics, big data, image analysis, and machine learning. His recent endeavor has been with the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project to evaluate the technical reliabilities and scientific applications of the next generation sequencing (NGS) technologies. He leads the Oncopanel Sequencing Working Group to assess the reproducibility and detection sensitivity of onco-panel sequencing including liquid biopsy. He is also the executive secretary for MAQC Society.

Abstract: Clinical laboratories routinely use formalin-fixed paraffin-embedded (FFPE) tissue or cell block cytology samples in oncology panel sequencing to identify mutations that can predict patient response to targeted therapy. To understand the technical error due to FFPE processing, a robustly characterized diploid cell line was used to create FFPE samples with four different pre-tissue processing formalin fixation times. A total of 96 FFPE sections were distributed to different laboratories for targeted sequencing analysis by four oncopanels, and variants resulting from technical error were identified. Tissue sections that failed more frequently showed low cellularity, lower than recommended library preparation DNA input, or target sequencing depth. Importantly, sections from block surfaces were more likely to show FFPE-specific errors, akin to “edge effects” seen in histology, while the inner samples displayed no quality degradation related to fixation time up to 24 hours.  To assure reliable results, we recommend avoiding the block surface portion and restricting mutation detection to genomic regions of high confidence.

Dr. Joshua Xu, Ph.D.

Branch Chief for Research-to-Review — Division of Bioinformatics and Biostatistics, FDA-NCTR & Executive Secretary, MAQC Society

Dr. Timothy R. Mercer, Ph.D.

Group Leader, Institute of Bioengineering and Nanotechnology, University of Queensland, Brisbane, QLD, Garvan Institute of Medical Research, Sydney, NSW, Australia.

April 19 2022, 11am eastern: Resolving the complexity of the human genome using synthetic chromosomal controls & Building a synthetic genome that encodes DNA, mRNA and protein controls

Biography: Timothy Mercer is a Group Leader at the Australian Institute for Bioengineering and Nanotechnology (AIBN) at The University of Queensland, where he is the Scientific Director of the BASE nucleic-acid synthesis facility. He also leads a diverse laboratory and bioinformatic research group investigating the expression and splicing of synthetic genes. Prior to this, he was Group Leader at the Garvan Institute of Medical Research, Sydney, where he pioneered the use of synthetic RNA and DNA controls to improve the accuracy of clinical genome sequencing. He also developed targeted RNA sequencing approaches for the diagnosis of fusion genes in cancer. Together, this reflects his ongoing interest in the development of genome biotechnologies. Before joining the Garvan, Tim Mercer received his PhD in Genomics from UQ and completed postdoctoral studies in transcriptomics, long-noncoding RNAs and splicing at the Broad Institute, United States, the Centre for Gene Regulation, Spain, and the Max Planck Institute for Cell Biology, Germany.

Abstract: 

Resolving the complexity of the human genome using synthetic chromosomal controls: Next-generation sequencing (NGS) can identify mutations in the human genome that cause disease and has been widely adopted in clinical diagnosis. However, the human genome contains many polymorphic, low complexity, and repetitive regions that are difficult to sequence and analyse. Despite their difficulty, these regions include many clinically-important sequences that can inform the treatment of human diseases and improve the diagnostic yield of NGS. To evaluate the accuracy by which these difficult regions are analysed with NGS, we built an in silico decoy chromosome, along with corresponding synthetic DNA reference controls, that encode difficult and clinically-important human genome regions, including repeats, microsatellites, HLA genes and immune-receptors. These synthetic chromosome controls provide a known ground-truth reference against which to measure the performance of diverse sequencing technologies, reagents, and bioinformatic tools. Using this approach, we provide a comprehensive evaluation of short- and long-read sequencing instruments, library preparation methods, and software tools, and identify the errors and systematic bias that confound our resolution of these remaining difficult regions. This study provides an analytical validation of diagnosis using NGS in difficult regions of the human genome and highlights the challenges that remain to resolve these difficult regions.

Building a synthetic genome that encodes DNA, mRNA and protein controls: PhiX-174 was the first genome to be sequenced, in 1977, and has become the most commonly used standard in sequencing, molecular and synthetic biology. However, given the advent of affordable DNA synthesis and de novo gene design, we considered whether we could build a new genome, termed SynX, that is optimized for use as a molecular standard. The SynX genome encodes synthetic genes that are organised into paralogous gene families and provide qualitative and quantitative evaluation of next-generation sequencing performance. The synthetic genes can be in vitro transcribed to form matched synthetic mRNA controls to evaluate RNA sequencing performance. Finally, the synthetic mRNA controls can be in vitro translated to form matched protein controls for high throughput proteomics methods, such as mass spectrometry. The SynX genome can be independently and sustainably prepared, modified and shared by recipient laboratories using common molecular biology techniques, and be widely used as a universal molecular standard.

May 3 2022, 11AM eastern: Hidden biases in germline structural variant detection

Biography: Dr. Fritz Sedlazeck completed his PhD in 2012 in the group of Dr. Arndt von Haeseler at the Max F. Perutz Laboratory in Vienna. After a two-year postdoc, he transitioned to the lab of Dr. Michael Schatz at Cold Spring Harbor Laboratory and later to Johns Hopkins University. Since 2017 he has led his own group at the Human Genome Sequencing Center at Baylor College of Medicine. Dr. Sedlazeck's group focuses on the mechanisms of SV formation across multiple species and on improving our understanding of how these complex alleles evolve and impact phenotypes.

Abstract:  Genomic structural variations (SV) are important determinants of genotypic and phenotypic changes in many organisms. However, the detection of SV from next-generation sequencing data remains challenging. In this study, DNA from a Chinese family quartet is sequenced at three different sequencing centers in triplicate. A total of 288 derivative data sets are generated utilizing different analysis pipelines and compared to identify sources of analytical variability. Mapping methods provide the major contribution to variability, followed by sequencing centers and replicates. Interestingly, SV supported by only one center or replicate often represent true positives with 47.02% and 45.44% overlapping the long-read SV call set, respectively. This is consistent with an overall higher false negative rate for SV calling in centers and replicates compared to mappers (15.72%). Finally, we observe that the SV calling variability also persists in a genotyping approach, indicating the impact of the underlying sequencing and preparation approaches. This study provides the first detailed insights into the sources of variability in SV identification from next-generation sequencing and highlights remaining challenges in SV calling for large cohorts. We further give recommendations on how to reduce SV calling variability and the choice of alignment methodology.

 

Dr. Fritz Sedlazeck, PhD

Associate Professor, Human Genome Sequencing Center, Baylor College of Medicine

MAQC Society – 2021 SEQC2 Webinar Series

The MAQC consortium is an FDA-led, community-wide effort to address issues relating to the application of constantly evolving high-throughput genomics technologies, either to assess the safety and efficacy of FDA-regulated products or to support their safe and effective use in clinical applications as in vitro diagnostic devices. The consortium completed three projects between 2005 and 2014 (namely MAQC I, II and III), resulting in ~30 publications. The fourth phase of efforts is captured under the Sequencing Quality Control (SEQC2) initiative, which will be covered in this webinar series. Previous webinars will be available to all MAQC Society members.

Previous Webinars and Video Recordings

Tuesday, February 16th, 2021

A MAQC/SEQC Journey Towards Reproducible Genomics and the MAQC Society

Abstract

A history of the MAQC/SEQC consortiums is presented leading to standardization of practices in genomic science and a series of highly cited research papers. The MAQC Society was born from these efforts, uniting scientists with a commitment to reproducible research and quality practices in an era of massive data, collected to understand cellular and molecular biology and advance human health. We will also provide an overview of the SEQC2 Webinar series and the MAQC Society’s annual meeting which will be held virtually this year in a Joint Meeting with MCBIOS April 28-30, 2021.
Weida Tong, PhD

President, MCBIOS, and Director, Division of Bioinformatics and Biostatistics, FDA-NCTR

Wendell Jones, PhD

Executive Chair and President, MAQC Society, and Principal Bioinformaticist and Scientific Advisor, Q2 Solutions / EA Genomics

Dr. Wenming Xiao

Senior Scientific Reviewer, Division of Molecular Genetics and Pathology, FDA-OIR

Tuesday, February 23, 2021

Towards Best Practice in Cancer Mutation Detection with Whole-genome and Whole-exome Sequencing

Abstract

Clinical applications of precision oncology require accurate tests that can distinguish cancer-specific mutations from errors introduced at each step of next generation sequencing (NGS). For NGS to successfully improve patient lives, discriminating between true mutations and artifacts is crucial. To date, no study has addressed the effects of cross site reproducibility together with the potentially influential interactions between biological, technical, and computational factors on the accurate identification of variants.

Here we systematically interrogated somatic mutations in paired tumor-normal cell lines to identify factors affecting detection reproducibility and accuracy. Different types of samples with varying input amount and tumor purity were processed using multiple library construction protocols. Whole-genome (WGS) and whole-exome sequencing (WES) were carried out at six sequencing centers followed by processing with nine bioinformatics pipelines to evaluate reproducibility. We identified artifacts of C>A mutations in WES due to sample and library processing and highlighted limitations of bioinformatics tools for artifact detection and removal.

Biography

Dr. Xiao had advanced training in biology and computer science in China and the United States. He has numerous publications in peer-reviewed journals such as Nature, PNAS, and N. Engl. J. Med, and received an NIH Director Award in 2010 in recognition of his contributions to cancer biomarker discovery. Dr. Xiao was a principal investigator at FDA and led an international working group to establish reference materials, data sets, analysis pipelines, and quality metrics for cancer mutation detection with NGS technology. Currently, Dr. Xiao is a lead reviewer for NGS-related diagnostic products/applications (510k, IDE or PMA), including Onco-Panel, Whole-Exome Panel, NIPT, Gene Expression Signature, and analytical software (bioinformatics pipelines, knowledge databases), and provides recommendations on regulatory decisions regarding the safety and effectiveness of medical devices.

March 2, 2021, 11:00 am - 12:00 Noon ET

Establishing Reference Data and Call Sets for Benchmarking Cancer Mutation Detection using Whole-genome Sequencing

Abstract

We characterized two reference samples for NGS technologies: a human triple-negative breast cancer cell line and a B lymphocyte cell line from the same donor. Leveraging several whole-genome sequencing (WGS) platforms, multiple sequencing replicates, and orthogonal mutation detection bioinformatics pipelines, we minimized the potential biases from sequencing technologies, assays, and informatics. Thus, our “reference call sets” were defined using evidence from 21 replicates of Illumina WGS runs with coverage ranging from 50X to 100X (1300X in total). These call sets present many relevant variants/mutations, including 208 COSMIC mutations and 9,016 germline variants from the ClinVar database, nonsense mutations in BRCA1/2 and missense mutations in TP53 and FGFR1. Independent validation in three orthogonal experiments demonstrated a successful stress test of the call set. We expect these reference samples, high confidence regions, and call sets to facilitate assay development, qualification, validation, process control, and proficiency testing. In addition, our methods can be extended to establish new fully characterized reference samples for the community.

 

Biography

Dr. Li Tai Fang is currently a Staff Scientist at Endpoint Health, Inc. He worked on the SEQC2’s somatic reference project when he was at Roche Sequencing Solutions. Previously at Bina Technologies Inc., he led Bina Team’s participation in the ICGC-TCGA DREAM Somatic Mutation Calling Challenge (#1 and #2 in Stage 5 indel and SNV sub-challenges), and developed software such as SomaticSeq.

Li Tai Fang, PhD

Staff Scientist, Endpoint Health, Inc.

Charles Wang, MD, PhD, MPH

Professor and Director of Center for Genomics, Loma Linda University School of Medicine

Tuesday, March 9, 2021 11:00 am - 12:00 Noon ET

A Multicenter Study Benchmarking Single-cell RNA Sequencing Technologies using Reference Samples

Abstract

Comparing diverse single-cell RNA sequencing (scRNA-seq) datasets generated by different technologies and in different laboratories remains a major challenge. Here we address the need for guidance in choosing algorithms leading to accurate biological interpretations of varied data types acquired with different platforms. Using two well-characterized cellular reference samples (breast cancer cells and B cells), captured either separately or in mixtures, we compared different scRNA-seq platforms and several preprocessing, normalization and batch-effect correction methods at multiple centers. Although preprocessing and normalization contributed to variability in gene detection and cell classification, batch-effect correction was by far the most important factor in correctly classifying the cells. Moreover, scRNA-seq dataset characteristics (for example, sample and cellular heterogeneity and platform used) were critical in determining the optimal bioinformatic method. However, reproducibility across centers and platforms was high when appropriate bioinformatic methods were applied. Our findings offer practical guidance for optimizing platform and software selection when designing an scRNA-seq study.

 

Companion single-cell Scientific Data paper: https://www.nature.com/articles/s41597-021-00809-x

Biography

Charles Wang, MD, PhD, MPH is Director of the Center for Genomics and a tenured full Professor at the Loma Linda University School of Medicine. He has held positions as Clinical Transcriptional Genomics Core Director at Cedars-Sinai Medical Center, Associate Professor of Medicine at the David Geffen School of Medicine at UCLA, and Director of the Functional Genomics Core at City of Hope. Dr. Wang is a well-recognized expert in genomics, with many high-visibility publications in prestigious journals including Nature Biotechnology, Nature Communications and PNAS. He was one of the pioneers of the MAQC and SEQC consortium projects.

Tuesday, March 16, 2021 11:00 am - 12:00 Noon ET

Critical Assessment of Copy Number Variation Calling Using Next Generation Sequencing

Abstract

Accurate detection of Copy Number Variations (CNVs) is crucial in cancer diagnosis and treatment. In this study, using next generation sequencing (NGS), we systematically interrogated somatic CNVs in paired tumor-normal cell lines to identify factors affecting their detection reproducibility and accuracy, such as sequencing depth, amount and type of input DNA for library preparation, and tumor purity. Whole-genome (WGS) and whole-exome sequencing were carried out and processed with different CNV callers to assess their reproducibility. Our evaluations indicate that variations among CNV calls are mainly driven by callers and less by the sequencing sites. We observed tumor purity to have a dominant effect on caller performance, while the effects of other confounding factors vary and are mainly caller-specific. Taking cytogenetics array results as the high-confidence calls, we observed that WGS consensus calling can produce high-confidence CNV calls. We provide actionable recommendations and best practices for CNV detection applications in cancer research.

 

Biography

Mehdi Pirooznia, M.D., Ph.D. is the Director of the Bioinformatics and Computational Biology core at the National Heart Lung and Blood Institute at the NIH (NHLBI/NIH). Dr. Pirooznia supervises and spearheads this effort by providing bioinformatics analysis support for intramural scientists in life sciences, clinical and translational research. In particular, his group specializes in analyses pertaining to next-generation sequencing and biomedical informatics in genomics, transcriptomics, epigenomics and disease biomarkers. Towards this end, Dr. Pirooznia’s team takes an integrative approach, combining site-specific sequence variation with gene expression and proteomics data to investigate molecular mechanisms underlying disease progression and treatment responses.

Dr. Pirooznia is also an Adjunct Associate Professor at the Johns Hopkins University School of Medicine, where he served for 8 years as a faculty member prior to joining the NIH in 2016. There he provided leadership and scientific direction and was responsible for implementing the high-performance computational laboratory and bioinformatics system. Dr. Pirooznia serves as an editor and reviewer for several scientific journals such as Bioinformatics, Nature Scientific Data, BMC Bioinformatics, and Human Genomics. He is also a member of The American Society of Human Genetics (ASHG) and the International Society for Computational Biology (ISCB).

Mehdi Pirooznia, MD, MSc, PhD

Director of Bioinformatics and Computational Biology Core, NHLBI

Wendell Jones, PhD

Principal Bioinformaticist and Scientific Advisor, Q2 Solutions / EA Genomics - Executive Chair and President, MAQC Society

March 23, 2021, 11:00 am - 12:00 Noon ET

A Verified Genomic Reference Sample for Assessing Performance of Cancer Panels Detecting Small Variants of Low Allele Frequency

Abstract

Oncopanel genomic testing, which identifies important somatic variants, is increasingly common in medical practice and especially in clinical trials.  Currently, there is a paucity of reliable genomic reference samples having a suitably large number of preidentified variants for properly assessing oncopanel assay analytical quality and performance. The FDA-led Sequencing and Quality Control Phase 2 (SEQC2) consortium analyzed ten diverse cancer cell lines individually and their pool (termed Sample A) to develop a reference sample with suitably large numbers of coding positions with known (variant) positives and negatives for properly evaluating oncopanel analytical performance. In our reference Sample A, the SEQC2 identified tens of thousands of variants down to 1% allele frequency (AF) with more than 25,000 variants having less than 20% AF with 1653 variants in COSMIC-related genes. This is 5x-100x more than existing commercially available samples. We also identified an unprecedented number of negative positions in coding regions, allowing statistical rigor in assessing limit-of-detection (LOD), sensitivity, and precision. Over 300 loci were randomly selected and independently verified via ddPCR with 100% concordance. Agilent normal reference Sample B can be admixed with Sample A to create new samples with a similar number of known variants at much lower AF than what exists in Sample A natively, including known variants having AF=0.02%, a range suitable for assessing liquid biopsy panels.
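The admixture idea in the abstract can be illustrated with simple arithmetic. The sketch below assumes, for illustration only, that Sample B carries no variant allele at the position of interest and that each sample contributes genome copies in proportion to its mixing fraction; the function names and numbers are hypothetical and are not taken from the SEQC2 protocol.

```python
def admixed_af(af_in_sample_a, fraction_a):
    """Expected allele frequency after mixing Sample A into a normal
    background (assumed variant-free at this position)."""
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must be between 0 and 1")
    return af_in_sample_a * fraction_a


def fraction_a_for_target(af_in_sample_a, target_af):
    """Fraction of Sample A needed in the mix to reach a target AF."""
    if target_af > af_in_sample_a:
        raise ValueError("dilution cannot raise the allele frequency")
    return target_af / af_in_sample_a


# A variant at 1% AF in Sample A, with Sample A making up 2% of the mix.
diluted = admixed_af(0.01, 0.02)
print(f"expected AF after admixture: {diluted:.4%}")

# Mixing fraction needed to bring a 1% AF variant down to 0.02% AF.
needed = fraction_a_for_target(0.01, 0.0002)
print(f"fraction of Sample A needed: {needed:.2%}")
```

Under these assumptions, a variant at 1% AF reaches the 0.02% AF mentioned in the abstract when Sample A makes up about 2% of the admixture.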

Biography

Dr. Wendell Jones is a consulting statistician/bioinformaticist in genomic technologies and systems, serving in leadership roles within FDA-led consortia such as the Sequencing Quality Control (SEQC and SEQC2) consortiums. He is the current President and Executive Chair of the MAQC Society, an organization devoted to reproducible science and research. These FDA consortia efforts resulted (and continue to result) in a series of highly cited papers in Nature Biotechnology over the last sixteen years. At EA Genomics, Wendell was Vice President of Statistics and Bioinformatics, Global Head of Bioinformatics

March 30, 2021, 11:00 am - 12:00 Noon ET

Multi-lab Cross-oncopanel Study Reveals High Sensitivity and Reproducibility Tailored to Targeted Regions and Allele Frequency Ranges

Abstract

Best practices for oncopanel sequencing, a tool in cancer diagnosis and treatment, require comprehensive assessments of reproducibility and detection sensitivity. By employing reference materials characterized by the FDA-led SEQC project phase 2 (SEQC2), we performed a cross-platform multi-lab evaluation of eight Pan-Cancer panels representing a broad spectrum of oncopanel technologies. The study reveals consistently high sensitivity across targeted high confidence coding regions and variant types (SNVs vs small indels or MNVs) for variant allele frequency (VAF) above 5%. Sensitivity was reduced by utilizing VAF thresholds due to inherent variability in VAF measurements. Conversely, enforcing a VAF threshold for reporting had a positive impact on reducing false positive (FP) calls. All panels had low FP rates of approximately 1 FP per Mb or less for VAF greater than 5% in the high confidence coding regions, which led to good reproducibility. Importantly, the FP rate was found to be noticeably and significantly higher outside the high confidence coding regions, resulting in lower reproducibility. Region restriction and VAF thresholds led to low relative technical variability in estimating promising biomarkers such as tumor mutational burden. This study details actionable insights into factors underpinning the sensitivity and reproducibility of oncopanel sequencing.
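The trade-off described in the abstract, where a VAF reporting threshold lowers the false positive rate but also lowers sensitivity, can be sketched with a small calculation. Everything below is an assumption made for illustration: the function name, the toy call list, the number of known variants, and the 2 Mb region size are invented and do not come from the study.

```python
def summarize_calls(calls, n_known_variants, region_bp, vaf_threshold=0.0):
    """Return (sensitivity, false positives per Mb) for variant calls whose
    measured VAF is at or above vaf_threshold.

    calls: list of (measured_vaf, is_true_variant) tuples.
    """
    kept = [(vaf, is_true) for vaf, is_true in calls if vaf >= vaf_threshold]
    true_positives = sum(1 for _, is_true in kept if is_true)
    false_positives = len(kept) - true_positives
    sensitivity = true_positives / n_known_variants
    fp_per_mb = false_positives / (region_bp / 1_000_000)
    return sensitivity, fp_per_mb


# Toy calls (assumed): five known variants whose measured VAFs scatter
# around 5%, plus three false positive calls.
toy_calls = [
    (0.120, True), (0.070, True), (0.055, True), (0.048, True), (0.030, True),
    (0.060, False), (0.020, False), (0.015, False),
]

no_threshold = summarize_calls(toy_calls, n_known_variants=5, region_bp=2_000_000)
with_threshold = summarize_calls(toy_calls, n_known_variants=5,
                                 region_bp=2_000_000, vaf_threshold=0.05)
print("no threshold:   sensitivity=%.2f, FP/Mb=%.2f" % no_threshold)
print("5%% threshold:   sensitivity=%.2f, FP/Mb=%.2f" % with_threshold)
```

In this toy data, two true variants whose measured VAF falls just under 5% are lost when the threshold is applied, which mirrors the effect of VAF measurement variability noted above.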

Biography

Dr. Binsheng Gong received a bachelor’s degree in medicine in 2003, and a Ph.D. in biophysics from Harbin Medical University, China. In 2003, he joined Harbin Medical University as a teaching and research assistant, then lecturer and associate professor. In March 2012, Dr. Gong started his research at FDA/NCTR. Since then, he has been involved as one of the major investigators of the FDA-led Sequencing Quality Control (SEQC) and SEQC-II projects. Dr. Gong has more than 18 years of distinguished research and education experience and a record of exceptional scientific accomplishments in bioinformatics, with emphasis on next-generation sequencing (NGS) and microarray technologies.
Binsheng Gong, PhD

Staff Fellow, Division of Bioinformatics and Biostatistics, FDA-NCTR

Joshua Xu, PhD

Branch Chief for Research-to-Review — Division of Bioinformatics and Biostatistics, FDA-NCTR and Executive Secretary, MAQC Society

April 6, 2021, 11:00 am - 12:00 Noon ET

Evaluating the Analytical Validity of Circulating Tumor DNA Sequencing for Precision Oncology

Abstract

Sequencing of circulating tumor DNA (ctDNA), or liquid biopsy, can inform cancer diagnosis and clinical management, and is being rapidly adopted to advance precision oncology. However, the scarcity of ctDNA fragments in circulation makes their reliable detection highly challenging. A multi-site, cross-platform assessment of analytical performance was performed for 5 ctDNA assays. Simulated and synthetic experiments, and proficiency testing across 12 participating laboratories using standardized reference samples, were completed. The study revealed that ctDNA mutations of high fraction can be detected with high sensitivity, precision and reproducibility by all participating assays. However, detection of mutations below ~0.5% was generally unreliable and varied widely between assays, especially when input material was limited. Missed (false-negative) mutations are more common than erroneous (false-positive) detections due to ctDNA scarcity and stochastic effects. This study is the most comprehensive evaluation of analytical performance among ctDNA assays to date, informs best practice guidelines, and constitutes a vital resource for precision oncology.
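One way to see why limited input material makes low-fraction mutations hard to detect is a simple sampling calculation. The sketch below assumes a binomial model in which each genome copy in the input independently carries the mutation with probability equal to the variant allele fraction; this model, the function name, and the input sizes are illustrative assumptions, not the analysis used in the study.

```python
from math import comb


def prob_at_least_k_mutant(n_copies, vaf, k=1):
    """Probability that at least k of n_copies sampled genome copies carry
    the mutation, under an assumed binomial sampling model."""
    prob_fewer_than_k = sum(
        comb(n_copies, i) * vaf**i * (1.0 - vaf) ** (n_copies - i)
        for i in range(k)
    )
    return 1.0 - prob_fewer_than_k


# Chance of sampling at least one mutant copy at 0.1% variant fraction,
# for a small and a larger (hypothetical) number of input genome copies.
p_small_input = prob_at_least_k_mutant(1_000, 0.001, k=1)
p_large_input = prob_at_least_k_mutant(10_000, 0.001, k=1)
print(f"1,000 copies:  {p_small_input:.3f}")
print(f"10,000 copies: {p_large_input:.5f}")
```

With few input copies, a low-fraction mutation may simply be absent from the tube, so no assay could detect it; this is a false negative caused by sampling rather than by the sequencing itself.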

Biography

Dr. Xu is the Branch Chief for Research-to-Review (R2R) at the Division of Bioinformatics and Biostatistics of FDA’s National Center for Toxicological Research (NCTR). He specializes in genomics, big data, image analysis, and machine learning. His recent endeavor has been with the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project to evaluate the technical reliabilities and scientific applications of the next generation sequencing (NGS) technologies. He is leading the Oncopanel Sequencing Working Group to assess the reproducibility and detection sensitivity of onco-panel sequencing including liquid biopsy. He is also the executive secretary for MAQC Society.

April 7, 2021, 11:00 am - 12:00 Noon ET

Robust Cancer Mutation Detection with Deep Learning Models Derived from Tumor-Normal Sequencing Data

Abstract

Accurate detection of somatic mutations is challenging but critical to the understanding of cancer formation, progression, and treatment. We recently proposed NeuSomatic, the first deep convolutional neural network based somatic mutation detection approach, and demonstrated performance advantages on in silico data. In this study, we used the first comprehensive and well-characterized somatic reference samples from the SEQC-II consortium to investigate best practices for utilizing deep learning frameworks in cancer mutation detection. Using the high-confidence somatic mutations established for these reference samples by the consortium, we identified strategies for building robust models on multiple datasets derived from samples representing real scenarios. The proposed strategies achieved high robustness across multiple sequencing technologies such as WGS, WES, and AmpliSeq target sequencing for fresh and FFPE DNA input, varying tumor/normal purities, and different coverages (ranging from 10x to 2000x). NeuSomatic significantly outperformed conventional detection approaches in general, as well as in challenging situations such as low coverage, low mutation frequency, DNA damage, and difficult genomic regions.

Biography

Mohammad Sahraeian is currently a Senior Principal Bioinformatics Scientist at Roche Sequencing Solutions, working on genomic data analysis. Before joining Roche, he was a postdoctoral researcher at UC Berkeley. In 2013, he earned his Ph.D. in Electrical and Computer Engineering at Texas A&M University. He is a coauthor of NeuSomatic, the first convolutional neural network-based approach for cancer mutation detection. Building on this AI-based variant calling approach, he led Roche’s participation in the precisionFDA Truth Challenge V2, earning best-performer recognition in two categories. He is particularly interested in developing new solutions to genomics problems that leverage recent developments in deep learning.

Mohammad Sahraeian, PhD

Senior Bioinformatics Scientist, Roche Sequencing Solutions

Christopher Mason, PhD

Associate Professor; Director, WorldQuant Initiative for Quantitative Prediction; Physiology and Biophysics/Feil Family Brain and Mind Institute/Institute for Computational Biomedicine, Weill Cornell Medicine

April 13, 2021, 11:00 am - 12:00 Noon ET

The Epigenome Quality Control (EpiQC) Project

Abstract

Detection of DNA modifications such as 5-methylcytosine (5mC), N6-methyladenosine (m6A), and 5-hydroxymethylcytosine (5hmC) is essential for delineating the epigenetic changes that guide development, cellular lineage specification, and disease across all kingdoms of life. However, a proliferation of molecular methods to discern these loci and their phased haplotypes (epialleles) has created the need for standardized materials, methods, and rigorous benchmarking, which can then further enable and improve their applications to clinical and research projects. Here we report a multi-platform, multi-site assessment and global resource for epigenetics research from the FDA’s Epigenomics Quality Control (EpiQC) Group. The general study design primarily leverages 7 human cell lines publicly available from the National Institute of Standards and Technology (NIST) Genome in a Bottle (GIAB) Consortium. Our primary focus is on 5C modifications found in mammalian genomes (5mC, 5hmC). Each sample was processed for whole-genome bisulfite sequencing (WGBS), oxidative bisulfite sequencing (oxBS), and an APOBEC deaminated reference data set. We also include a rigorous assessment of, and comparison to, the 450K and 850K Illumina methylation chips, as well as a characterization of the Illumina Methylation-EPIC Capture assay. We also included ATAC-Seq, an increasingly popular chromatin conformation assay. All the libraries generated were sequenced at one location, on Illumina’s NovaSeq 6000. Our goal is to provide data, methods, and algorithms that will be instrumental for ongoing efforts in epigenomics research, DNA modification detection algorithms, and applications to gene regulation, clinical diagnostics, and systems biology.

Biography

Dr. Mason utilizes computational and experimental methodologies to identify and characterize the essential genetic elements that guide the function of the human genome, with a particular emphasis on the elements that orchestrate the development of the human brain. Our lab creates detailed cell-specific molecular maps of genetic, epigenetic, transcriptional, and translational activity, creating a draft of the molecular recipe for the creation of the brain. We also develop methods to detect, catalog, and functionally annotate variants in the genetic pathways that control developmental processes, and to determine how they are perturbed to create disease. We aim to understand the functional elements of the human genome well enough to enable, eventually, the ability to repair, re-engineer, or fortify these genetic networks within human cells.

April 14, 2021, 11:00 am - 12:00 Noon ET

Multi-Platform Assessment of DNA Sequencing Performance using Human and Bacterial Reference Genomes in the ABRF Next-Generation Sequencing Study

Abstract

Massively parallel DNA sequencing is a critical tool for genomics research and clinical diagnostics, but few comprehensive resources are available to assess performance across a wide range of sample types, sequencing platforms, and sites. Here, we describe the Association of Biomolecular Resource Facilities (ABRF) Next-Generation Sequencing Phase II Study to measure quality and reproducibility of DNA sequencing over a range of genomic compositions, as a complement to the Phase I RNA-seq study. Inter- and intra-laboratory replicates of human and bacterial reference DNA samples were analyzed by whole exome and whole genome methods on Illumina, BGI, Oxford Nanopore, PacBio, and ThermoFisher Ion Torrent platforms. The data were highly consistent within laboratories, even with a wide variety of sequencing depths, but showed reduced reproducibility between laboratories, in regions of extreme GC content, and in complex, highly repetitive genomic contexts. Mappability of reads, genic coverage, and error rate also varied with GC content and genomic region, and different platforms showed distinct patterns of variant detection, particularly in structural variants (SVs). This study provides a comprehensive baseline resource for continual benchmarking as chemistries, methods, and platforms evolve for DNA sequencing.

Biography

I am a bioinformatician interested in developing pipelines for functional genomic analysis. I am proficient in Python, R, and bash, and use these languages to analyze high-throughput sequence data. Currently I am developing reference materials for sequencing reproducibility and epigenetic landscapes of well-characterized cell lines.

With a background in evolutionary biology, I researched the phylogenomics of Myxozoa for my PhD. Myxozoa are a group of bizarre, microscopic endoparasites of economically critical fish (among other hosts) around the world. I explored the genomic underpinnings of how myxozoans, whose closest living relatives are medusozoans (jellyfish, hydras, cube jellies), became so radically different in biology, ecology, and life history.

Jonathan Foox, PhD

Research Associate in Computational Biomedicine, Weill Cornell Medicine

Huixiao Hong, PhD

Chief, Bioinformatics Branch, FDA-NCTR

April 20, 2021, 11:00 AM - 12:00 Noon ET

Assessing Reproducibility of Germline Variants Detected with Short-read Whole Genome Sequencing

Biography

Huixiao Hong is the chief of the Bioinformatics Branch at the National Center for Toxicological Research, FDA. He was previously the manager of the Bioinformatics Division at Z-Tech, a research scientist at Sumitomo Chemical Company in Japan, a visiting scientist at NIH, and an associate professor and the director of the Computational Chemistry Laboratory at Nanjing University in China. Dr. Hong is a member of the OpenTox steering committee, the board of directors of MCBIOS, and the leadership circle of the FDA modeling and simulation working group. He received his Ph.D. from Nanjing University in China and then conducted research at Leeds University in England. He has more than 210 publications.

Abstract

Reproducible germline variant detection with whole genome sequencing (WGS) is vital for the implementation of precision medicine, but it is a complicated process in which each step affects variant call quality. To dissect the impacts of related factors, we sequenced triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and called variants with 56 pipelines. We found that bioinformatics pipelines (callers and aligners) had a larger impact on variant reproducibility than WGS platform or library preparation. Single nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when longer than 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30X. Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.