Webinars

MAQC Society – SEQC2 Webinar Series

MAQC is an FDA-led, community-wide consortium effort to address issues relating to the application of constantly evolving high-throughput genomics technologies, either to assess the safety and efficacy of FDA-regulated products or for their safe and effective use in clinical applications as in vitro diagnostic devices. The MAQC consortium completed three projects between 2005 and 2014 (namely MAQC I, II, and III), resulting in ~30 publications. The fourth phase of the effort is captured under the Sequencing Quality Control (SEQC2) initiative, which is covered in this webinar series. Previous webinars will soon be available to all MAQC Society members.

Previous Webinars and Video Recordings

Tuesday, February 16th, 2021

A MAQC/SEQC Journey Towards Reproducible Genomics and the MAQC Society

Abstract

We present a history of the MAQC/SEQC consortia, which led to the standardization of practices in genomic science and a series of highly cited research papers. The MAQC Society was born from these efforts, uniting scientists with a commitment to reproducible research and quality practices in an era of massive data, collected to understand cellular and molecular biology and advance human health. We will also provide an overview of the SEQC2 Webinar series and the MAQC Society’s annual meeting, which will be held virtually this year as a Joint Meeting with MCBIOS on April 28-30, 2021.

Weida Tong, PhD

President, MCBIOS, and Director, Division of Bioinformatics and Biostatistics, FDA-NCTR

Wendell Jones, PhD

Executive Chair and President, MAQC Society, and Principal Bioinformaticist and Scientific Advisor, Q2 Solutions / EA Genomics

Dr. Wenming Xiao

Senior Scientific Reviewer, Division of Molecular Genetics and Pathology, FDA-OIR

Tuesday, February 23, 2021

Towards Best Practice in Cancer Mutation Detection with Whole-genome and Whole-exome Sequencing

Abstract

Clinical applications of precision oncology require accurate tests that can distinguish cancer-specific mutations from errors introduced at each step of next-generation sequencing (NGS). For NGS to successfully improve patient lives, discriminating between true mutations and artifacts is crucial. To date, no study has addressed the effects of cross-site reproducibility together with the potentially influential interactions between biological, technical, and computational factors on the accurate identification of variants.

Here we systematically interrogated somatic mutations in paired tumor-normal cell lines to identify factors affecting detection reproducibility and accuracy. Different types of samples with varying input amount and tumor purity were processed using multiple library construction protocols. Whole-genome (WGS) and whole-exome sequencing (WES) were carried out at six sequencing centers followed by processing with nine bioinformatics pipelines to evaluate reproducibility. We identified artifacts of C>A mutations in WES due to sample and library processing and highlighted limitations of bioinformatics tools for artifact detection and removal.

Biography

Dr. Xiao received advanced training in biology and computer science in China and the United States. He has numerous publications in peer-reviewed journals such as Nature, PNAS, and N. Engl. J. Med., and received an NIH Director Award in 2010 in recognition of his contributions to cancer biomarker discovery. Dr. Xiao was a principal investigator at the FDA and led an international working group to establish reference materials, data sets, analysis pipelines, and quality metrics for cancer mutation detection with NGS technology. Currently, Dr. Xiao is a lead reviewer for NGS-related diagnostic products/applications (510(k), IDE, or PMA), including Onco-Panel, Whole-Exome Panel, NIPT, Gene Expression Signature, and analytical software (bioinformatics pipelines, knowledge databases), and provides recommendations on regulatory decisions regarding the safety and effectiveness of medical devices.

March 2, 2021, 11:00 am - 12:00 Noon ET

Establishing Reference Data and Call Sets for Benchmarking Cancer Mutation Detection using Whole-genome Sequencing

Abstract

We characterized two reference samples for NGS technologies: a human triple-negative breast cancer cell line and a B lymphocyte cell line from the same donor. Leveraging several whole-genome sequencing (WGS) platforms, multiple sequencing replicates, and orthogonal mutation detection bioinformatics pipelines, we minimized the potential biases from sequencing technologies, assays, and informatics. Thus, our “reference call sets” were defined using evidence from 21 replicates of Illumina WGS runs with coverage ranging from 50X to 100X (1300X in total). These call sets present many relevant variants/mutations, including 208 COSMIC mutations and 9,016 germline variants from the ClinVar database, nonsense mutations in BRCA1/2 and missense mutations in TP53 and FGFR1. Independent validation in three orthogonal experiments demonstrated a successful stress test of the call set. We expect these reference samples, high confidence regions, and call sets to facilitate assay development, qualification, validation, process control, and proficiency testing. In addition, our methods can be extended to establish new fully characterized reference samples for the community.

 

Biography

Dr. Li Tai Fang is currently a Staff Scientist at Endpoint Health, Inc. He worked on the SEQC2’s somatic reference project when he was at Roche Sequencing Solutions. Previously at Bina Technologies Inc., he led Bina Team’s participation in the ICGC-TCGA DREAM Somatic Mutation Calling Challenge (#1 and #2 in Stage 5 indel and SNV sub-challenges), and developed software such as SomaticSeq.

Li Tai Fang, PhD

Staff Scientist, Endpoint Health, Inc.

Charles Wang, MD, PhD, MPH

Professor and Director of Center for Genomics, Loma Linda University School of Medicine

Tuesday, March 9, 2021 11:00 am - 12:00 Noon ET

A Multicenter Study Benchmarking Single-cell RNA Sequencing Technologies using Reference Samples

Abstract

Comparing diverse single-cell RNA sequencing (scRNA-seq) datasets generated by different technologies and in different laboratories remains a major challenge. Here we address the need for guidance in choosing algorithms leading to accurate biological interpretations of varied data types acquired with different platforms. Using two well-characterized cellular reference samples (breast cancer cells and B cells), captured either separately or in mixtures, we compared different scRNA-seq platforms and several preprocessing, normalization and batch-effect correction methods at multiple centers. Although preprocessing and normalization contributed to variability in gene detection and cell classification, batch-effect correction was by far the most important factor in correctly classifying the cells. Moreover, scRNA-seq dataset characteristics (for example, sample and cellular heterogeneity and platform used) were critical in determining the optimal bioinformatic method. However, reproducibility across centers and platforms was high when appropriate bioinformatic methods were applied. Our findings offer practical guidance for optimizing platform and software selection when designing an scRNA-seq study.

 

Companion single-cell Scientific Data paper: https://www.nature.com/articles/s41597-021-00809-x

Biography

Charles Wang, MD, PhD, MPH, is Director of the Center for Genomics and a tenured full Professor at the Loma Linda University School of Medicine. He previously held positions as Clinical Transcriptional Genomics Core Director at Cedars-Sinai Medical Center, Associate Professor of Medicine at the David Geffen School of Medicine at UCLA, and Director of the Functional Genomics Core at City of Hope. Dr. Wang is a well-recognized expert in genomics, with many high-visibility publications in prestigious journals including Nature Biotechnology, Nature Communications, and PNAS. He was one of the pioneers of the MAQC and SEQC consortium projects.

Tuesday, March 16, 2021 11:00 am - 12:00 Noon ET

Critical Assessment of Copy Number Variation Calling Using Next Generation Sequencing

Abstract

Accurate detection of copy number variations (CNVs) is crucial in cancer diagnosis and treatment. In this study, using next-generation sequencing (NGS), we systematically interrogated somatic CNVs in paired tumor-normal cell lines to identify factors affecting the reproducibility and accuracy of their detection, such as sequencing depth, amount and type of input DNA for library preparation, and tumor purity. Whole-genome sequencing (WGS) and whole-exome sequencing were carried out and processed with different CNV callers to assess their reproducibility. Our evaluations indicate that variations among CNV calls are driven mainly by callers and less by sequencing sites. We observed that tumor purity has a dominant effect on caller performance, while the effects of other confounding factors vary and are mainly caller-specific. Taking cytogenetics array results as the high-confidence calls, we observed that WGS consensus calling can produce high-confidence CNV calls. We provide actionable recommendations and best practices for CNV detection applications in cancer research.

 

Biography

Mehdi Pirooznia, M.D., Ph.D., is the Director of the Bioinformatics and Computational Biology Core at the National Heart, Lung, and Blood Institute at the NIH (NHLBI/NIH). Dr. Pirooznia supervises and spearheads this effort by providing bioinformatics analysis support for intramural scientists in life sciences, clinical, and translational research. In particular, his group specializes in analyses pertaining to next-generation sequencing and biomedical informatics in genomics, transcriptomics, epigenomics, and disease biomarkers. Toward this end, Dr. Pirooznia’s team takes an integrative approach, combining site-specific sequence variations with gene expression and proteomics data to investigate molecular mechanisms underlying disease progression and treatment responses.

Dr. Pirooznia is also an Adjunct Associate Professor at the Johns Hopkins University School of Medicine, where he served for 8 years as a faculty member prior to joining the NIH in 2016. There he provided leadership and scientific direction and was responsible for implementing the high-performance computational laboratory and bioinformatics system. Dr. Pirooznia serves as an editor and reviewer for several scientific journals such as Bioinformatics, Nature Scientific Data, BMC Bioinformatics, and Human Genomics. He is also a member of The American Society of Human Genetics (ASHG) and the International Society for Computational Biology (ISCB).

Mehdi Pirooznia, MD, MSc, PhD

Director of Bioinformatics and Computational Biology Core, NHLBI

Wendell Jones, PhD

Principal Bioinformaticist and Scientific Advisor, Q2 Solutions / EA Genomics - Executive Chair and President, MAQC Society

March 23, 2021, 11:00 am - 12:00 Noon ET

A Verified Genomic Reference Sample for Assessing Performance of Cancer Panels Detecting Small Variants of Low Allele Frequency

Abstract

Oncopanel genomic testing, which identifies important somatic variants, is increasingly common in medical practice and especially in clinical trials. Currently, there is a paucity of reliable genomic reference samples having a suitably large number of preidentified variants for properly assessing oncopanel assay analytical quality and performance. The FDA-led Sequencing and Quality Control Phase 2 (SEQC2) consortium analyzed ten diverse cancer cell lines individually and their pool (termed Sample A) to develop a reference sample with suitably large numbers of coding positions with known (variant) positives and negatives for properly evaluating oncopanel analytical performance. In our reference Sample A, the SEQC2 consortium identified tens of thousands of variants down to 1% allele frequency (AF), with more than 25,000 variants having less than 20% AF and 1,653 variants in COSMIC-related genes. This is 5x-100x more than existing commercially available samples. We also identified an unprecedented number of negative positions in coding regions, allowing statistical rigor in assessing limit of detection (LOD), sensitivity, and precision. Over 300 loci were randomly selected and independently verified via ddPCR with 100% concordance. Agilent normal reference Sample B can be admixed with Sample A to create new samples with a similar number of known variants at much lower AF than what exists in Sample A natively, including known variants having AF=0.02%, a range suitable for assessing liquid biopsy panels.
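The admixture arithmetic mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not the consortium's procedure: it assumes, purely for illustration, that Sample B carries only the reference allele at the position of interest and that mixing parts are proportional to genome copies contributed by each sample (real cell-line DNA can deviate because of ploidy and copy-number differences). The function names `diluted_af` and `parts_b_needed` are hypothetical.

```python
def diluted_af(af_in_a: float, parts_a: float, parts_b: float) -> float:
    """Expected allele frequency (AF) after mixing Sample A with Sample B.

    Assumptions (illustrative only): Sample B contributes no variant
    alleles at this position, and parts are proportional to genome copies.
    """
    if parts_a < 0 or parts_b < 0 or parts_a + parts_b == 0:
        raise ValueError("parts must be non-negative and not both zero")
    return af_in_a * parts_a / (parts_a + parts_b)


def parts_b_needed(af_in_a: float, target_af: float, parts_a: float = 1.0) -> float:
    """Parts of Sample B to add to `parts_a` of Sample A to reach `target_af`."""
    if not 0 < target_af <= af_in_a:
        raise ValueError("target AF must be positive and no higher than the AF in Sample A")
    return parts_a * (af_in_a / target_af - 1.0)


if __name__ == "__main__":
    # A variant at 1% AF in Sample A, mixed 1 part A to 49 parts B.
    print(f"{diluted_af(0.01, 1, 49):.4%}")
    # Parts of B per part of A needed to take a 1% variant down to 0.02%.
    print(round(parts_b_needed(0.01, 0.0002), 2))
```

Under these assumptions, a variant at 1% AF in Sample A reaches roughly 0.02% AF after a 50-fold dilution (1 part A to 49 parts B).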

Biography

Dr. Wendell Jones is a consulting statistician/bioinformaticist in genomic technologies and systems, serving in leadership roles within FDA-led consortia such as the Sequencing Quality Control (SEQC and SEQC2) consortia. He is the current President and Executive Chair of the MAQC Society, an organization devoted to reproducible science and research. These FDA consortium efforts have resulted (and continue to result) in a series of highly cited papers in Nature Biotechnology over the last sixteen years. At EA Genomics, Wendell was Vice President of Statistics and Bioinformatics, Global Head of Bioinformatics

March 30, 2021, 11:00 am - 12:00 Noon ET

Multi-lab Cross-oncopanel Study Reveals High Sensitivity and Reproducibility Tailored to Targeted Regions and Allele Frequency Ranges

Abstract

Best practices for oncopanel sequencing, a tool in cancer diagnosis and treatment, require comprehensive assessments of reproducibility and detection sensitivity. By employing reference materials characterized by the FDA-led SEQC project phase 2 (SEQC2), we performed a cross-platform, multi-lab evaluation of eight Pan-Cancer panels representing a broad spectrum of oncopanel technologies. The study reveals consistently high sensitivity across targeted high-confidence coding regions and variant types (SNVs vs. small indels or MNVs) for variant allele frequencies (VAF) above 5%. Sensitivity was reduced by utilizing VAF thresholds due to inherent variability in VAF measurements. Conversely, enforcing a VAF threshold for reporting had a positive impact on reducing false positive (FP) calls. All panels had low FP rates of approximately 1 FP per Mb or less for VAF greater than 5% in the high-confidence coding regions, which led to good reproducibility. Importantly, the FP rate was found to be significantly higher outside the high-confidence coding regions, resulting in lower reproducibility. Region restriction and VAF thresholds led to low relative technical variability in estimating promising biomarkers such as tumor mutational burden. This study details actionable insights into factors underpinning the sensitivity and reproducibility of oncopanel sequencing.

Biography

Dr. Binsheng Gong received a bachelor’s degree in medicine in 2003 and a Ph.D. in biophysics from Harbin Medical University, China. In 2003, he joined Harbin Medical University as a teaching and research assistant, then became a lecturer and associate professor. In March 2012, Dr. Gong started his research at FDA/NCTR. Since then, he has been one of the major investigators of the FDA-led Sequencing Quality Control (SEQC) and SEQC-II projects. Dr. Gong has more than 18 years of distinguished research and education experience and a record of exceptional scientific accomplishments in bioinformatics, with emphasis on next-generation sequencing (NGS) and microarray technologies.

Binsheng Gong, PhD

Staff Fellow, Division of Bioinformatics and Biostatistics, FDA-NCTR

Joshua Xu, PhD

Branch Chief for Research-to-Review, Division of Bioinformatics and Biostatistics, FDA-NCTR, and Executive Secretary, MAQC Society

April 6, 2021, 11:00 am - 12:00 Noon ET

Evaluating the Analytical Validity of Circulating Tumor DNA Sequencing for Precision Oncology

Abstract

Sequencing of circulating tumor DNA (ctDNA), or liquid biopsy, can inform cancer diagnosis and clinical management, and is being rapidly adopted to advance precision oncology. However, the scarcity of ctDNA fragments in circulation makes their reliable detection highly challenging. A multi-site, cross-platform assessment of analytical performance was performed for 5 ctDNA assays. Simulated and synthetic experiments, and proficiency testing across 12 participating laboratories using standardized reference samples, were completed. The study revealed that ctDNA mutations of high fraction can be detected with high sensitivity, precision, and reproducibility by all participating assays. However, detection of mutations below ~0.5% was generally unreliable and varied widely between assays, especially when input material was limited. Missed (false-negative) mutations were more common than erroneous (false-positive) detections due to ctDNA scarcity and stochastic effects. This study is the most comprehensive evaluation of analytical performance among ctDNA assays to date, informs best practice guidelines, and constitutes a vital resource for precision oncology.

Biography

Dr. Xu is the Branch Chief for Research-to-Review (R2R) at the Division of Bioinformatics and Biostatistics of FDA’s National Center for Toxicological Research (NCTR). He specializes in genomics, big data, image analysis, and machine learning. His recent work has been with the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project to evaluate the technical reliability and scientific applications of next-generation sequencing (NGS) technologies. He leads the Oncopanel Sequencing Working Group, which assesses the reproducibility and detection sensitivity of oncopanel sequencing, including liquid biopsy. He is also the executive secretary of the MAQC Society.

April 7, 2021, 11:00 am - 12:00 Noon ET

Robust Cancer Mutation Detection with Deep Learning Models Derived from Tumor-Normal Sequencing Data

Abstract

Accurate detection of somatic mutations is challenging but critical to the understanding of cancer formation, progression, and treatment. We recently proposed NeuSomatic, the first deep convolutional neural network-based somatic mutation detection approach, and demonstrated performance advantages on in silico data. In this study, we used the first comprehensive and well-characterized somatic reference samples from the SEQC-II consortium to investigate best practices for utilizing deep learning frameworks in cancer mutation detection. Using the high-confidence somatic mutations established for these reference samples by the consortium, we identified strategies for building robust models on multiple datasets derived from samples representing real scenarios. The proposed strategies achieved high robustness across multiple sequencing technologies such as WGS, WES, and AmpliSeq targeted sequencing, for fresh and FFPE DNA input, varying tumor/normal purities, and different coverages (ranging from 10x to 2000x). NeuSomatic significantly outperformed conventional detection approaches in general, as well as in challenging situations such as low coverage, low mutation frequency, DNA damage, and difficult genomic regions.

Biography

Mohammad Sahraeian is currently a Senior Principal Bioinformatics Scientist at Roche Sequencing Solutions working on genomic data analysis. Before joining Roche, he was a postdoctoral researcher at UC Berkeley. In 2013, he earned his Ph.D. in Electrical and Computer Engineering at Texas A&M University. He is the coauthor of NeuSomatic, the first convolutional neural network-based approach for cancer mutation detection. Based on this AI-based variant calling approach, he led Roche’s participation in the Precision FDA’s Truth Challenge V2, with the best performer recognition in two categories. He is particularly interested in developing new solutions to genomics problems that leverage recent developments in deep learning.

Mohammad Sahraeian, PhD

Senior Bioinformatics Scientist, Roche Sequencing Solutions

Christopher Mason, PhD

Associate Professor; Director, WorldQuant Initiative for Quantitative Prediction; Physiology and Biophysics/Feil Family Brain and Mind Institute/Institute for Computational Biomedicine, Weill Cornell Medicine

April 13, 2021, 11:00 am - 12:00 Noon ET

The Epigenome Quality Control (EpiQC) Project

Abstract

Detection of DNA modifications such as 5-methylcytosine (5mC), N6-methyladenosine (m6A), and 5-hydroxy-methylcytosine (5hmC) is essential for delineating the epigenetic changes that guide development, cellular lineage specification, and disease across all kingdoms of life. However, a proliferation of molecular methods to discern these loci and their phased haplotypes (epialleles) has created the need for standardized materials, methods, and rigorous benchmarking, which can then further enable and improve their applications to clinical and research projects. Here we report a multi-platform, multi-site assessment and global resource for epigenetics research from the FDA’s Epigenomics Quality Control (EpiQC) Group. The general study design primarily leverages 7 human cell lines publicly available from the National Institute of Standards and Technology (NIST) Genome in a Bottle (GIAB) Consortium. Our primary focus is on 5C modifications found in mammalian genomes (5mC, 5hmC). Each sample was processed for whole-genome bisulfite sequencing (WGBS), oxidative bisulfite sequencing (oxBS), and an APOBEC deaminated reference data set. We also include a rigorous assessment of, and comparison to, the 450K and 850K Illumina methylation chips, as well as a characterization of the Illumina Methylation-EPIC Capture assay. We also included ATAC-Seq, an increasingly popular chromatin conformation assay. All the libraries generated were sequenced at one location, on Illumina’s NovaSeq 6000. Our goal is to provide data, methods, and algorithms that will be instrumental for ongoing efforts in epigenomics research, DNA modification detection algorithms, and applications to gene regulation, clinical diagnostics, and systems biology.

Biography

Dr. Mason utilizes computational and experimental methodologies to identify and characterize the essential genetic elements that guide the function of the human genome, with a particular emphasis on the elements that orchestrate the development of the human brain. Our lab creates detailed cell-specific molecular maps of genetic, epigenetic, transcriptional, and translational activity, creating a draft of the molecular recipe for the creation of the brain. We also develop methods to detect, catalog, and functionally annotate variants in the genetic pathways that control developmental processes and how they are perturbed to create disease. We aim to understand the functional elements of the human genome well enough to enable, eventually, the ability to repair, re-engineer, or fortify these genetic networks within human cells.

April 14, 2021, 11:00 am - 12:00 Noon ET

Multi-Platform Assessment of DNA Sequencing Performance using Human and Bacterial Reference Genomes in the ABRF Next-Generation Sequencing Study

Abstract

Massively parallel DNA sequencing is a critical tool for genomics research and clinical diagnostics, but few comprehensive resources are available to assess performance across a wide range of sample types, sequencing platforms, and sites. Here, we describe the Association of Biomolecular Resource Facilities (ABRF) Next-Generation Sequencing Phase II Study to measure quality and reproducibility of DNA sequencing over a range of genomic compositions, as a complement to the Phase I RNA-seq study. Inter- and intra-laboratory replicates of human and bacterial reference DNA samples were analyzed by whole exome and whole genome methods on Illumina, BGI, Oxford Nanopore, PacBio, and ThermoFisher Ion Torrent platforms. The data were highly consistent within laboratories, even with a wide variety of sequencing depths, but showed reduced rates of reproducibility in regions of extreme GC content and in complex, highly repetitive genomic contexts and between laboratories. Mappability of reads, genic coverage, and error rate were also variable with respect to GC content and genomic region, and different platforms showed distinct patterns of variant detection, particularly in structural variants (SVs). This study provides a comprehensive baseline resource for continual benchmarking as chemistries, methods, and platforms evolve for DNA sequencing.

Biography

I am a bioinformatician interested in developing pipelines for functional genomic analysis. I am proficient in Python, R, and bash, and use these languages to analyze high-throughput sequence data. Currently I am developing reference materials for sequencing reproducibility and epigenetic landscapes of well-characterized cell lines.

With a background in evolutionary biology, I researched the phylogenomics of Myxozoa for my PhD. Myxozoa are a group of bizarre, microscopic endoparasites of economically critical fish (among other hosts) around the world. I explored the genomic underpinnings of how myxozoans, whose closest living relatives are medusozoans (jellyfish, hydras, cube jellies), became so radically different in biology, ecology, and life history.

Jonathan Foox, PhD

Research Associate in Computational Biomedicine, Weill Cornell Medicine

Huixiao Hong, PhD

Chief, Bioinformatics Branch, FDA-NCTR

April 20, 2021, 11:00 AM - 12:00 Noon ET

Assessing Reproducibility of Germline Variants Detected with Short-read Whole Genome Sequencing

Biography

Huixiao Hong is the chief of the Bioinformatics Branch at the National Center for Toxicological Research, FDA. He was previously the manager of the Bioinformatics Division at Z-Tech, a research scientist at Sumitomo Chemical Company in Japan, a visiting scientist at NIH, and an associate professor and the director of the Computational Chemistry Laboratory at Nanjing University in China. Dr. Hong is a member of the OpenTox steering committee, the board of directors of MCBIOS, and the leadership circle of the FDA modeling and simulation working group. He received his Ph.D. from Nanjing University in China, followed by research at Leeds University in England. He has more than 210 publications.

Abstract

Reproducible germline variant detection with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. To dissect the impacts of related factors, we sequenced triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and called variants with 56 pipelines. We found that bioinformatics pipelines (callers and aligners) had a larger impact on variant reproducibility than WGS platform or library preparation. Single nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when >5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30X. Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.