MAQC Society 2022 Spring Webinar Series
- Free registration for six new Webinars on current NGS scientific articles from SEQC2
- March 22 – May 3, 2022 (each Tuesday at 11am ET except for April 26)
March 22 2022, 11AM eastern: X-CNV: Genome-wide prediction of the pathogenicity of copy number variations
Biography: Dr. Zhichao Liu is the technical leader for the Artificial Intelligence Research Force (AIRForce) at the Division of Bioinformatics & Biostatistics, FDA/NCTR. Dr. Liu’s background spans the fields of chemistry, biology, and computer science. Over the past decade he has led many cutting-edge projects, designing, implementing, and deploying AI/machine learning solutions for advanced regulatory sciences. Specifically, Dr. Liu has applied AI/machine learning solutions to improve pathogenicity prediction for complex genetic variants, promote predictive toxicology, and facilitate precision medicine applications. His accomplishments are reflected in 5 FDA-wide awards, 9 NCTR-level awards, 2 scientific community-level awards, and more than 80 peer-reviewed publications.
Abstract: Gene copy number variations (CNVs) contribute to genetic diversity and disease prevalence across populations. Substantial efforts have been made to decipher the relationship between CNVs and pathogenesis but with limited success. We have developed a novel computational framework, X-CNV (www.unimd.org/XCNV), to predict the pathogenicity of CNVs by integrating more than 30 informative features such as allele frequency (AF), CNV length, CNV type, and some deleterious scores. Notably, over 14 million CNVs across various ethnic groups, covering nearly 93% of the human genome, were unified to calculate the AF. X-CNV, which yielded area under the curve (AUC) values of 0.96 and 0.94 in training and validation sets, was demonstrated to outperform other available tools in terms of CNV pathogenicity prediction. A meta-voting prediction (MVP) score was developed to quantitatively measure the pathogenic effect, which is based on the probabilistic value generated from the XGBoost algorithm. The proposed MVP score demonstrated a high discriminative power in determining pathogenic CNVs for inherited traits/diseases in different ethnic groups. The ability of the X-CNV framework to quantitatively prioritize functional, deleterious, and disease-causing CNVs on a genome-wide basis outperformed current CNV-annotation tools and will have broad utility in population genetics, disease-association studies, and diagnostic screening.
Dr. Zhichao Liu, Ph.D.
Artificial Intelligence Research Force (AIRForce) Technical Leader
Dr. James C Willey, Ph.D.
George Isaac Chair for Cancer Research
April 5 2022, 11am eastern: Advancing quality-control for NGS measurement of actionable mutations in circulating tumor DNA
Biography: I am currently Professor of Medicine and Pathology, and George Isaac Chair for Cancer Research, with more than 25 years of continuous funding from the National Institutes of Health in translational research. We develop methods to optimize quality control in next generation sequencing analysis of germline and somatic variants in clinical specimens, including circulating tumor DNA. Since 2005, my laboratory has participated in United States Food and Drug Administration (FDA) MAQC collaborative projects to improve quality control in genetic testing. Through these FDA consortia, as of 2021 I have co-authored twelve articles reporting results from these collaborative projects, including nine in Nature Biotechnology, three in Genome Biology, and one in Cell Reports Methods.
Abstract: The primary objective of the FDA-led Sequencing and Quality Control Phase 2 (SEQC2) project is to develop standard analysis protocols and quality control metrics for use in DNA testing to enhance scientific research and precision medicine. This study reports a targeted next generation sequencing (NGS) method that enables more accurate detection of actionable mutations in circulating tumor DNA (ctDNA) clinical specimens. This advancement was enabled by designing a synthetic internal standard spike-in for each actionable mutation target, suitable for use in NGS following hybrid-capture enrichment and unique molecular index (UMI) or non-UMI library preparation. When mixed with contrived ctDNA reference samples, internal standards enabled calculation of technical error rate, limit of blank, and limit of detection for each variant at each nucleotide position, in each sample. True positive mutations with variant allele fraction too low for detection by current practice were detected with this method, thereby increasing sensitivity.
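The way an internal standard can yield a per-position technical error rate and a limit of blank can be illustrated with a minimal sketch. It is not the published SEQC2 procedure: the function names, the Poisson model for background error reads, and the 95% confidence level are all assumptions made for illustration. The idea is that the internal standard has a known sequence at the target position, so any internal-standard read showing the variant allele there is counted as a technical error.

```python
import math


def technical_error_rate(is_variant_reads: int, is_total_reads: int) -> float:
    """Fraction of internal-standard reads that show the variant allele.

    The internal standard is known not to carry the variant, so these
    reads are treated as technical errors (illustrative assumption).
    """
    if is_total_reads <= 0:
        raise ValueError("internal-standard depth must be positive")
    return is_variant_reads / is_total_reads


def limit_of_blank_reads(error_rate: float, native_depth: int,
                         confidence: float = 0.95) -> int:
    """Smallest read count k such that a variant-free sample is expected
    to show at most k variant reads with the given confidence.

    Assumes error reads follow a Poisson distribution with mean
    error_rate * native_depth (an assumption, not the source's model).
    """
    mean = error_rate * native_depth
    if mean > 700:
        # exp(-mean) would underflow; this sketch does not cover that regime.
        raise ValueError("expected error count too large for this sketch")
    k = 0
    term = math.exp(-mean)  # P(X = 0)
    cdf = term
    while cdf < confidence:
        k += 1
        term *= mean / k    # P(X = k) from P(X = k - 1)
        cdf += term
    return k


# Hypothetical numbers: 5 error reads among 50,000 internal-standard reads,
# native template sequenced to 20,000x at the same position.
rate = technical_error_rate(5, 50_000)
lob = limit_of_blank_reads(rate, 20_000)
print(rate, lob)
```

Under these assumptions, a variant observed in the native template would be considered above background only when its supporting read count exceeds the returned limit of blank for that position in that sample.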
April 12 2022, 11am eastern: Deep oncopanel sequencing reveals within block position-dependent quality degradation in FFPE processed samples
Biography: Dr. Xu is the Branch Chief for Research-to-Review (R2R) at the Division of Bioinformatics and Biostatistics of FDA’s National Center for Toxicological Research (NCTR). He specializes in genomics, big data, image analysis, and machine learning. His recent endeavor has been with the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project to evaluate the technical reliabilities and scientific applications of the next generation sequencing (NGS) technologies. He leads the Oncopanel Sequencing Working Group to assess the reproducibility and detection sensitivity of onco-panel sequencing including liquid biopsy. He is also the executive secretary for MAQC Society.
Abstract: Clinical laboratories routinely use formalin-fixed paraffin-embedded (FFPE) tissue or cell block cytology samples in oncology panel sequencing to identify mutations that can predict patient response to targeted therapy. To understand the technical error due to FFPE processing, a robustly characterized diploid cell line was used to create FFPE samples with four different pre-tissue processing formalin fixation times. A total of 96 FFPE sections were distributed to different laboratories for targeted sequencing analysis by four oncopanels, and variants resulting from technical error were identified. Tissue sections that failed more frequently showed low cellularity, lower than recommended library preparation DNA input, or target sequencing depth. Importantly, sections from block surfaces were more likely to show FFPE-specific errors, akin to “edge effects” seen in histology, while the inner samples displayed no quality degradation related to fixation time up to 24 hours. To assure reliable results, we recommend avoiding the block surface portion and restricting mutation detection to genomic regions of high confidence.
Dr. Joshua Xu, Ph.D.
Branch Chief for Research-to-Review — Division of Bioinformatics and Biostatistics, FDA-NCTR & Executive Secretary, MAQC Society
Dr. Timothy R. Mercer, Ph.D.
Group Leader, Institute of Bioengineering and Nanotechnology, University of Queensland, Brisbane, QLD, Garvan Institute of Medical Research, Sydney, NSW, Australia.
April 19 2022, 11am eastern: Resolving the complexity of the human genome using synthetic chromosomal controls & Building a synthetic genome that encodes DNA, mRNA and protein controls
Biography: Timothy Mercer is a Group Leader at the Australian Institute for Bioengineering and Nanotechnology (AIBN) at The University of Queensland, where he is the Scientific Director of the BASE nucleic-acid synthesis facility. He also leads a diverse laboratory and bioinformatic research group into the expression and splicing of synthetic genes. Prior to this, he was Group Leader at the Garvan Institute of Medical Research, Sydney, where he pioneered the use of synthetic RNA and DNA controls to improve the accuracy of clinical genome sequencing. He also developed targeted RNA sequencing approaches for the diagnosis of fusion genes in cancer. Together, this reflects his ongoing interest in the development of genome biotechnologies. Before joining the Garvan, Tim Mercer received his PhD in Genomics from UQ and completed postdoctoral studies in transcriptomics, long-noncoding RNAs and splicing at the Broad Institute, United States, the Centre for Gene Regulation, Spain, and the Max Planck Institute for Cell Biology, Germany.
Resolving the complexity of the human genome using synthetic chromosomal controls: Next-generation sequencing (NGS) can identify mutations in the human genome that cause disease and has been widely adopted in clinical diagnosis. However, the human genome contains many polymorphic, low complexity, and repetitive regions that are difficult to sequence and analyse. Despite their difficulty, these regions include many clinically-important sequences that can inform the treatment of human diseases and improve the diagnostic yield of NGS. To evaluate the accuracy by which these difficult regions are analysed with NGS, we built an in silico decoy chromosome, along with corresponding synthetic DNA reference controls, that encode difficult and clinically-important human genome regions, including repeats, microsatellites, HLA genes and immune-receptors. These synthetic chromosome controls provide a known ground-truth reference against which to measure the performance of diverse sequencing technologies, reagents, and bioinformatic tools. Using this approach, we provide a comprehensive evaluation of short- and long-read sequencing instruments, library preparation methods, and software tools, and identify the errors and systematic bias that confound our resolution of these remaining difficult regions. This study provides an analytical validation of diagnosis using NGS in difficult regions of the human genome and highlights the challenges that remain to resolve these difficult regions.
Building a synthetic genome that encodes DNA, mRNA and protein controls: PhiX-174 was the first genome to be sequenced, in 1977, and has become the most commonly used standard in sequencing, molecular and synthetic biology. However, given the advent of affordable DNA synthesis and de novo gene design, we considered whether we could build a new genome, termed SynX, that is optimized for use as a molecular standard. The SynX genome encodes synthetic genes that are organised into paralogous gene families and provide qualitative and quantitative evaluation of next-generation sequencing performance. The synthetic genes can be in vitro transcribed to form matched synthetic mRNA controls to evaluate RNA sequencing performance. Finally, the synthetic mRNA controls can be in vitro translated to form matched protein controls for high-throughput proteomics methods, such as mass spectrometry. The SynX genome can be independently and sustainably prepared, modified and shared by recipient laboratories using common molecular biology techniques, and be widely used as a universal molecular standard.
May 3 2022, 11AM eastern: Hidden biases in germline structural variant detection
Biography: Dr. Fritz Sedlazeck completed his PhD in 2012 in the group of Dr. Arndt von Haeseler at the Max F. Perutz Laboratory in Vienna. After a two-year postdoc, he transitioned to the lab of Dr. Michael Schatz at Cold Spring Harbor Laboratory and later to Johns Hopkins University. Since 2017 he has led his own group at the Human Genome Sequencing Center at Baylor College of Medicine. Dr. Sedlazeck's group focuses on the mechanisms of SV formation across multiple species and on improving our understanding of how these complex alleles evolve and impact phenotypes.
Abstract: Genomic structural variations (SV) are important determinants of genotypic and phenotypic changes in many organisms. However, the detection of SV from next-generation sequencing data remains challenging. In this study, DNA from a Chinese family quartet is sequenced at three different sequencing centers in triplicate. A total of 288 derivative data sets are generated utilizing different analysis pipelines and compared to identify sources of analytical variability. Mapping methods provide the major contribution to variability, followed by sequencing centers and replicates. Interestingly, SV supported by only one center or replicate often represent true positives with 47.02% and 45.44% overlapping the long-read SV call set, respectively. This is consistent with an overall higher false negative rate for SV calling in centers and replicates compared to mappers (15.72%). Finally, we observe that the SV calling variability also persists in a genotyping approach, indicating the impact of the underlying sequencing and preparation approaches. This study provides the first detailed insights into the sources of variability in SV identification from next-generation sequencing and highlights remaining challenges in SV calling for large cohorts. We further give recommendations on how to reduce SV calling variability and the choice of alignment methodology.
Dr. Fritz Sedlazeck, PhD
Associate Professor, Human Genome Sequencing Center, Baylor College of Medicine
MAQC Society – 2021 SEQC2 Webinar Series
MAQC is an FDA-led, community-wide consortium effort to address issues relating to the application of constantly evolving high-throughput genomics technologies to either assess the safety and efficacy of FDA-regulated products or support their safe and effective use in clinical applications as in vitro diagnostic devices. The MAQC consortium completed three projects between 2005 and 2014 (namely MAQC I, II and III), resulting in ~30 publications. The fourth phase of the effort is captured under the Sequencing Quality Control (SEQC2) initiative, which will be covered in this webinar series. Previous webinars will be available to all MAQC Society members.
Previous Webinars and Video Recordings
Tuesday, February 16th, 2021
A MAQC/SEQC Journey Towards Reproducible Genomics and the MAQC Society
Weida Tong, PhD
President, MCBIOS, and Director, Division of Bioinformatics and Biostatistics, FDA-NCTR
Wendell Jones, PhD
Executive Chair and President, MAQC Society, and Principal Bioinformaticist and Scientific Advisor, Q2 Solutions / EA Genomics
Dr. Wenming Xiao
Senior Scientific Reviewer, Division of Molecular Genetics and Pathology, FDA-OIR
Tuesday, February 23, 2021
Towards Best Practice in Cancer Mutation Detection with Whole-genome and Whole-exome Sequencing
Here we systematically interrogated somatic mutations in paired tumor-normal cell lines to identify factors affecting detection reproducibility and accuracy. Different types of samples with varying input amount and tumor purity were processed using multiple library construction protocols. Whole-genome (WGS) and whole-exome sequencing (WES) were carried out at six sequencing centers followed by processing with nine bioinformatics pipelines to evaluate reproducibility. We identified artifacts of C>A mutations in WES due to sample and library processing and highlighted limitations of bioinformatics tools for artifact detection and removal.
Dr. Xiao had advanced training in biology and computer science in China and the United States. He has numerous publications in peer-reviewed journals such as Nature, PNAS, and N. Engl. J. Med., and received the NIH Director Award in 2010 in recognition of his contributions to cancer biomarker discovery. Dr. Xiao was a principal investigator at FDA and led an international working group to establish reference materials, data sets, analysis pipelines, and quality metrics for cancer mutation detection with NGS technology. Currently, Dr. Xiao is a lead reviewer for NGS-related diagnostic products/applications (510k, IDE or PMA), including Onco-Panel, Whole-Exome Panel, NIPT, Gene Expression Signature, and analytical software (bioinformatics pipelines, knowledge databases), and provides recommendations on regulatory decisions regarding the safety and effectiveness of medical devices.
March 2, 2021, 11:00 am - 12:00 Noon ET
Establishing Reference Data and Call Sets for Benchmarking Cancer Mutation Detection using Whole-genome Sequencing
Dr. Li Tai Fang is currently a Staff Scientist at Endpoint Health, Inc. He worked on the SEQC2’s somatic reference project when he was at Roche Sequencing Solutions. Previously at Bina Technologies Inc., he led Bina Team’s participation in the ICGC-TCGA DREAM Somatic Mutation Calling Challenge (#1 and #2 in Stage 5 indel and SNV sub-challenges), and developed software such as SomaticSeq.
Li Tai Fang, PhD
Staff Scientist, Endpoint Health, Inc.
Charles Wang, MD, PhD, MPH
Professor and Director of Center for Genomics, Loma Linda University School of Medicine
Tuesday, March 9, 2021 11:00 am - 12:00 Noon ET
A Multicenter Study Benchmarking Single-cell RNA Sequencing Technologies using Reference Samples
Companion single-cell Scientific Data paper: https://www.nature.com/articles/s41597-021-00809-x
Charles Wang, MD, PhD, MPH is Director of the Center for Genomics and a tenured full Professor at the Loma Linda University School of Medicine. He has held positions as Clinical Transcriptional Genomics Core Director at Cedars-Sinai Medical Center, Associate Professor of Medicine at the David Geffen School of Medicine at UCLA, and Director of the Functional Genomics Core at City of Hope. Dr. Wang is a well-recognized expert in genomics, with many high-visibility publications in prestigious journals including Nature Biotechnology, Nature Communications and PNAS. He was one of the pioneers of the MAQC and SEQC consortium projects.
Tuesday, March 16, 2021 11:00 am - 12:00 Noon ET
Critical Assessment of Copy Number Variation Calling Using Next Generation Sequencing
Dr. Pirooznia is also an Adjunct Associate Professor at the Johns Hopkins University School of Medicine, where he served as a faculty member for 8 years before joining the NIH in 2016, providing leadership and scientific direction and implementing the high-performance computational laboratory and bioinformatics system. Dr. Pirooznia serves as an editor and reviewer for several scientific journals such as Bioinformatics, Nature Scientific Data, BMC Bioinformatics, and Human Genomics. He is also a member of the American Society of Human Genetics (ASHG) and the International Society for Computational Biology (ISCB).
Mehdi Pirooznia, MD, MSc, PhD
Director of Bioinformatics and Computational Biology Core, NHLBI
Wendell Jones, PhD
Principal Bioinformaticist and Scientific Advisor, Q2 Solutions / EA Genomics - Executive Chair and President, MAQC Society
March 23, 2021, 11:00 am - 12:00 Noon ET
A Verified Genomic Reference Sample for Assessing Performance of Cancer Panels Detecting Small Variants of Low Allele Frequency
March 30, 2021, 11:00 am - 12:00 Noon ET
Multi-lab Cross-oncopanel Study Reveals High Sensitivity and Reproducibility Tailored to Targeted Regions and Allele Frequency Ranges
Best practices for oncopanel sequencing, a tool in cancer diagnosis and treatment, require comprehensive assessments of reproducibility and detection sensitivity. By employing reference materials characterized by the FDA-led SEQC project phase 2 (SEQC2), we performed a cross-platform multi-lab evaluation of eight Pan-Cancer panels representing a broad spectrum of oncopanel technologies. The study reveals consistently high sensitivity across targeted high confidence coding regions and variant types (SNVs vs small indels or MNVs) for variant allele frequencies (VAF) above 5%. Sensitivity was reduced by utilizing VAF thresholds due to inherent variability in VAF measurements. Conversely, enforcing a VAF threshold for reporting had a positive impact on reducing false positive (FP) calls. All panels have low FP rates of approximately 1 FP per Mb or less for VAF greater than 5% in the high confidence coding regions, which led to good reproducibility. Importantly, the FP rate was found to be noticeably and significantly higher outside the high confidence coding regions, resulting in lower reproducibility. Region restriction and VAF thresholds led to low relative technical variability in estimating promising biomarkers such as tumor mutational burden. This study details actionable insights into factors underpinning the sensitivity and reproducibility of oncopanel sequencing.
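The trade-off described above, where a reporting VAF threshold lowers the FP rate but can also cost sensitivity because measured VAFs vary, can be sketched in a few lines. This is an illustration only, not the study's analysis pipeline: the `Call` record, the `evaluate` function, the variant identifiers, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    variant_id: str  # hypothetical identifier for a called variant
    vaf: float       # measured variant allele frequency as a fraction


def evaluate(calls, known_positives, region_size_mb, vaf_threshold=0.05):
    """Return (sensitivity, FP per Mb) after applying a reporting threshold.

    known_positives is the set of variant IDs in the reference truth set;
    region_size_mb is the size of the evaluated region in megabases.
    """
    reported = {c.variant_id for c in calls if c.vaf >= vaf_threshold}
    true_pos = reported & known_positives
    false_pos = reported - known_positives
    sensitivity = len(true_pos) / len(known_positives)
    fp_per_mb = len(false_pos) / region_size_mb
    return sensitivity, fp_per_mb


# Hypothetical panel result over a 2 Mb region with two known variants.
calls = [
    Call("known_1", 0.060),
    Call("known_2", 0.045),  # true variant measured just under 5%
    Call("noise_1", 0.010),
    Call("noise_2", 0.070),
]
truth = {"known_1", "known_2"}

print(evaluate(calls, truth, 2.0, vaf_threshold=0.0))   # no threshold
print(evaluate(calls, truth, 2.0, vaf_threshold=0.05))  # 5% threshold
```

In this toy example the 5% threshold halves the FP rate but also drops the true variant whose measured VAF fell just below the cutoff, which mirrors the effect the abstract attributes to variability in VAF measurements.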
Binsheng Gong, PhD
Staff Fellow, Division of Bioinformatics and Biostatistics, FDA-NCTR
Joshua Xu, PhD
Branch Chief for Research-to-Review — Division of Bioinformatics and Biostatistics, FDA-NCTR and Executive Secretary, MAQC Society
April 6, 2021, 11:00 am - 12:00 Noon ET
Evaluating the Analytical Validity of Circulating Tumor DNA Sequencing for Precision Oncology
April 7, 2021, 11:00 am - 12:00 Noon ET
Robust Cancer Mutation Detection with Deep Learning Models Derived from Tumor-Normal Sequencing Data
Mohammad Sahraeian, PhD
Senior Bioinformatics Scientist, Roche Sequencing Solutions
Christopher Mason, PhD
Associate Professor; Director, WorldQuant Initiative for Quantitative Prediction Physiology and Biophysics/Feil Family Brain and Mind Institute/Institute for Computational Biomedicine, Weill Cornell Medicine
April 13, 2021, 11:00 am - 12:00 Noon ET
The Epigenome Quality Control (EpiQC) Project
April 14, 2021, 11:00 am - 12:00 Noon ET
Multi-Platform Assessment of DNA Sequencing Performance using Human and Bacterial Reference Genomes in the ABRF Next-Generation Sequencing Study
With a background in evolutionary biology, I researched for my PhD the phylogenomics of Myxozoa, a group of bizarre, microscopic endoparasites of economically critical fish (among other hosts) around the world. I explored the genomic underpinnings of how myxozoans, whose closest living relatives are medusozoans (jellyfish, hydras, cube jellies), became so radically different in biology, ecology, and life history.
Jonathan Foox, PhD
Research Associate in Computational Biomedicine, Weill Cornell Medicine
Huixiao Hong, PhD
Chief, Bioinformatics Branch, FDA-NCTR