Catalogue Search | MBRL

Using high-resolution variant frequencies to empower clinical genome interpretation

by Cook, Stuart A , Minikel, Eric , MacArthur, Daniel in 631/1647/2217/457/649 , 631/208/212 , 631/208/2489

2017

Purpose Whole-exome and whole-genome sequencing have transformed the discovery of genetic variants that cause human Mendelian disease, but discriminating pathogenic from benign variants remains a daunting challenge. Rarity is recognized as a necessary, although not sufficient, criterion for pathogenicity, but frequency cutoffs used in Mendelian analysis are often arbitrary and overly lenient. Recent very large reference datasets, such as the Exome Aggregation Consortium (ExAC), provide an unprecedented opportunity to obtain robust frequency estimates even for very rare variants. Methods We present a statistical framework for the frequency-based filtering of candidate disease-causing variants, accounting for disease prevalence, genetic and allelic heterogeneity, inheritance mode, penetrance, and sampling variance in reference datasets. Results Using the example of cardiomyopathy, we show that our approach reduces by two-thirds the number of candidate variants under consideration in the average exome, without removing true pathogenic variants (false-positive rate < 0.001). Conclusion We outline a statistically robust framework for assessing whether a variant is “too common” to be causative for a Mendelian disorder of interest. We present precomputed allele frequency cutoffs for all variants in the ExAC dataset.
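
The kind of frequency filter this abstract describes can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes a dominant disorder, assumes that the maximum credible population allele frequency is prevalence × maximum genetic contribution × maximum allelic contribution ÷ penetrance ÷ 2, and assumes that sampling variance is handled with a one-sided 95% Poisson bound on the allele count. The function names and the example parameter values are hypothetical and do not come from the abstract.

```python
import math


def max_credible_af(prevalence, max_allelic_contribution, penetrance,
                    max_genetic_contribution=1.0):
    """Assumed upper bound on population allele frequency (dominant model only)."""
    if not 0 < penetrance <= 1:
        raise ValueError("penetrance must be in (0, 1]")
    # Divide by 2 because each individual carries two alleles.
    return (prevalence * max_genetic_contribution
            * max_allelic_contribution / penetrance / 2)


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def max_tolerated_allele_count(max_af, allele_number, confidence=0.95):
    """Smallest count k with P(X <= k) >= confidence at the expected count."""
    lam = max_af * allele_number  # expected allele count in the reference sample
    k = 0
    while poisson_cdf(k, lam) < confidence:
        k += 1
    return k


# Illustrative values only: prevalence 1 in 500, no single variant explaining
# more than 2% of cases, penetrance 0.5, 120,000 reference alleles.
af_cutoff = max_credible_af(1 / 500, 0.02, 0.5)
ac_cutoff = max_tolerated_allele_count(af_cutoff, 120_000)
print(af_cutoff, ac_cutoff)
```

Under these assumptions, a variant observed more often than `ac_cutoff` times in the reference sample would be treated as too common to be causative. Recessive inheritance needs a different frequency formula and is deliberately left out of the sketch.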

Journal Article

Recommendations for clinical interpretation of variants found in non-coding regions of the genome

by Heidi L. Rehm , Diana Baralle , Richard D. Bagnall in Binding sites , Bioinformatics , Biomedical and Life Sciences

2022

Background The majority of clinical genetic testing focuses almost exclusively on regions of the genome that directly encode proteins. The important role of variants in non-coding regions in penetrant disease is, however, increasingly being demonstrated, and the use of whole genome sequencing in clinical diagnostic settings is rising across a large range of genetic disorders. Despite this, there is no existing guidance on how current guidelines designed primarily for variants in protein-coding regions should be adapted for variants identified in other genomic contexts. Methods We convened a panel of nine clinical and research scientists with wide-ranging expertise in clinical variant interpretation, with specific experience in variants within non-coding regions. This panel discussed and refined an initial draft of the guidelines which were then extensively tested and reviewed by external groups. Results We discuss considerations specifically for variants in non-coding regions of the genome. We outline how to define candidate regulatory elements, highlight examples of mechanisms through which non-coding region variants can lead to penetrant monogenic disease, and outline how existing guidelines can be adapted for the interpretation of these variants. Conclusions These recommendations aim to increase the number and range of non-coding region variants that can be clinically interpreted, which, together with a compatible phenotype, can lead to new diagnoses and catalyse the discovery of novel disease mechanisms.

Journal Article

Capture Hi-C identifies the chromatin interactome of colorectal cancer risk loci

by Yuan, Yinyin , Houlston, Richard S. , Henrion, Marc in 13/106 , 14/32 , 38/22

2015

Multiple regulatory elements distant from their targets on the linear genome can influence the expression of a single gene through chromatin looping. Chromosome conformation capture implemented in Hi-C allows for genome-wide agnostic characterization of chromatin contacts. However, detection of functional enhancer–promoter interactions is precluded by its effective resolution that is determined by both restriction fragmentation and sensitivity of the experiment. Here we develop a capture Hi-C (cHi-C) approach to allow an agnostic characterization of these physical interactions on a genome-wide scale. Single-nucleotide polymorphisms associated with complex diseases often reside within regulatory elements and exert effects through long-range regulation of gene expression. Applying this cHi-C approach to 14 colorectal cancer risk loci allows us to identify key long-range chromatin interactions in cis and trans involving these loci. Multiple regulatory elements distant from their targets on the linear genome can influence gene expression through chromatin looping. Here, the authors report an improved chromosome conformation capture approach that can be used to identify long-range chromatin interactions in cancer risk loci.

Journal Article

Improving estimates of loss-of-function constraint for short genes

by Whiffin, Nicola in 631/114/2785 , 631/208 , Agriculture

2024

Genetic constraint identifies genes under selection against loss-of-function, but existing methods are inaccurate for shorter genes. A new study overcomes this key limitation to ascribe more confident predictions to all human protein-coding genes.

Journal Article

A systematic analysis of splicing variants identifies new diagnoses in the 100,000 Genomes Project

by Blakes, Alexander J. M. , Baralle, Diana , Douglas, Andrew G. L. in Analysis , Annotations , Artificial intelligence

2022

Background Genomic variants which disrupt splicing are a major cause of rare genetic diseases. However, variants which lie outside of the canonical splice sites are difficult to interpret clinically. Improving the clinical interpretation of non-canonical splicing variants offers a major opportunity to uplift diagnostic yields from whole genome sequencing data. Methods Here, we examine the landscape of splicing variants in whole-genome sequencing data from 38,688 individuals in the 100,000 Genomes Project and assess the contribution of non-canonical splicing variants to rare genetic diseases. We use a variant-level constraint metric (the mutability-adjusted proportion of singletons) to identify constrained functional variant classes near exon–intron junctions and at putative splicing branchpoints. To identify new diagnoses for individuals with unsolved rare diseases in the 100,000 Genomes Project, we identified individuals with de novo single-nucleotide variants near exon–intron boundaries and at putative splicing branchpoints in known disease genes. We identified candidate diagnostic variants through manual phenotype matching and confirmed new molecular diagnoses through clinical variant interpretation and functional RNA studies. Results We show that near-splice positions and splicing branchpoints are highly constrained by purifying selection and harbour potentially damaging non-coding variants which are amenable to systematic analysis in sequencing data. From 258 de novo splicing variants in known rare disease genes, we identify 35 new likely diagnoses in probands with an unsolved rare disease. To date, we have confirmed a new diagnosis for six individuals, including four in whom RNA studies were performed. Conclusions Overall, we demonstrate the clinical value of examining non-canonical splicing variants in individuals with unsolved rare diseases.

Journal Article

Differences in 5'untranslated regions highlight the importance of translational regulation of dosage sensitive genes

by Martin-Geary, Alexandra C. , Aspden, Julie L. , Schafer, Sebastian in 5' Untranslated Regions , 5’UTR , alleles

2024

Background Untranslated regions (UTRs) are important mediators of post-transcriptional regulation. The length of UTRs and the composition of regulatory elements within them are known to vary substantially across genes, but little is known about the reasons for this variation in humans. Here, we set out to determine whether this variation, specifically in 5'UTRs, correlates with gene dosage sensitivity. Results We investigate 5'UTR length, the number of alternative transcription start sites, the potential for alternative splicing, the number and type of upstream open reading frames (uORFs) and the propensity of 5'UTRs to form secondary structures. We explore how these elements vary by gene tolerance to loss-of-function (LoF; using the LOEUF metric), and in genes where changes in dosage are known to cause disease. We show that LOEUF correlates with 5'UTR length and complexity. Genes that are most intolerant to LoF have longer 5'UTRs, greater TSS diversity, and more upstream regulatory elements than their LoF tolerant counterparts. We show that these differences are evident in disease gene-sets, but not in recessive developmental disorder genes where LoF of a single allele is tolerated. Conclusions Our results confirm the importance of post-transcriptional regulation through 5'UTRs in tight regulation of mRNA and protein levels, particularly for genes where changes in dosage are deleterious and lead to disease. Finally, to support gene-based investigation we release a web-based browser tool, VuTR, that supports exploration of the composition of individual 5'UTRs and the impact of genetic variation within them.

Journal Article

CardioClassifier: disease- and gene-specific computational decision support for clinical genome interpretation

by Edwards, Matthew , Buchan, Rachel , Prasad, Sanjay K in Biomedical and Life Sciences , Biomedicine , Cardiomyopathy

2018

Purpose Internationally adopted variant interpretation guidelines from the American College of Medical Genetics and Genomics (ACMG) are generic and require disease-specific refinement. Here we developed CardioClassifier (http://www.cardioclassifier.org), a semiautomated decision-support tool for inherited cardiac conditions (ICCs). Methods CardioClassifier integrates data retrieved from multiple sources with user-input case-specific information, through an interactive interface, to support variant interpretation. Combining disease- and gene-specific knowledge with variant observations in large cohorts of cases and controls, we refined 14 computational ACMG criteria and created three ICC-specific rules. Results We benchmarked CardioClassifier on 57 expertly curated variants and show full retrieval of all computational data, concordantly activating 87.3% of rules. A generic annotation tool identified fewer than half as many clinically actionable variants (64/219 vs. 156/219, Fisher’s P = 1.1 × 10⁻¹⁸), with important false positives, illustrating the critical importance of disease and gene-specific annotations. CardioClassifier identified putatively disease-causing variants in 33.7% of 327 cardiomyopathy cases, comparable with leading ICC laboratories. Through addition of manually curated data, variants found in over 40% of cardiomyopathy cases are fully annotated, without requiring additional user-input data. Conclusion CardioClassifier is an ICC-specific decision-support tool that integrates expertly curated computational annotations with case-specific data to generate fast, reproducible, and interactive variant pathogenicity reports, according to best practice guidelines.
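
For readers who want to see how a comparison such as 64/219 versus 156/219 is tested, the following Python sketch computes a two-sided Fisher's exact P value from a 2×2 table using only the standard library. It assumes the two proportions are treated as independent groups of 219 variants each; the abstract does not say how the authors set up their test, so the result need not match the reported figure exactly. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_obs * (1 + 1e-9))


# Actionable vs. not actionable, generic tool (64 of 219) vs. 156 of 219.
p_value = fisher_exact_two_sided(64, 219 - 64, 156, 219 - 156)
print(p_value)
```

Exact integer binomial coefficients keep the calculation stable even though the individual table probabilities are extremely small.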

Journal Article

Quantitative approaches to variant classification increase the yield and precision of genetic testing in Mendelian diseases: the case of hypertrophic cardiomyopathy

by Buchan, Rachel , Mazaika, Erica , Wilk, Alicja in ACMG/AMP guidelines , Adaptation , Algorithms

2019

Background International guidelines for variant interpretation in Mendelian disease set stringent criteria to report a variant as (likely) pathogenic, prioritising control of false-positive rate over test sensitivity and diagnostic yield. Genetic testing is also more likely informative in individuals with well-characterised variants from extensively studied European-ancestry populations. Inherited cardiomyopathies are relatively common Mendelian diseases that allow empirical calibration and assessment of this framework. Methods We compared rare variants in large hypertrophic cardiomyopathy (HCM) cohorts (up to 6179 cases) to reference populations to identify variant classes with high prior likelihoods of pathogenicity, as defined by etiological fraction (EF). We analysed the distribution of variants using a bespoke unsupervised clustering algorithm to identify gene regions in which variants are significantly clustered in cases. Results Analysis of variant distribution identified regions in which variants are significantly enriched in cases and variant location was a better discriminator of pathogenicity than generic computational functional prediction algorithms. Non-truncating variant classes with an EF ≥ 0.95 were identified in five established HCM genes. Applying this approach leads to an estimated 14–20% increase in cases with actionable HCM variants, i.e. variants classified as pathogenic/likely pathogenic that might be used for predictive testing in probands’ relatives. Conclusions When found in a patient confirmed to have disease, novel variants in some genes and regions are empirically shown to have a sufficiently high probability of pathogenicity to support a “likely pathogenic” classification, even without additional segregation or functional data. This could increase the yield of high confidence actionable variants, consistent with the framework and recommendations of current guidelines. 
The techniques outlined offer a consistent and unbiased approach to variant interpretation for Mendelian disease genetic testing. We propose adaptations to ACMG/AMP guidelines to incorporate such evidence in a quantitative and transparent manner.
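
The etiological fraction mentioned in this abstract can be illustrated with a short Python sketch. It assumes the standard epidemiological definition EF = (OR − 1) / OR, where OR is the odds ratio for carrying a variant of a given class in cases versus a reference population; the abstract itself does not give the formula. The function names and the counts in the example are hypothetical.

```python
def odds_ratio(case_carriers, case_total, ref_carriers, ref_total):
    """Odds ratio for carrying a variant of the class: cases vs. reference."""
    case_non_carriers = case_total - case_carriers
    ref_non_carriers = ref_total - ref_carriers
    if case_non_carriers == 0 or ref_carriers == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return (case_carriers * ref_non_carriers) / (case_non_carriers * ref_carriers)


def etiological_fraction(or_value):
    """Assumed definition: EF = (OR - 1) / OR."""
    if or_value <= 0:
        raise ValueError("odds ratio must be positive")
    return (or_value - 1) / or_value


# Hypothetical counts: 120 carriers among 6,000 cases,
# 50 carriers among 60,000 reference individuals.
or_value = odds_ratio(120, 6_000, 50, 60_000)
ef = etiological_fraction(or_value)
print(round(or_value, 2), round(ef, 3))
```

Under this definition, the EF ≥ 0.95 threshold quoted in the abstract corresponds to an odds ratio of at least 20.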

Journal Article

Disease-specific variant pathogenicity prediction significantly improves variant interpretation in inherited cardiac conditions

by Buchan, Rachel , Barton, Paul J.R. , Mazaika, Erica in Algorithms , Area Under Curve , Biomedical and Life Sciences

2021

Accurate discrimination of benign and pathogenic rare variation remains a priority for clinical genome interpretation. State-of-the-art machine learning variant prioritization tools are imprecise and ignore important parameters defining gene–disease relationships, e.g., distinct consequences of gain-of-function versus loss-of-function variants. We hypothesized that incorporating disease-specific information would improve tool performance. We developed a disease-specific variant classifier, CardioBoost, that estimates the probability of pathogenicity for rare missense variants in inherited cardiomyopathies and arrhythmias. We assessed CardioBoost’s ability to discriminate known pathogenic from benign variants, prioritize disease-associated variants, and stratify patient outcomes. CardioBoost has high global discrimination accuracy (precision recall area under the curve [AUC] 0.91 for cardiomyopathies; 0.96 for arrhythmias), outperforming existing tools (4–24% improvement). CardioBoost obtains excellent accuracy (cardiomyopathies 90.2%; arrhythmias 91.9%) for variants classified with >90% confidence, and increases the proportion of variants classified with high confidence more than twofold compared with existing tools. Variants classified as disease-causing are associated with both disease status and clinical severity, including a 21% increased risk (95% confidence interval [CI] 11–29%) of severe adverse outcomes by age 60 in patients with hypertrophic cardiomyopathy. A disease-specific variant classifier outperforms state-of-the-art genome-wide tools for rare missense variants in inherited cardiac conditions (https://www.cardiodb.org/cardioboost/), highlighting broad opportunities for improved pathogenicity prediction through disease specificity.

Journal Article

Genetic constraint at single amino acid resolution in protein domains improves missense variant prioritisation and gene discovery

by Zhang, Xiaolei , Ware, James S. , Wright, Caroline F. in Amino acids , Bioinformatics , Biomedical and Life Sciences

2024

Background One of the major hurdles in clinical genetics is interpreting the clinical consequences associated with germline missense variants in humans. Recent significant advances have leveraged natural variation observed in large-scale human populations to uncover genes or genomic regions that show a depletion of natural variation, indicative of selection pressure. We refer to this as “genetic constraint”. Although existing genetic constraint metrics have been demonstrated to be successful in prioritising genes or genomic regions associated with diseases, their spatial resolution is limited in distinguishing pathogenic variants from benign variants within genes. Methods We aim to identify missense variants that are significantly depleted in the general human population. Given the size of currently available human populations with exome or genome sequencing data, it is not possible to directly detect depletion of individual missense variants, since the average expected number of observations of a variant at most positions is less than one. We instead focus on protein domains, grouping homologous variants with similar functional impacts to examine the depletion of natural variations within these comparable sets. To accomplish this, we develop the Homologous Missense Constraint (HMC) score. We utilise the Genome Aggregation Database (gnomAD) 125 K exome sequencing data and evaluate genetic constraint at quasi amino-acid resolution by combining signals across protein homologues. Results We identify one million possible missense variants under strong negative selection within protein domains. Though our approach annotates only protein domains, it nonetheless allows us to assess 22% of the exome confidently. It precisely distinguishes pathogenic variants from benign variants for both early-onset and adult-onset disorders. 
It outperforms existing constraint metrics and pathogenicity meta-predictors in prioritising de novo mutations from probands with developmental disorders (DD). It is also methodologically independent of these, adding power to predict variant pathogenicity when used in combination. We demonstrate utility for gene discovery by identifying seven genes newly significantly associated with DD that could act through an altered-function mechanism. Conclusions Grouping variants of comparable functional impacts is effective in evaluating their genetic constraint. HMC is a novel and accurate predictor of missense consequence for improved variant interpretation.

Journal Article
