Catalogue Search | MBRL

An Atlas of Variant Effects to understand the genome at nucleotide resolution

by Hahn, William C. , Roth, Frederick P. , Adams, David J. in Animal Genetics and Genomics , Bioinformatics , Biology

2023

Sequencing has revealed hundreds of millions of human genetic variants, and continued efforts will only add to this variant avalanche. Insufficient information exists to interpret the effects of most variants, limiting opportunities for precision medicine and comprehension of genome function. A solution lies in experimental assessment of the functional effect of variants, which can reveal their biological and clinical impact. However, variant effect assays have generally been undertaken reactively for individual variants only after and, in most cases long after, their first observation. Now, multiplexed assays of variant effect can characterise massive numbers of variants simultaneously, yielding variant effect maps that reveal the function of every possible single nucleotide change in a gene or regulatory element. Generating maps for every protein encoding gene and regulatory element in the human genome would create an ‘Atlas’ of variant effect maps and transform our understanding of genetics and usher in a new era of nucleotide-resolution functional knowledge of the genome. An Atlas would reveal the fundamental biology of the human genome, inform human evolution, empower the development and use of therapeutics and maximize the utility of genomics for diagnosing and treating disease. The Atlas of Variant Effects Alliance is an international collaborative group comprising hundreds of researchers, technologists and clinicians dedicated to realising an Atlas of Variant Effects to help deliver on the promise of genomics.

Journal Article

Share this book

Add to My Shelf

Cross-protein transfer learning substantially improves disease variant prediction

by Ye, Chengzhong , Koehl, Antoine , Ioannidis, Nilah in Amino Acid Sequence , amino acid sequences , Animal Genetics and Genomics

2023

Background Genetic variation in the human genome is a major determinant of individual disease risk, but the vast majority of missense variants have unknown etiological effects. Here, we present a robust learning framework for leveraging saturation mutagenesis experiments to construct accurate computational predictors of proteome-wide missense variant pathogenicity. Results We train cross-protein transfer (CPT) models using deep mutational scanning (DMS) data from only five proteins and achieve state-of-the-art performance on clinical variant interpretation for unseen proteins across the human proteome. We also improve predictive accuracy on DMS data from held-out proteins. High sensitivity is crucial for clinical applications and our model CPT-1 particularly excels in this regime. For instance, at 95% sensitivity of detecting human disease variants annotated in ClinVar, CPT-1 improves specificity to 68%, from 27% for ESM-1v and 55% for EVE. Furthermore, for genes not used to train REVEL, a supervised method widely used by clinicians, we show that CPT-1 compares favorably with REVEL. Our framework combines predictive features derived from general protein sequence models, vertebrate sequence alignments, and AlphaFold structures, and it is adaptable to the future inclusion of other sources of information. We find that vertebrate alignments, albeit rather shallow with only 100 genomes, provide a strong signal for variant pathogenicity prediction that is complementary to recent deep learning-based models trained on massive amounts of protein sequence data. We release predictions for all possible missense variants in 90% of human genes. Conclusions Our results demonstrate the utility of mutational scanning data for learning properties of variants that transfer to unseen proteins.

Journal Article

Share this book

Add to My Shelf

A comprehensive map of human glucokinase variant activity

by Roth, Frederick P. , Li, Roujia , Gebbia, Marinella in active sites , Animal Genetics and Genomics , Bioinformatics

2023

Background Glucokinase (GCK) regulates insulin secretion to maintain appropriate blood glucose levels. Sequence variants can alter GCK activity to cause hyperinsulinemic hypoglycemia or hyperglycemia associated with GCK-maturity-onset diabetes of the young (GCK-MODY), collectively affecting up to 10 million people worldwide. Patients with GCK-MODY are frequently misdiagnosed and treated unnecessarily. Genetic testing can prevent this but is hampered by the challenge of interpreting novel missense variants. Result Here, we exploit a multiplexed yeast complementation assay to measure both hyper- and hypoactive GCK variation, capturing 97% of all possible missense and nonsense variants. Activity scores correlate with in vitro catalytic efficiency, fasting glucose levels in carriers of GCK variants and with evolutionary conservation. Hypoactive variants are concentrated at buried positions, near the active site, and at a region of known importance for GCK conformational dynamics. Some hyperactive variants shift the conformational equilibrium towards the active state through a relative destabilization of the inactive conformation. Conclusion Our comprehensive assessment of GCK variant activity promises to facilitate variant interpretation and diagnosis, expand our mechanistic understanding of hyperactive variants, and inform development of therapeutics targeting GCK.

Journal Article

Share this book

Add to My Shelf

DIMPLE: deep insertion, deletion, and missense mutation libraries for exploring protein variation in evolution, disease, and biology

by Macdonald, Christian B. , Trinidad, Donovan , Coyote-Maestas, Willow in Amino acids , Animal Genetics and Genomics , Bias

2023

Insertions and deletions (indels) enable evolution and cause disease. Due to technical challenges, indels are left out of most mutational scans, limiting our understanding of them in disease, biology, and evolution. We develop a low cost and bias method, DIMPLE, for systematically generating deletions, insertions, and missense mutations in genes, which we test on a range of targets, including Kir2.1. We use DIMPLE to study how indels impact potassium channel structure, disease, and evolution. We find deletions are most disruptive overall, beta sheets are most sensitive to indels, and flexible loops are sensitive to deletions yet tolerate insertions.

Journal Article

Share this book

Add to My Shelf

Leveraging massively parallel reporter assays for evolutionary questions

by Gallego Romero, Irene , Lea, Amanda J. in Animal Genetics and Genomics , Bioinformatics , Biomedical and Life Sciences

2023

A long-standing goal of evolutionary biology is to decode how gene regulation contributes to organismal diversity. Doing so is challenging because it is hard to predict function from non-coding sequence and to perform molecular research with non-model taxa. Massively parallel reporter assays (MPRAs) enable the testing of thousands to millions of sequences for regulatory activity simultaneously. Here, we discuss the execution, advantages, and limitations of MPRAs, with a focus on evolutionary questions. We propose solutions for extending MPRAs to rare taxa and those with limited genomic resources, and we underscore MPRA’s broad potential for driving genome-scale, functional studies across organisms.

Journal Article

Share this book

Add to My Shelf

High-throughput deep learning variant effect prediction with Sequence UNET

by Beltrao, Pedro , Dunham, Alistair S. , AlQuraishi, Mohammed in Amino acids , Animal Genetics and Genomics , Bioinformatics

2023

Understanding coding mutations is important for many applications in biology and medicine but the vast mutation space makes comprehensive experimental characterisation impossible. Current predictors are often computationally intensive and difficult to scale, including recent deep learning models. We introduce Sequence UNET, a highly scalable deep learning architecture that classifies and predicts variant frequency from sequence alone using multi-scale representations from a fully convolutional compression/expansion architecture. It achieves comparable pathogenicity prediction to recent methods. We demonstrate scalability by analysing 8.3B variants in 904,134 proteins detected through large-scale proteomics. Sequence UNET runs on modest hardware with a simple Python package.

Journal Article

Share this book

Add to My Shelf

Benchmarking splice variant prediction algorithms using massively parallel splicing assays

by Kitzman, Jacob O. , Smith, Cathy in Algorithms , Alternative splicing , Animal Genetics and Genomics

2023

Background Variants that disrupt mRNA splicing account for a sizable fraction of the pathogenic burden in many genetic disorders, but identifying splice-disruptive variants (SDVs) beyond the essential splice site dinucleotides remains difficult. Computational predictors are often discordant, compounding the challenge of variant interpretation. Because they are primarily validated using clinical variant sets heavily biased to known canonical splice site mutations, it remains unclear how well their performance generalizes. Results We benchmark eight widely used splicing effect prediction algorithms, leveraging massively parallel splicing assays (MPSAs) as a source of experimentally determined ground-truth. MPSAs simultaneously assay many variants to nominate candidate SDVs. We compare experimentally measured splicing outcomes with bioinformatic predictions for 3,616 variants in five genes. Algorithms’ concordance with MPSA measurements, and with each other, is lower for exonic than intronic variants, underscoring the difficulty of identifying missense or synonymous SDVs. Deep learning-based predictors trained on gene model annotations achieve the best overall performance at distinguishing disruptive and neutral variants, and controlling for overall call rate genome-wide, SpliceAI and Pangolin have superior sensitivity. Finally, our results highlight two practical considerations when scoring variants genome-wide: finding an optimal score cutoff, and the substantial variability introduced by differences in gene model annotation, and we suggest strategies for optimal splice effect prediction in the face of these issues. Conclusion SpliceAI and Pangolin show the best overall performance among predictors tested, however, improvements in splice effect prediction are still needed especially within exons.

Journal Article

Share this book

Add to My Shelf

Benchmarking computational variant effect predictors by their ability to infer human traits

by Roden, Dan M. , Roth, Frederick P. , Li, Roujia in All of Us , Animal Genetics and Genomics , Benchmarking

2024

Background Computational variant effect predictors offer a scalable and increasingly reliable means of interpreting human genetic variation, but concerns of circularity and bias have limited previous methods for evaluating and comparing predictors. Population-level cohorts of genotyped and phenotyped participants that have not been used in predictor training can facilitate an unbiased benchmarking of available methods. Using a curated set of human gene-trait associations with a reported rare-variant burden association, we evaluate the correlations of 24 computational variant effect predictors with associated human traits in the UK Biobank and All of Us cohorts. Results AlphaMissense outperformed all other predictors in inferring human traits based on rare missense variants in UK Biobank and All of Us participants. The overall rankings of computational variant effect predictors in these two cohorts showed a significant positive correlation. Conclusion We describe a method to assess computational variant effect predictors that sidesteps the limitations of previous evaluations. This approach is generalizable to future predictors and could continue to inform predictor choice for personal and clinical genetics.

Journal Article

Share this book

Add to My Shelf

Saturation-scale functional evidence supports clinical variant interpretation in Lynch syndrome

by Chamberlin, Adam , Karam, Rachid , Kitzman, Jacob O. in Animal Genetics and Genomics , Bioinformatics , Biomedical and Life Sciences

2022

Background Lynch syndrome (LS) is a cancer predisposition syndrome affecting more than 1 in every 300 individuals worldwide. Clinical genetic testing for LS can be life-saving but is complicated by the heavy burden of variants of uncertain significance (VUS), especially missense changes. Result To address this challenge, we leverage a multiplexed analysis of variant effect (MAVE) map covering >94% of the 17,746 possible missense variants in the key LS gene MSH2 . To establish this map’s utility in large-scale variant reclassification, we overlay it on clinical databases of >15,000 individuals with LS gene variants uncovered during clinical genetic testing. We validate these functional measurements in a cohort of individuals with paired tumor-normal test results and find that MAVE-based function scores agree with the clinical interpretation for every one of the MSH2 missense variants with an available classification. We use these scores to attempt reclassification for 682 unique missense VUS, among which 34 scored as deleterious by our function map, in line with previously published rates for other cancer predisposition genes. Combining functional data and other evidence, ten missense VUS are reclassified as pathogenic/likely pathogenic, and another 497 could be moved to benign/likely benign. Finally, we apply these functional scores to paired tumor-normal genetic tests and identify a subset of patients with biallelic somatic loss of function, reflecting a sporadic Lynch-like Syndrome with distinct implications for treatment and relatives’ risk. Conclusion This study demonstrates how high-throughput functional assays can empower scalable VUS resolution and prospectively generate strong evidence for variant classification.

Journal Article

Share this book

Add to My Shelf

Guidelines for releasing a variant effect predictor

by Roth, Frederick P. , Marsh, Joseph A. , Orenbuch, Rose in Algorithms , Animal Genetics and Genomics , Bioinformatics

2025

Computational methods for assessing the likely impacts of mutations, known as variant effect predictors (VEPs), are widely used in the assessment and interpretation of human genetic variation, as well as in other applications like protein engineering. Many different VEPs have been released, and there is tremendous variability in their underlying algorithms, outputs, and the ways in which the methodologies and predictions are shared. This leads to considerable difficulties for users trying to navigate the selection and application of VEPs. Here, to address these issues, we provide guidelines and recommendations for the release of novel VEPs.

Journal Article

Share this book

Add to My Shelf

Language Selector

MBRLGlobalSearch

Language Selector

Catalogue Search | MBRL

Search Results Heading

Explore the vast range of titles available.

MBRLSearchResults

MBRLHappinessMeter