Catalogue Search | MBRL

Loss-of-function, gain-of-function and dominant-negative mutations have profoundly different effects on protein structure

by Livesey, Benjamin J. , Gerasimavicius, Lukas , Marsh, Joseph A. in 631/114/663 , 631/208/737 , Computer applications

2022

Most known pathogenic mutations occur in protein-coding regions of DNA and change the way proteins are made. Taking protein structure into account has therefore provided great insight into the molecular mechanisms underlying human genetic disease. While there has been much focus on how mutations can disrupt protein structure and thus cause a loss of function (LOF), alternative mechanisms, specifically dominant-negative (DN) and gain-of-function (GOF) effects, are less understood. Here, we investigate the protein-level effects of pathogenic missense mutations associated with different molecular mechanisms. We observe striking differences between recessive vs dominant, and LOF vs non-LOF mutations, with dominant, non-LOF disease mutations having much milder effects on protein structure, and DN mutations being highly enriched at protein interfaces. We also find that nearly all computational variant effect predictors, even those based solely on sequence conservation, underperform on non-LOF mutations. However, we do show that non-LOF mutations could potentially be identified by their tendency to cluster in three-dimensional space. Overall, our work suggests that many pathogenic mutations that act via DN and GOF mechanisms are likely being missed by current variant prioritisation strategies, but that there is considerable scope to improve computational predictions through consideration of molecular disease mechanisms.

Editor's summary: Most known pathogenic mutations occur in protein-coding regions of DNA and change the way proteins are made. Here the authors analyse the locations of thousands of human disease mutations and their predicted effects on protein structure and show that, while loss-of-function mutations tend to be highly disruptive, non-loss-of-function mutations are in general much milder at a protein structural level.

Journal Article

Using deep mutational scanning to benchmark variant effect predictors and identify disease mutations

by Livesey, Benjamin J , Marsh, Joseph A in Amino acids , Benchmarks , Computer applications

2020

To deal with the huge number of novel protein‐coding variants identified by genome and exome sequencing studies, many computational variant effect predictors (VEPs) have been developed. Such predictors are often trained and evaluated using different variant data sets, making a direct comparison between VEPs difficult. In this study, we use 31 previously published deep mutational scanning (DMS) experiments, which provide quantitative, independent phenotypic measurements for large numbers of single amino acid substitutions, in order to benchmark and compare 46 different VEPs. We also evaluate the ability of DMS measurements and VEPs to discriminate between pathogenic and benign missense variants. We find that DMS experiments tend to be superior to the top‐ranking predictors, demonstrating the tremendous potential of DMS for identifying novel human disease mutations. Among the VEPs, DeepSequence clearly stood out, showing both the strongest correlations with DMS data and having the best ability to predict pathogenic mutations, which is especially remarkable given that it is an unsupervised method. We further recommend SNAP2, DEOGEN2, SNPs&GO, SuSPect and REVEL based upon their performance in these analyses.

Synopsis: Data from deep mutational scans is used to benchmark computational protein variant effect predictors using fully independent data. The performance of deep mutational scanning is also compared to computational predictors for identifying pathogenic variants. DeepSequence is the method that correlates the best with deep mutational scanning data for human proteins. Predictor performance depends heavily on the protein and fitness metric. For this reason, using results from multiple predictors is recommended. Other recommended predictors include SNAP2, DEOGEN2, SNPs&GO, SuSPect and REVEL. Deep mutational scanning is generally superior to variant effect predictors for distinguishing pathogenic from benign variants.
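As a rough illustration of what benchmarking predictors against DMS data can look like, the sketch below ranks predictors by the absolute Spearman correlation between their scores and DMS measurements for the same variants. It is a minimal sketch, not the authors' pipeline: the predictor names and all numbers are invented, and taking the absolute value (because predictors and assays differ in whether high scores mean "damaging" or "functional") is an assumption made here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_predictors(dms_scores, predictions):
    """Sort predictors by |Spearman rho| against one DMS dataset, best first."""
    results = {
        name: abs(spearman(dms_scores, scores))
        for name, scores in predictions.items()
    }
    return sorted(results.items(), key=lambda item: item[1], reverse=True)


# Toy data: five variants of one hypothetical protein.
dms = [0.95, 0.80, 0.10, 0.55, 0.30]  # higher = more functional (assumed)
predictions = {
    "predictor_a": [0.1, 0.2, 0.9, 0.4, 0.7],  # higher = more damaging (assumed)
    "predictor_b": [0.3, 0.9, 0.5, 0.1, 0.6],
}

ranking = rank_predictors(dms, predictions)
for name, rho in ranking:
    print(f"{name}: |rho| = {rho:.2f}")
```

In a real comparison the correlations would be computed per protein and then combined across proteins, since the abstract notes that predictor performance depends heavily on the protein and fitness metric.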

Journal Article

Updated benchmarking of variant effect predictors using deep mutational scanning

by Livesey, Benjamin J , Marsh, Joseph A in Amino acids , Benchmark , Benchmarking

2023

The assessment of variant effect predictor (VEP) performance is fraught with biases introduced by benchmarking against clinical observations. In this study, building on our previous work, we use independently generated measurements of protein function from deep mutational scanning (DMS) experiments for 26 human proteins to benchmark 55 different VEPs, while introducing minimal data circularity. Many top‐performing VEPs are unsupervised methods including EVE, DeepSequence and ESM‐1v, a protein language model that ranked first overall. However, the strong performance of recent supervised VEPs, in particular VARITY, shows that developers are taking data circularity and bias issues seriously. We also assess the performance of DMS and unsupervised VEPs for discriminating between known pathogenic and putatively benign missense variants. Our findings are mixed, demonstrating that some DMS datasets perform exceptionally at variant classification, while others are poor. Notably, we observe a striking correlation between VEP agreement with DMS data and performance in identifying clinically relevant variants, strongly supporting the validity of our rankings and the utility of DMS for independent benchmarking.

Synopsis: Common sources of bias in variant effect predictor benchmarking are assessed using data from deep mutational scanning experiments. ESM‐1v, EVE and DeepSequence are among the top performers on both functionally validated and clinically observed variants. Deep mutational scanning datasets from 26 human proteins are used to benchmark 55 computational predictors of missense variant effect. The top‐performing methods include several very recent predictors and are based mostly on unsupervised machine learning methodologies. There is a strong correlation between predictor performance when benchmarked against deep mutational scanning data and clinical variants.

Journal Article

Proteome-scale prediction of molecular mechanisms underlying dominant genetic diseases

by Marsh, Joseph A. , Badonyi, Mihaly in Analysis , Binary codes , Biological effects

2024

Many dominant genetic disorders result from protein-altering mutations, acting primarily through dominant-negative (DN), gain-of-function (GOF), and loss-of-function (LOF) mechanisms. Deciphering the mechanisms by which dominant diseases exert their effects is often experimentally challenging and resource intensive, but is essential for developing appropriate therapeutic approaches. Diseases that arise via a LOF mechanism are more amenable to be treated by conventional gene therapy, whereas DN and GOF mechanisms may require gene editing or targeting by small molecules. Moreover, pathogenic missense mutations that act via DN and GOF mechanisms are more difficult to identify than those that act via LOF using nearly all currently available variant effect predictors. Here, we introduce a tripartite statistical model made up of support vector machine binary classifiers trained to predict whether human protein coding genes are likely to be associated with DN, GOF, or LOF molecular disease mechanisms. We test the utility of the predictions by examining biologically and clinically meaningful properties known to be associated with the mechanisms. Our results strongly support that the models are able to generalise on unseen data and offer insight into the functional attributes of proteins associated with different mechanisms. We hope that our predictions will serve as a springboard for researchers studying novel variants and those of uncertain clinical significance, guiding variant interpretation strategies and experimental characterisation. Predictions for the human UniProt reference proteome are available at https://osf.io/z4dcp/ .

Journal Article

Identification of pathogenic missense mutations using protein stability predictors

by Gerasimavicius, Lukas , Marsh, Joseph A. , Liu, Xin in 631/114 , 631/114/470 , 631/208/212

2020

Attempts at using protein structures to identify disease-causing mutations have been dominated by the idea that most pathogenic mutations are disruptive at a structural level. Therefore, computational stability predictors, which assess whether a mutation is likely to be stabilising or destabilising to protein structure, have been commonly used when evaluating new candidate disease variants, despite not having been developed specifically for this purpose. We therefore tested 13 different stability predictors for their ability to discriminate between pathogenic and putatively benign missense variants. We find that one method, FoldX, significantly outperforms all other predictors in the identification of disease variants. Moreover, we demonstrate that employing predicted absolute energy change scores improves performance of nearly all predictors in distinguishing pathogenic from benign variants. Importantly, however, we observe that the utility of computational stability predictors is highly heterogeneous across different proteins, and that they are all inferior to the best performing variant effect predictors for identifying pathogenic mutations. We suggest that this is largely due to alternate molecular mechanisms other than protein destabilisation underlying many pathogenic mutations. Thus, better ways of incorporating protein structural information and molecular mechanisms into computational variant effect predictors will be required for improved disease variant prioritisation.
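The point about absolute energy change scores can be made concrete with a small sketch. It assumes predicted ΔΔG values in which positive means destabilising and negative means stabilising, and compares how well the raw and absolute values separate pathogenic from benign variants using the area under the ROC curve. All numbers are invented for illustration and do not come from any predictor or from the study.

```python
def auroc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, which makes this equivalent to the
    Mann-Whitney U statistic divided by the number of pairs.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy predicted stability changes (sign convention and values assumed).
pathogenic_ddg = [3.2, 2.1, -1.8, 4.0, -2.5]  # two variants predicted stabilising
benign_ddg = [0.3, -0.4, 0.8, 0.1, -0.2]      # all close to zero

raw_auroc = auroc(pathogenic_ddg, benign_ddg)
abs_auroc = auroc(
    [abs(v) for v in pathogenic_ddg],
    [abs(v) for v in benign_ddg],
)

print(f"raw ddG AUROC:      {raw_auroc:.2f}")
print(f"absolute ddG AUROC: {abs_auroc:.2f}")
```

In this toy set the two pathogenic variants with large negative values rank below every benign variant on the raw scale, but above them once the sign is dropped, which is why the absolute score separates the classes better here.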

Journal Article

Assessing variant effect predictors and disease mechanisms in intrinsically disordered proteins

by Fawzy, Mohamed , Marsh, Joseph A. in Biology and Life Sciences , Computational Biology - methods , Discordance

2025

Intrinsically disordered regions (IDRs) are central to diverse cellular processes but present unique challenges for interpreting genetic variants implicated in human disease. Unlike structured protein domains, IDRs lack stable three-dimensional conformations and are often involved in regulation through transient interactions and post-translational modifications. These features can affect both the distribution of pathogenic variants and the performance of computational tools used to predict their effects. Here, we systematically assessed the distribution of pathogenic vs benign missense variants across disordered, intermediate, and structured protein regions in the human proteome. Pathogenic variants were notably depleted in IDRs yet were associated with distinct molecular mechanisms, particularly dominant gain- and loss-of-function effects. We evaluated 33 variant effect predictors (VEPs), revealing widespread reductions in sensitivity for pathogenic variants in IDRs, despite high AUROC scores driven by accurate benign variant predictions. We also observed substantial discordance among VEP classifications in disordered regions, underscoring the need for region-aware thresholds and disorder-informed prediction strategies. Incorporating features reflective of IDR biology, such as transient interaction motifs and modification sites, may enhance the accuracy and interpretability of future tools.
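The contrast between high AUROC and reduced sensitivity can be seen in a small sketch. It assumes a predictor that outputs scores between 0 and 1 and a single fixed classification threshold of 0.5 applied everywhere; the scores and the threshold are hypothetical and chosen only to show how the two metrics can diverge.

```python
def auroc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def sensitivity(pathogenic_scores, threshold):
    """Fraction of pathogenic variants scored at or above the threshold."""
    called = sum(1 for s in pathogenic_scores if s >= threshold)
    return called / len(pathogenic_scores)


def specificity(benign_scores, threshold):
    """Fraction of benign variants scored below the threshold."""
    cleared = sum(1 for s in benign_scores if s < threshold)
    return cleared / len(benign_scores)


# Toy scores for variants in a disordered region (values assumed).
idr_pathogenic = [0.45, 0.40, 0.48, 0.70]
idr_benign = [0.05, 0.10, 0.08, 0.12, 0.20]
THRESHOLD = 0.5  # hypothetical global cut-off

idr_auroc = auroc(idr_pathogenic, idr_benign)
idr_sensitivity = sensitivity(idr_pathogenic, THRESHOLD)
idr_specificity = specificity(idr_benign, THRESHOLD)

print(f"AUROC:       {idr_auroc:.2f}")
print(f"sensitivity: {idr_sensitivity:.2f}")
print(f"specificity: {idr_specificity:.2f}")
```

Here every pathogenic variant outranks every benign one, so the AUROC is perfect, yet most pathogenic scores fall below the cut-off and are missed. A lower, region-specific threshold would recover them in this toy case, which is the kind of adjustment the abstract's call for region-aware thresholds points towards.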

Journal Article

The properties of human disease mutations at protein interfaces

by Livesey, Benjamin J. , Marsh, Joseph A. in Amino acid sequence , Amino acids , Biology and Life Sciences

2022

The assembly of proteins into complexes and their interactions with other biomolecules are often vital for their biological function. While it is known that mutations at protein interfaces have a high potential to be damaging and cause human genetic disease, there has been relatively little consideration for how this varies between different types of interfaces. Here we investigate the properties of human pathogenic and putatively benign missense variants at homomeric (isologous and heterologous), heteromeric, DNA, RNA and other ligand interfaces, and at different regions in proteins with respect to those interfaces. We find that different types of interfaces vary greatly in their propensity to be associated with pathogenic mutations, with homomeric heterologous and DNA interfaces being particularly enriched in disease. We also find that residues that do not directly participate in an interface, but are close in three-dimensional space, show a significant disease enrichment. Finally, we observe that mutations at different types of interfaces tend to have distinct property changes when undergoing amino acid substitutions associated with disease, and that this is linked to substantial variability in their identification by computational variant effect predictors.

Journal Article

Variant effect predictor correlation with functional assays is reflective of clinical classification performance

by Livesey, Benjamin J. , Marsh, Joseph A. in Amino acid sequence , amino acid sequences , Amino acids

2025

Background: Understanding the relationship between protein sequence and function is crucial for accurate classification of missense variants. Variant effect predictors (VEPs) play a vital role in deciphering this complex relationship, yet evaluating their performance remains challenging for several reasons, including data circularity, where the same or related data is used for training and assessment. High-throughput experimental strategies like deep mutational scanning (DMS) offer a promising solution. Results: In this study, we extend upon our previous benchmarking approach, assessing the performance of 97 VEPs using missense DMS measurements from 36 different human proteins. In addition, a new pairwise, VEP-centric approach mitigates the impact of missing predictions on overall performance comparison. We observe a strong correspondence between VEP performance in DMS-based benchmarks and clinical variant classification, especially for predictors that have not been directly trained on human clinical variants. Conclusions: Our results suggest that comparing VEP performance against diverse functional assays represents a reliable strategy for assessing their relative performance in clinical variant classification. However, major challenges in clinical interpretation of VEP scores persist, highlighting the need for further research to fully leverage computational predictors for genetic diagnosis. We also address practical considerations for end users in terms of choice of methodology.

Journal Article

Alpha Helices Are More Robust to Mutations than Beta Strands

by Abrusán, György , Marsh, Joseph A. in Amino Acid Sequence , Amino acids , Biology and Life Sciences

2016

The rapidly increasing amount of data on human genetic variation has resulted in a growing demand to identify pathogenic mutations computationally, as their experimental validation is currently beyond reach. Here we show that alpha helices and beta strands differ significantly in their ability to tolerate mutations: helices can accumulate more mutations than strands without structural change, due to the higher numbers of inter-residue contacts in helices. This results in two patterns: a) the same number of mutations causes less structural change in helices than in strands; b) helices diverge more rapidly in sequence than strands within the same domains. Additionally, both helices and strands are significantly more robust than coils. Based on this observation we show that human missense mutations that change secondary structure are more likely to be pathogenic than those that do not. Moreover, inclusion of predicted secondary structure changes shows significant utility for improving upon state-of-the-art pathogenicity predictions.

Journal Article

Prevalence of loss-of-function, gain-of-function and dominant-negative mechanisms across genetic disease phenotypes

by Marsh, Joseph A. , Badonyi, Mihaly in 631/114 , 631/208/737 , 631/535

2025

Molecular disease mechanisms caused by mutations in protein-coding regions are diverse, but they can be broadly categorised into loss-of-function, gain-of-function and dominant-negative effects. Accurately predicting these mechanisms is important, since therapeutic strategies can exploit these mechanisms. Computational predictors tend to perform less well at the identification of pathogenic gain-of-function and dominant-negative variants. Here, we develop a protein structure-based missense loss-of-function likelihood score that can separate recessive loss of function and dominant loss of function from alternative disease mechanisms. Using missense loss-of-function scores, we estimate the prevalence of molecular mechanisms across 2,837 phenotypes in 1,979 Mendelian disease genes, finding that dominant-negative and gain-of-function mechanisms account for 48% of phenotypes in dominant genes. Applying missense loss-of-function scores to genes with multiple phenotypes reveals widespread intragenic mechanistic heterogeneity, with 43% of dominant and 49% of mixed-inheritance genes harbouring both loss-of-function and non-loss-of-function mechanisms. Furthermore, we show that combining missense loss-of-function scores with phenotype semantic similarity enables the prioritisation of dominant-negative mechanisms in mixed-inheritance genes. Our structure-based approach, accessible via a Google Colab notebook, offers a scalable tool for predicting disease mechanisms and advancing personalised medicine. Protein structures can help determine the disease-causing mechanisms of mutations. Here, the authors use a protein structure-based approach to show that nearly half of dominant genetic conditions result in non-simple loss-of-function effects.

Journal Article
