Search Results Heading

MBRLSearchResults

mbrl.module.common.modules.added.book.to.shelf
Title added to your shelf!
View what I already have on My Shelf.
Oops! Something went wrong.
Oops! Something went wrong.
While trying to add the title to your shelf something went wrong :( Kindly try again later!
Are you sure you want to remove the book from the shelf?
Oops! Something went wrong.
Oops! Something went wrong.
While trying to remove the title from your shelf something went wrong :( Kindly try again later!
    Done
    Filters
    Reset
  • Discipline
      Discipline
      Clear All
      Discipline
  • Is Peer Reviewed
      Is Peer Reviewed
      Clear All
      Is Peer Reviewed
  • Item Type
      Item Type
      Clear All
      Item Type
  • Subject
      Subject
      Clear All
      Subject
  • Year
      Year
      Clear All
      From:
      -
      To:
  • More Filters
56 result(s) for "Babbitt, Patricia C"
Sort by:
Evolutionary and molecular foundations of multiple contemporary functions of the nitroreductase superfamily
Insight regarding how diverse enzymatic functions and reactions have evolved from ancestral scaffolds is fundamental to understanding chemical and evolutionary biology, and for the exploitation of enzymes for biotechnology. We undertook an extensive computational analysis using a unique and comprehensive combination of tools that include large-scale phylogenetic reconstruction to determine the sequence, structural, and functional relationships of the functionally diverse flavin mononucleotide-dependent nitroreductase (NTR) superfamily (>24,000 sequences from all domains of life, 54 structures, and >10 enzymatic functions). Our results suggest an evolutionary model in which contemporary subgroups of the superfamily have diverged in a radial manner from a minimal flavin-binding scaffold. We identified the structural design principle for this divergence: Insertions at key positions in the minimal scaffold that, combined with the fixation of key residues, have led to functional specialization. These results will aid future efforts to delineate the emergence of functional diversity in enzyme superfamilies, provide clues for functional inference for superfamily members of unknown function, and facilitate rational redesign of the NTR scaffold.
Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies
Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%-63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with \"overprediction\" of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.
Using Sequence Similarity Networks for Visualization of Relationships Across Diverse Protein Superfamilies
The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
An Atlas of the Thioredoxin Fold Class Reveals the Complexity of Function-Enabling Adaptations
The group of proteins that contain a thioredoxin (Trx) fold is huge and diverse. Assessment of the variation in catalytic machinery of Trx fold proteins is essential in providing a foundation for understanding their functional diversity and predicting the function of the many uncharacterized members of the class. The proteins of the Trx fold class retain common features-including variations on a dithiol CxxC active site motif-that lead to delivery of function. We use protein similarity networks to guide an analysis of how structural and sequence motifs track with catalytic function and taxonomic categories for 4,082 representative sequences spanning the known superfamilies of the Trx fold. Domain structure in the fold class is varied and modular, with 2.8% of sequences containing more than one Trx fold domain. Most member proteins are bacterial. The fold class exhibits many modifications to the CxxC active site motif-only 56.8% of proteins have both cysteines, and no functional groupings have absolute conservation of the expected catalytic motif. Only a small fraction of Trx fold sequences have been functionally characterized. This work provides a global view of the complex distribution of domains and catalytic machinery throughout the fold class, showing that each superfamily contains remnants of the CxxC active site. The unifying context provided by this work can guide the comparison of members of different Trx fold superfamilies to gain insight about their structure-function relationships, illustrated here with the thioredoxins and peroxiredoxins.
Large-Scale Determination of Sequence, Structure, and Function Relationships in Cytosolic Glutathione Transferases across the Biosphere
The cytosolic glutathione transferase (cytGST) superfamily comprises more than 13,000 nonredundant sequences found throughout the biosphere. Their key roles in metabolism and defense against oxidative damage have led to thousands of studies over several decades. Despite this attention, little is known about the physiological reactions they catalyze and most of the substrates used to assay cytGSTs are synthetic compounds. A deeper understanding of relationships across the superfamily could provide new clues about their functions. To establish a foundation for expanded classification of cytGSTs, we generated similarity-based subgroupings for the entire superfamily. Using the resulting sequence similarity networks, we chose targets that broadly covered unknown functions and report here experimental results confirming GST-like activity for 82 of them, along with 37 new 3D structures determined for 27 targets. These new data, along with experimentally known GST reactions and structures reported in the literature, were painted onto the networks to generate a global view of their sequence-structure-function relationships. The results show how proteins of both known and unknown function relate to each other across the entire superfamily and reveal that the great majority of cytGSTs have not been experimentally characterized or annotated by canonical class. A mapping of taxonomic classes across the superfamily indicates that many taxa are represented in each subgroup and highlights challenges for classification of superfamily sequences into functionally relevant classes. Experimental determination of disulfide bond reductase activity in many diverse subgroups illustrate a theme common for many reaction types. Finally, sequence comparison between an enzyme that catalyzes a reductive dechlorination reaction relevant to bioremediation efforts with some of its closest homologs reveals differences among them likely to be associated with evolution of this unusual reaction. Interactive versions of the networks, associated with functional and other types of information, can be downloaded from the Structure-Function Linkage Database (SFLD; http://sfld.rbvi.ucsf.edu).
Biases in the Experimental Annotations of Protein Function and Their Effect on Our Understanding of Protein Function Space
The ongoing functional annotation of proteins relies upon the work of curators to capture experimental findings from scientific literature and apply them to protein sequence and structure data. However, with the increasing use of high-throughput experimental assays, a small number of experimental studies dominate the functional protein annotations collected in databases. Here, we investigate just how prevalent is the \"few articles - many proteins\" phenomenon. We examine the experimentally validated annotation of proteins provided by several groups in the GO Consortium, and show that the distribution of proteins per published study is exponential, with 0.14% of articles providing the source of annotations for 25% of the proteins in the UniProt-GOA compilation. Since each of the dominant articles describes the use of an assay that can find only one function or a small group of functions, this leads to substantial biases in what we know about the function of many proteins. Mass-spectrometry, microscopy and RNAi experiments dominate high throughput experiments. Consequently, the functional information derived from these experiments is mostly of the subcellular location of proteins, and of the participation of proteins in embryonic developmental pathways. For some organisms, the information provided by different studies overlap by a large amount. We also show that the information provided by high throughput experiments is less specific than those provided by low throughput experiments. Given the experimental techniques available, certain biases in protein function annotation due to high-throughput experiments are unavoidable. Knowing that these biases exist and understanding their characteristics and extent is important for database curators, developers of function annotation programs, and anyone who uses protein function annotation data to plan experiments.
Discovery of new enzymes and metabolic pathways by using structure and genome context
Pathway docking ( in silico docking of metabolites to several enzymes and binding proteins in a metabolic pathway) enables the discovery of a catabolic pathway for the osmolyte trans -4-hydroxy- l -proline betaine. Structural key to predicting enzyme function Overprediction and database annotation errors in genome-sequencing projects have caused much confusion because of the difficulty of assigning valid functions to the proteins identified. These authors use structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster to correctly predict the in vitro activity of an enzyme of unknown function and identify the catabolic pathway in which it participates in cells. The substrate-liganded pose predicted by virtual library screening was confirmed experimentally, enzyme activities in the predicted pathway were confirmed by in vitro assays and genetic analyses, the intermediates were identified by metabolomics, and repression of the genes encoding the pathway by high salt concentrations was established by transcriptomics. This study establishes the utility of structure-guided functional predictions for the discovery of new metabolic pathways. Assigning valid functions to proteins identified in genome projects is challenging: overprediction and database annotation errors are the principal concerns 1 . We and others 2 are developing computation-guided strategies for functional discovery with ‘metabolite docking’ to experimentally derived 3 or homology-based 4 three-dimensional structures. Bacterial metabolic pathways often are encoded by ‘genome neighbourhoods’ (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by ‘predicting’ the intermediates in the glycolytic pathway in Escherichia coli 5 . Metabolite docking to multiple binding proteins and enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. Here we report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), 2-epimerization of trans -4-hydroxy- l -proline betaine (tHyp-B) and cis -4-hydroxy- d -proline betaine (cHyp-B), and also the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt concentrations was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guided functional predictions to enable the discovery of new metabolic pathways.
Molecular Diversity of Terpene Synthases in the Liverwort Marchantia polymorpha
Marchantia polymorpha is a basal terrestrial land plant, which like most liverworts accumulates structurally diverse terpenes believed to serve in deterring disease and herbivory. Previous studies have suggested that the mevalonate and methylerythritol phosphate pathways, present in evolutionarily diverged plants, are also operative in liverworts. However, the genes and enzymes responsible for the chemical diversity of terpenes have yet to be described. In this study, we resorted to a HMMER search tool to identify 17 putative terpene synthase genes from M. polymorpha transcriptomes. Functional characterization identified four diterpene synthase genes phylogenetically related to those found in diverged plants and nine rather unusual monoterpene and sesquiterpene synthase-like genes. The presence of separate monofunctional diterpene synthases for ent-copalyl diphosphate and ent-kaurene biosynthesis is similar to orthologs found in vascular plants, pushing the date of the underlying gene duplication and neofunctionalization of the ancestral diterpene synthase gene family to >400 million years ago. By contrast, the mono- and sesquiterpene synthases represent a distinct class of enzymes, not related to previously described plant terpene synthases and only distantly so to microbial-type terpene synthases. The absence of a Mg2+ binding, aspartate-rich, DDXXD motif places these enzymes in a noncanonical family of terpene synthases.
DIVERGENT EVOLUTION OF ENZYMATIC FUNCTION: Mechanistically Diverse Superfamilies and Functionally Distinct Suprafamilies
The protein sequence and structure databases are now sufficiently representative that strategies nature uses to evolve new catalytic functions can be identified. Groups of divergently related enzymes whose members catalyze different reactions but share a common partial reaction, intermediate, or transition state (mechanistically diverse superfamilies) have been discovered, including the enolase, amidohydrolase, thiyl radical, crotonase, vicinal-oxygen-chelate, and Fe-dependent oxidase superfamilies. Other groups of divergently related enzymes whose members catalyze different overall reactions that do not share a common mechanistic strategy (functionally distinct supra families) have also been identified: ( a ) functionally distinct suprafamilies whose members catalyze successive transformations in the tryptophan and histidine biosynthetic pathways and ( b ) functionally distinct suprafamilies whose members catalyze different reactions in different metabolic pathways. An understanding of the structural bases for the catalytic diversity observed in super- and suprafamilies may provide the basis for discovering the functions of proteins and enzymes in new genomes as well as provide guidance for in vitro evolution/engineering of new enzymes.