30,522 result(s) for "Sequence Homology, Amino Acid"
c-Src and c-Abl kinases control hierarchic phosphorylation and function of the CagA effector protein in Western and East Asian Helicobacter pylori strains
Many bacterial pathogens inject into host cells effector proteins that are substrates for host tyrosine kinases such as Src and Abl family kinases. Phosphorylated effectors eventually subvert host cell signaling, aiding disease development. In the case of the gastric pathogen Helicobacter pylori, which is a major risk factor for the development of gastric cancer, the only known effector protein injected into host cells is the oncoprotein CagA. Here, we followed the hierarchic tyrosine phosphorylation of H. pylori CagA as a model system to study early effector phosphorylation processes. Translocated CagA is phosphorylated on Glu-Pro-Ile-Tyr-Ala (EPIYA) motifs EPIYA-A, EPIYA-B, and EPIYA-C in Western strains of H. pylori and EPIYA-A, EPIYA-B, and EPIYA-D in East Asian strains. We found that c-Src only phosphorylated EPIYA-C and EPIYA-D, whereas c-Abl phosphorylated EPIYA-A, EPIYA-B, EPIYA-C, and EPIYA-D. Further analysis revealed that CagA molecules were phosphorylated on 1 or 2 EPIYA motifs, but never simultaneously on 3 motifs. Furthermore, none of the phosphorylated EPIYA motifs alone was sufficient for inducing AGS cell scattering and elongation. The preferred combination of phosphorylated EPIYA motifs in Western strains was EPIYA-A and EPIYA-C, either across 2 CagA molecules or simultaneously on 1. Our study thus identifies a tightly regulated hierarchic phosphorylation model for CagA starting at EPIYA-C/D, followed by phosphorylation of EPIYA-A or EPIYA-B. These results provide insight for clinical H. pylori typing and clarify the role of phosphorylated bacterial effector proteins in pathogenesis.
Combining RNA-seq data and homology-based gene prediction for plants, animals and fungi
Background: Genome annotation is of key importance in many research questions. The identification of protein-coding genes is often based on transcriptome sequencing data, ab-initio or homology-based prediction. Recently, it was demonstrated that intron position conservation improves homology-based gene prediction, and that experimental data improves ab-initio gene prediction. Results: Here, we present an extension of the gene prediction program GeMoMa that utilizes amino acid sequence conservation, intron position conservation and optionally RNA-seq data for homology-based gene prediction. We show on published benchmark data for plants, animals and fungi that GeMoMa performs better than the gene prediction programs BRAKER1, MAKER2, and CodingQuarry, and purely RNA-seq-based pipelines for transcript identification. In addition, we demonstrate that using multiple reference organisms may help to further improve the performance of GeMoMa. Finally, we apply GeMoMa to four nematode species and to the recently published barley reference genome indicating that current annotations of protein-coding genes may be refined using GeMoMa predictions. Conclusions: GeMoMa might be of great utility for annotating newly sequenced genomes but also for finding homologs of a specific gene or gene family. GeMoMa has been published under GNU GPL3 and is freely available at http://www.jstacs.de/index.php/GeMoMa.
Embeddings from deep learning transfer GO annotations beyond homology
Knowing protein function is crucial to advance molecular and medical biology, yet experimental function annotations through the Gene Ontology (GO) exist for fewer than 0.5% of all known proteins. Computational methods bridge this sequence-annotation gap typically through homology-based annotation transfer by identifying sequence-similar proteins with known function or through prediction methods using evolutionary information. Here, we propose predicting GO terms through annotation transfer based on proximity of proteins in the SeqVec embedding rather than in sequence space. These embeddings originate from deep learned language models (LMs) for protein sequences (SeqVec) transferring the knowledge gained from predicting the next amino acid in 33 million protein sequences. Replicating the conditions of CAFA3, our method reaches an Fmax of 37 ± 2%, 50 ± 3%, and 57 ± 2% for BPO, MFO, and CCO, respectively. Numerically, this appears close to the top ten CAFA3 methods. When restricting the annotation transfer to proteins with < 20% pairwise sequence identity to the query, performance drops (Fmax BPO 33 ± 2%, MFO 43 ± 3%, CCO 53 ± 2%); this still outperforms naïve sequence-based transfer. Preliminary results from CAFA4 appear to confirm these findings. Overall, this new concept is likely to change the annotation of proteins, in particular for proteins from smaller families or proteins with intrinsically disordered regions.
Modeling protein quaternary structure of homo- and hetero-oligomers beyond binary interactions by homology
Cellular processes often depend on interactions between proteins and the formation of macromolecular complexes. The impairment of such interactions can lead to deregulation of pathways resulting in disease states, and it is hence crucial to gain insights into the nature of macromolecular assemblies. Detailed structural knowledge about complexes and protein-protein interactions is growing, but experimentally determined three-dimensional multimeric assemblies are outnumbered by complexes supported by non-structural experimental evidence. Here, we aim to fill this gap by modeling multimeric structures by homology, only using amino acid sequences to infer the stoichiometry and the overall structure of the assembly. We ask which properties of proteins within a family can assist in the prediction of correct quaternary structure. Specifically, we introduce a description of protein-protein interface conservation as a function of evolutionary distance to reduce the noise in deep multiple sequence alignments. We also define a distance measure to structurally compare homologous multimeric protein complexes. This allows us to hierarchically cluster protein structures and quantify the diversity of alternative biological assemblies known today. We find that a combination of conservation scores, structural clustering, and classical interface descriptors can improve the selection of homologous protein templates, leading to reliable models of protein complexes.
ECNet is an evolutionary context-integrated deep learning framework for protein engineering
Machine learning has been increasingly used for protein engineering. However, because the general sequence contexts they capture are not specific to the protein being engineered, the accuracy of existing machine learning algorithms is rather limited. Here, we report ECNet (evolutionary context-integrated neural network), a deep-learning algorithm that exploits evolutionary contexts to predict functional fitness for protein engineering. This algorithm integrates local evolutionary context from homologous sequences that explicitly model residue-residue epistasis for the protein of interest with the global evolutionary context that encodes rich semantic and structural features from the enormous protein sequence universe. As such, it enables accurate mapping from sequence to function and provides generalization from low-order mutants to higher-order mutants. We show that ECNet predicts the sequence-function relationship more accurately as compared to existing machine learning algorithms by using ~50 deep mutational scanning and random mutagenesis datasets. Moreover, we used ECNet to guide the engineering of TEM-1 β-lactamase and identified variants with improved ampicillin resistance with high success rates. Protein engineering is an active area of research in which machine learning has proven quite powerful. Here, the authors present a deep learning method that integrates both general and protein-specific sequence representations to improve the engineering of one’s protein of interest.
Copy number variation at the GL7 locus contributes to grain size diversity in rice
Jiayang Li, Xudong Zhu, Qian Qian and colleagues report cloning of the Grain Length on Chromosome 7 (GL7) locus in rice and identify a copy number variant that increases grain length and improves grain quality. They demonstrate how interactions with other grain length–related genes may be used to improve breeding. Copy number variants (CNVs) are associated with changes in gene expression levels and contribute to various adaptive traits 1,2. Here we show that a CNV at the Grain Length on Chromosome 7 (GL7) locus contributes to grain size diversity in rice (Oryza sativa L.). GL7 encodes a protein homologous to Arabidopsis thaliana LONGIFOLIA proteins, which regulate longitudinal cell elongation. Tandem duplication of a 17.1-kb segment at the GL7 locus leads to upregulation of GL7 and downregulation of its nearby negative regulator, resulting in an increase in grain length and improvement of grain appearance quality. Sequence analysis indicates that allelic variants of GL7 and its negative regulator are associated with grain size diversity and that the CNV at the GL7 locus was selected for and used in breeding. Our work suggests that pyramiding beneficial alleles of GL7 and other yield- and quality-related genes may improve the breeding of elite rice varieties.
Structural insight into molecular mechanism of poly(ethylene terephthalate) degradation
Plastics, including poly(ethylene terephthalate) (PET), possess many desirable characteristics and thus are widely used in daily life. However, non-biodegradability, once thought to be an advantage offered by plastics, is causing a major environmental problem. Recently, a PET-degrading bacterium, Ideonella sakaiensis, was identified and suggested for possible use in degradation and/or recycling of PET. However, the molecular mechanism of PET degradation is not known. Here we report the crystal structure of I. sakaiensis PETase (IsPETase) at 1.5 Å resolution. IsPETase has a Ser-His-Asp catalytic triad at its active site and contains an optimal substrate binding site to accommodate four monohydroxyethyl terephthalate (MHET) moieties of PET. Based on structural and site-directed mutagenesis experiments, the detailed process of PET degradation into MHET, terephthalic acid, and ethylene glycol is suggested. Moreover, other PETase candidates potentially having high PET-degrading activities are suggested based on phylogenetic tree analysis of 69 PETase-like proteins. Poly(ethylene terephthalate) (PET) is a widely used plastic and its accumulation in the environment has become a global problem. Here the authors report the crystal structure of an Ideonella sakaiensis PET-degrading enzyme and propose a molecular mechanism for PET degradation.
Natural variation at the soybean J locus improves adaptation to the tropics and enhances yield
Fanjiang Kong, Zhixi Tian, Xingliang Hou, Baohui Liu and colleagues report the cloning and functional characterization of J, the locus underlying the long-juvenile (LJ) trait that has enabled tropical cultivation of soybean. They show that J, an ortholog of Arabidopsis ELF3, downregulates the expression of E1, thereby promoting flowering under short-day conditions. Soybean is a major legume crop originating in temperate regions, and photoperiod responsiveness is a key factor in its latitudinal adaptation. Varieties from temperate regions introduced to lower latitudes mature early and have extremely low grain yields. Introduction of the long-juvenile (LJ) trait extends the vegetative phase and improves yield under short-day conditions, thereby enabling expansion of cultivation in tropical regions. Here we report the cloning and characterization of J, the major classical locus conferring the LJ trait, and identify J as the ortholog of Arabidopsis thaliana EARLY FLOWERING 3 (ELF3). J depends genetically on the legume-specific flowering repressor E1, and J protein physically associates with the E1 promoter to downregulate its transcription, relieving repression of two important FLOWERING LOCUS T (FT) genes and promoting flowering under short days. Our findings identify an important new component in flowering-time control in soybean and provide new insight into soybean adaptation to tropical regions.
MACSE: Multiple Alignment of Coding SEquences Accounting for Frameshifts and Stop Codons
Until now the most efficient solution to align nucleotide sequences containing open reading frames was to use indirect procedures that align amino acid translation before reporting the inferred gap positions at the codon level. There are two important pitfalls with this approach. Firstly, any premature stop codon impedes using such a strategy. Secondly, each sequence is translated with the same reading frame from beginning to end, so that the presence of a single additional nucleotide leads to both aberrant translation and alignment. We present an algorithm that has the same space and time complexity as the classical Needleman-Wunsch algorithm while accommodating sequencing errors and other biological deviations from the coding frame. The resulting pairwise coding sequence alignment method was extended to a multiple sequence alignment (MSA) algorithm implemented in a program called MACSE (Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons). MACSE is the first automatic solution to align protein-coding gene datasets containing non-functional sequences (pseudogenes) without disrupting the underlying codon structure. It has also proved useful in detecting undocumented frameshifts in public database sequences and in aligning next-generation sequencing reads/contigs against a reference coding sequence. MACSE is distributed as an open-source Java file executable with freely available source code and can be used via a web interface at: http://mbb.univ-montp2.fr/macse.
Structural basis and functional analysis of the SARS coronavirus nsp14–nsp10 complex
Nonstructural protein 14 (nsp14) of coronaviruses (CoV) is important for viral replication and transcription. The N-terminal exoribonuclease (ExoN) domain plays a proofreading role for prevention of lethal mutagenesis, and the C-terminal domain functions as a (guanine-N7) methyl transferase (N7-MTase) for mRNA capping. The molecular basis of both these functions is unknown. Here, we describe crystal structures of severe acute respiratory syndrome (SARS)-CoV nsp14 in complex with its activator nonstructural protein 10 (nsp10) and functional ligands. One molecule of nsp10 interacts with ExoN of nsp14 to stabilize it and stimulate its activity. Although the catalytic core of nsp14 ExoN is reminiscent of proofreading exonucleases, the presence of two zinc fingers sets it apart from homologs. Mutagenesis studies indicate that both these zinc fingers are essential for the function of nsp14. We show that a DEEDh (the five catalytic amino acids) motif drives nucleotide excision. The N7-MTase domain exhibits a noncanonical MTase fold with a rare β-sheet insertion and a peripheral zinc finger. The cap-precursor guanosine-P3-adenosine-5′,5′-triphosphate and S-adenosyl methionine bind in proximity in a highly constricted pocket between two β-sheets to accomplish methyl transfer. Our studies provide the first glimpses, to our knowledge, into the architecture of the nsp14–nsp10 complex involved in RNA viral proofreading.