Improvement in Protein Domain Identification Is Reached by Breaking Consensus, with the Agreement of Many Profiles and Domain Co-occurrence

by Carbone, Alessandra; Zaverucha, Gerson; Bernardes, Juliana; Vaquero, Catherine

in Amino Acid Sequence / Annotations / Biology and Life Sciences / Computational Biology / Computer and Information Sciences / Computer Science / Consensus Sequence / Databases, Protein / Datasets / Decision making / Experiments / Genomes / High-throughput screening (Biochemical assaying) / Human health and pathology / Identification / Identification and classification / Infectious diseases / Life Sciences / Mathematical models / Methods / Optimization techniques / Phylogenetics / Plasmodium falciparum / Plasmodium falciparum - genetics / Plasmodium falciparum - metabolism / Protein Domains / Proteins / Proteins - chemistry / Proteins - genetics / Proteins - metabolism / Protozoan Proteins - chemistry / Protozoan Proteins - metabolism / Research and Analysis Methods / Sequence Alignment - methods / Sequence Analysis, Protein - methods / Stochastic models

2016

Journal Article

Overview

Traditional protein annotation methods describe known domains with probabilistic models that represent a consensus among homologous domain sequences. However, when the relevant signals become too weak to be identified by a global consensus, annotation attempts fail. Here we address the fundamental question of domain identification for highly divergent proteins. Using high-performance computing, we demonstrate that the limits of state-of-the-art annotation methods can be bypassed. We design a new strategy based on the observation that many structural and functional protein constraints are not globally conserved across all species but may be locally conserved in separate clades. We propose a novel exploitation of the large amount of data available: (1) for each known protein domain, several probabilistic clade-centered models are constructed from a large and differentiated panel of homologous sequences; (2) a decision-making protocol combines the outcomes obtained from multiple models; (3) a multi-criteria optimization algorithm finds the most likely protein architecture. The method is evaluated for domain and architecture prediction over several datasets and statistical testing hypotheses. Its performance is compared against HMMScan and HHblits, two widely used search methods based on sequence-profile and profile-profile comparison, respectively. Because they are closer to actual protein sequences, clade-centered models are shown to be more specific and functionally predictive than the widely used consensus models. Based on them, we improved the annotation of Plasmodium falciparum protein sequences on a scale not previously possible. We successfully predict at least one domain for 72% of P. falciparum proteins, against 63% achieved previously, corresponding to a 30% improvement in the total number of Pfam domain predictions on the whole genome. The method is applicable to any genome and opens new avenues to tackle evolutionary questions such as the reconstruction of ancient domain duplications, the reconstruction of the history of protein architectures, and the estimation of protein domain age. Website and software: http://www.lcqb.upmc.fr/CLADE.
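The idea of pooling hits from several models of the same domain and then choosing one consistent architecture can be illustrated with a small sketch. This is not the CLADE implementation: the overview does not specify the decision-making protocol or the optimization criteria, so the sketch substitutes a plain single-criterion rule (pick the non-overlapping set of hits with the highest total score, solved as weighted interval scheduling). The `Hit` class, the domain and model labels, the coordinates, and the scores are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    domain: str   # domain family label (hypothetical)
    model: str    # which clade-centered model produced the hit (hypothetical)
    start: int    # first residue covered, inclusive
    end: int      # last residue covered, inclusive
    score: float  # higher is better


def best_architecture(hits):
    """Return the non-overlapping hits with maximal total score.

    Simplified stand-in for architecture selection: one criterion
    (summed score) instead of a multi-criteria optimization.
    """
    ordered = sorted(hits, key=lambda h: h.end)
    ends = [h.end for h in ordered]
    # best[i] = (total score, chosen hits) using only the first i hits
    best = [(0.0, [])]
    for i, hit in enumerate(ordered):
        # j = number of earlier hits that end before this hit starts
        j = bisect_right(ends, hit.start - 1, 0, i)
        take = best[j][0] + hit.score
        if take > best[i][0]:
            best.append((take, best[j][1] + [hit]))
        else:
            best.append(best[i])
    return best[-1][1]


# Two models of the same domain compete for one region; a third domain follows.
hits = [
    Hit("DOM_A", "clade1", 1, 100, 50.0),
    Hit("DOM_A", "clade2", 5, 95, 62.0),
    Hit("DOM_B", "clade1", 90, 180, 40.0),
    Hit("DOM_C", "clade3", 101, 200, 45.0),
]
architecture = best_architecture(hits)
print([(h.domain, h.model) for h in architecture])
```

In this toy input the higher-scoring clade2 model wins the first region over clade1, and DOM_B is dropped because it overlaps both neighbours. A fuller treatment would also weigh criteria such as domain co-occurrence, which the title names but the overview does not detail.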

Publisher

Public Library of Science (PLOS)

Subject

Amino Acid Sequence

/ Annotations

/ Biology and Life Sciences

/ Computational Biology

/ Computer and Information Sciences

/ Computer Science

/ Consensus Sequence