Catalogue Search | MBRL
206 result(s) for "Dopazo, Joaquin"
ETE: a python Environment for Tree Exploration
by Huerta-Cepas, Jaime; Gabaldón, Toni; Dopazo, Joaquín
in Algorithms, Bioinformatics, Biomedical and Life Sciences
2010
Background
Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale.
Results
Here we present the Environment for Tree Exploration (ETE), a python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, has support for the extended newick format, provides an integrated node annotation system and permits linking trees to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations.
Conclusions
ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from http://ete.cgenomics.org.
Journal Article
A Phylogenetic Analysis of 34 Chloroplast Genomes Elucidates the Relationships between Wild and Domestic Species within the Genus Citrus
by Carbonell-Caballero, Jose; Terol, Javier; Alonso, Roberto
in Chloroplast DNA, Chloroplasts, Cultivation
2015
The genus Citrus includes some of the most important cultivated fruit trees worldwide. Despite being extensively studied because of its commercial relevance, the origin of cultivated citrus species and the history of their domestication remain open questions. Here, we present a phylogenetic analysis of the chloroplast genomes of 34 citrus genotypes which constitutes the most comprehensive and detailed study to date on the evolution and variability of the genus Citrus. A statistical model was used to estimate divergence times between the major citrus groups. Additionally, a complete map of the variability across the genome of different citrus species was produced, including single nucleotide variants, heteroplasmic positions, indels (insertions and deletions), and large structural variants. The distribution of all these variants provided further independent support for the phylogeny obtained. An unexpected finding was the high level of heteroplasmy found in several of the analyzed genomes. The use of the complete chloroplast DNA not only paves the way for a better understanding of the phylogenetic relationships within the Citrus genus but also provides original insights into other elusive evolutionary processes, such as chloroplast inheritance, heteroplasmy, and gene selection.
Journal Article
Towards a metagenomics machine learning interpretable model for understanding the transition from adenoma to colorectal cancer
by Casimiro-Soriguer, Carlos S.; Peña-Chilet, María; Dopazo, Joaquin
in 631/114, 631/208, 631/326
2022
The gut microbiome is gaining interest because of its links with several diseases, including colorectal cancer (CRC), as well as the possibility of being used to obtain non-intrusive predictive disease biomarkers. Here we performed a meta-analysis of 1042 fecal metagenomic samples from seven publicly available studies. We used an interpretable machine learning approach based on functional profiles, instead of the conventional taxonomic profiles, to produce a highly accurate predictor of CRC with better precision than those of previous proposals. Moreover, this approach is also able to discriminate samples with adenoma, which makes it very promising for CRC prevention by detecting early stages in which intervention is easier and more effective. In addition, interpretable machine learning methods allow extracting features relevant for the classification, which reveals basic molecular mechanisms accounting for the changes undergone by the microbiome functional landscape in the transition from healthy gut to adenoma and CRC conditions. Functional profiles have demonstrated superior accuracy to taxonomic profiles in predicting CRC and adenoma conditions and additionally, in a context of explainable machine learning, provide useful hints on the molecular mechanisms operating in the microbiota behind these conditions.
Journal Article
Multidimensional Gene Set Analysis of Genomic Data
2010
Understanding the functional implications of changes in gene expression, mutations, etc., is the aim of most genomic experiments. To achieve this, several functional profiling methods have been proposed. Such methods study the behaviour of different gene modules (e.g. gene ontology terms) in response to one particular variable (e.g. differential gene expression). In spite of the wealth of information provided by functional profiling methods, a common limitation to all of them is their inherent unidimensional nature. In order to overcome this restriction we present a multidimensional logistic model that allows studying the relationship of gene modules with different genome-scale measurements (e.g. differential expression, genotyping association, methylation, copy number alterations, heterozygosity, etc.) simultaneously. Moreover, the relationship of such functional modules with the interactions among the variables can also be studied, which produces novel results impossible to derive from the conventional unidimensional functional profiling methods. We report sound results of gene set associations that remained undetected by the conventional one-dimensional gene set analysis in several examples. Our findings demonstrate the potential of the proposed approach for the discovery of new cell functionalities with complex dependences on more than one variable.
Journal Article
Genomics of the origin and evolution of Citrus
2018
The genus Citrus, comprising some of the most widely cultivated fruit crops worldwide, includes an uncertain number of species. Here we describe ten natural citrus species, using genomic, phylogenetic and biogeographic analyses of 60 accessions representing diverse citrus germ plasms, and propose that citrus diversified during the late Miocene epoch through a rapid southeast Asian radiation that correlates with a marked weakening of the monsoons. A second radiation enabled by migration across the Wallace line gave rise to the Australian limes in the early Pliocene epoch. Further identification and analyses of hybrids and admixed genomes provides insights into the genealogy of major commercial cultivars of citrus. Among mandarins and sweet orange, we find an extensive network of relatedness that illuminates the domestication of these groups. Widespread pummelo admixture among these mandarins and its correlation with fruit size and acidity suggests a plausible role of pummelo introgression in the selection of palatable mandarins. This work provides a new evolutionary framework for the genus Citrus.
The origin, evolution and domestication of Citrus and the genealogy of the most important wild and cultivated citrus varieties.
When life gave us lemons
Citrus fruits are one of the most cultivated crops worldwide, yet the evolutionary relationships among citrus species remain uncertain. Daniel Rokhsar, Manuel Talon and colleagues analyse the genomes of 60 accessions that represent a diverse range of citrus species, including 30 newly sequenced citrus genomes. They characterize the diversity and evolution of citrus at the species level and identify interspecific citrus hybrids and admixtures—genetic mixing between previously isolated populations—that could be the result of human activities such as migration and agriculture. The authors identify 10 progenitor species and suggest that citrus originated in southeast Asia, diversifying during the late Miocene epoch through a rapid southeast Asian radiation that correlated with a changing climate, including the weakening of the monsoons. They also find extensive relatedness among mandarins and sweet oranges, showing a complex history of admixture during the domestication of these groups.
Journal Article
DOME: recommendations for supervised machine learning validation in biology
by Garcia-Gasulla, Dario; Del Conte, Alessio; Capella-Gutierrez, Salvador
in Domes, Learning algorithms, Machine learning
2021
DOME is a set of community-wide recommendations for reporting supervised machine learning–based analyses applied to biological studies. Broad adoption of these recommendations will help improve machine learning assessment and reproducibility.
Journal Article
Exploring the druggable space around the Fanconi anemia pathway using machine learning and mechanistic models
by Esteban-Medina, Marina; Dopazo, Joaquín; Peña-Chilet, María
in Algorithms, Anemia, Artificial intelligence
2019
Background
In spite of the abundance of genomic data, predictive models that describe phenotypes as a function of gene expression or mutations are difficult to obtain because they are affected by the curse of dimensionality, given the imbalance between samples and candidate genes. This is especially acute in scenarios where samples are hard to obtain, as is the case with rare diseases.
Results
The application of multi-output regression machine learning methodologies to predict the potential effect of external proteins over the signaling circuits that trigger Fanconi anemia related cell functionalities, inferred with a mechanistic model, allowed us to detect over 20 potential therapeutic targets.
Conclusions
The use of artificial intelligence methods for the prediction of potentially causal relationships between proteins of interest and cell activities related to disease phenotypes opens promising avenues for the systematic search of new targets in rare diseases.
Journal Article
Real world evidence of calcifediol or vitamin D prescription and mortality rate of COVID-19 in a retrospective cohort of hospitalized Andalusian patients
2021
COVID-19 is a major worldwide health problem because of acute respiratory distress syndrome and mortality. Several lines of evidence have suggested a relationship between the vitamin D endocrine system and severity of COVID-19. We present a survival study on a retrospective cohort of 15,968 patients, comprising all COVID-19 patients hospitalized in Andalusia between January and November 2020. Based on a central registry of electronic health records (the Andalusian Population Health Database, BPS), prescriptions of vitamin D or its metabolites within 15–30 days before hospitalization were recorded. The effect of prescription of vitamin D (metabolites) for other indications prior to hospitalization was studied with respect to patient survival. Kaplan–Meier survival curves and hazard ratios support an association between prescription of these metabolites and patient survival. This association was stronger for calcifediol (Hazard Ratio, HR = 0.67, with 95% confidence interval, CI, of [0.50–0.91]) than for cholecalciferol (HR = 0.75, with 95% CI of [0.61–0.91]), when prescribed 15 days prior to hospitalization. Although the relation is maintained, there is a general decrease of this effect when a longer period of 30 days prior to hospitalization is considered (calcifediol HR = 0.73, with 95% CI [0.57–0.95] and cholecalciferol HR = 0.88, with 95% CI [0.75–1.03]), suggesting that the association was stronger when the prescription was closer to hospitalization.
Journal Article
The effects of death and post-mortem cold ischemia on human tissue transcriptomes
2018
Post-mortem tissue samples are a key resource for investigating patterns of gene expression. However, the processes triggered by death and the post-mortem interval (PMI) can significantly alter physiologically normal RNA levels. We investigate the impact of PMI on gene expression using data from multiple tissues of post-mortem donors obtained from the GTEx project. We find that many genes change expression over relatively short PMIs in a tissue-specific manner, but this potentially confounding effect in a biological analysis can be minimized by taking into account appropriate covariates. By comparing ante- and post-mortem blood samples, we identify the cascade of transcriptional events triggered by death of the organism. These events do not appear to simply reflect stochastic variation resulting from mRNA degradation, but active and ongoing regulation of transcription. Finally, we develop a model to predict the time since death from the analysis of the transcriptome of a few readily accessible tissues.
RNA levels in post-mortem tissue can differ greatly from those before death. Studying the effect of post-mortem interval on the transcriptome in 36 human tissues, Ferreira et al. find that the response to death is largely tissue-specific and develop a model to predict time since death based on RNA data.
Journal Article
A Pan-Cancer Catalogue of Cancer Driver Protein Interaction Interfaces
2015
Despite their importance in maintaining the integrity of all cellular pathways, the role of mutations on protein-protein interaction (PPI) interfaces as cancer drivers has not been systematically studied. Here we analyzed the mutation patterns of the PPI interfaces from 10,028 proteins in a pan-cancer cohort of 5,989 tumors from 23 projects of The Cancer Genome Atlas (TCGA) to find interfaces enriched in somatic missense mutations. To that end we used e-Driver, an algorithm to analyze the mutation distribution of specific protein functional regions. We identified 103 PPI interfaces enriched in somatic cancer mutations. 32 of these interfaces are found in proteins coded by known cancer driver genes. The remaining 71 interfaces are found in proteins that have not been previously identified as cancer drivers even though, in most cases, there is an extensive literature suggesting they play an important role in cancer. Finally, we integrate these findings with clinical information to show how tumors apparently driven by the same gene have different behaviors, including patient outcomes, depending on which specific interfaces are mutated.
Journal Article