Catalogue Search | MBRL
31 result(s) for "Orthology inference"
SHOOT: phylogenetic gene search and ortholog inference
2022
Determining the evolutionary relationships between genes is fundamental to comparative biological research. Here, we present SHOOT. SHOOT searches a user query sequence against a database of phylogenetic trees and returns a tree with the query sequence correctly placed within it. We show that SHOOT performs this analysis with comparable speed to a BLAST search. We demonstrate that SHOOT phylogenetic placements are as accurate as conventional tree inference, and it can identify orthologs with high accuracy. In summary, SHOOT is a fast and accurate tool for phylogenetic analyses of novel query sequences. It is available online at www.shoot.bio.
Journal Article
SonicParanoid2: fast, accurate, and comprehensive orthology inference with machine learning and language models
by Sriswasdi, Sira; Cosentino, Salvatore; Iwasaki, Wataru
in Algorithms, Animal Genetics and Genomics, Bioinformatics
2024
Accurate inference of orthologous genes constitutes a prerequisite for comparative and evolutionary genomics. SonicParanoid is one of the fastest tools for orthology inference; however, its scalability and accuracy have been hampered by time-consuming all-versus-all alignments and the existence of proteins with complex domain architectures. Here, we present a substantial update of SonicParanoid, where a gradient boosting predictor halves the execution time and a language model doubles the recall. Application to empirical large-scale and standardized benchmark datasets shows that SonicParanoid2 is much faster than comparable methods and also the most accurate. SonicParanoid2 is available at https://gitlab.com/salvo981/sonicparanoid2 and https://zenodo.org/doi/10.5281/zenodo.11371108.
Journal Article
Orthology Clusters from Gene Trees with Possvm
2021
Possvm (Phylogenetic Ortholog Sorting with Species oVerlap and MCL [Markov clustering algorithm]) is a tool that automates the process of identifying clusters of orthologous genes from precomputed phylogenetic trees and classifying gene families. It identifies orthology relationships between genes using the species overlap algorithm to infer taxonomic information from the gene tree topology, and then uses the MCL to identify orthology clusters and provide annotated gene families. Our benchmarking shows that this approach, when provided with accurate phylogenies, is able to identify manually curated orthogroups with very high precision and recall. Overall, Possvm automates the routine process of gene tree inspection and annotation in a highly interpretable manner, and provides reusable outputs and phylogeny-aware gene annotations that can be used to inform comparative genomics and gene family evolution analyses.
Journal Article
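The species overlap step mentioned in the Possvm abstract above can be illustrated with a minimal sketch. This is not Possvm's code: it assumes a strictly binary, rooted gene tree written as nested tuples, and a hypothetical leaf naming scheme `species|gene`. A node whose two child subtrees share no species is treated as a speciation, so genes on opposite sides of it are called orthologs; a node whose subtrees share a species is treated as a duplication. The subsequent MCL clustering step is not shown.

```python
from itertools import product

def species_of(leaf):
    # Hypothetical convention: leaf labels look like "species|gene".
    return leaf.split("|", 1)[0]

def leaves(tree):
    # A leaf is a string; an internal node is a 2-tuple of subtrees.
    if isinstance(tree, str):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out

def ortholog_pairs(tree):
    """Return gene pairs whose last common ancestor is a speciation node."""
    pairs = set()
    if isinstance(tree, str):
        return pairs
    left, right = tree
    pairs |= ortholog_pairs(left)
    pairs |= ortholog_pairs(right)
    left_leaves, right_leaves = leaves(left), leaves(right)
    left_species = {species_of(x) for x in left_leaves}
    right_species = {species_of(x) for x in right_leaves}
    # No shared species between the two subtrees: call the node a speciation.
    if left_species.isdisjoint(right_species):
        for a, b in product(left_leaves, right_leaves):
            pairs.add(frozenset((a, b)))
    return pairs

# Toy tree: a duplication in the human/mouse ancestor, with fly as outgroup.
example_tree = (
    (("human|A1", "mouse|A1"), ("human|A2", "mouse|A2")),
    "fly|A",
)
pairs = ortholog_pairs(example_tree)
```

In the toy tree, `human|A1` and `mouse|A1` are called orthologs, while `human|A1` and `mouse|A2` are not, because their last common ancestor joins two subtrees that both contain human and mouse.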
Different orthology inference algorithms generate similar predicted orthogroups among Brassicaceae species
by Liao, Irene T.; Nikolov, Lachezar A.; Hileman, Lena C.
in Algorithms, Application, Brassicaceae
2025
Premise: Orthology inference is crucial for comparative genomics, and multiple algorithms have been developed to identify putative orthologs for downstream analyses. Despite the abundance of proposed solutions, including publicly available benchmarks, it is difficult to assess which tool is most suitable for plant species, which commonly have complex genomic histories. Methods: We explored the performance of four orthology inference algorithms (OrthoFinder, SonicParanoid, Broccoli, and OrthNet) on eight Brassicaceae genomes in two groups: one group comprising only diploids and another set comprising the diploids, two mesopolyploids, and one recent hexaploid genome. Results: The composition of the orthogroups reflected the species' ploidy and genomic histories, with the diploid set having a higher proportion of identical orthogroups. While the diploid + higher ploidy set had a lower proportion of orthogroups with identical compositions, the average degree of similarity between the orthogroups was not different from the diploid set. Discussion: Three algorithms (OrthoFinder, SonicParanoid, and Broccoli) are helpful for initial orthology predictions. Results produced using OrthNet were generally outliers but could still provide detailed information about gene colinearity. With our Brassicaceae dataset, slight discrepancies were found across the orthology inference algorithms, necessitating additional analyses such as tree inference to fine‐tune results.
Journal Article
Advancing phylogenomics in Amaranthaceae sensu stricto: Development and application of a new nuclear target enrichment bait set
by Kadereit, Gudrun; Kiedaisch, Tina; Žerdoner Čalasan, Anže
in Amaranthaceae, Bosea, Economic importance
2025
Premise: Current phylogenies of Amaranthaceae sensu stricto (s.s.) are inadequately sampled and resolved to reflect the entire evolutionary history of the lineage, which is likely complex due to at least three whole‐genome duplication events, occasionally followed by subsequent additional polyploidization events and rapid diversification of individual sublineages. We designed a new target enrichment bait set to overcome these challenges when reconstructing a phylogeny and demonstrated its applicability to the entire Amaranthaceae s.s. lineage. Methods: We analyzed 12,775 orthologous and low‐copy genes from a previous comprehensive transcriptomic study for marker selection. Following a newly developed approach that allows the selection of long exons and thus avoids the assembly of chimeric loci, we selected 1000 orthologous exons for phylogenomic analyses. Results: Our in vivo application showed a high locus recovery rate across all major clades of Amaranthaceae s.s., generated a robust phylogenetic tree, and clarified previously ambiguous relationships of the genera Bosea and Charpentiera. Gene tree conflict analysis revealed mainly high levels of gene tree concordance within the lineage, with a few notable exceptions. Discussion: The Amaranthaceae1000 kit will provide the basis for a phylogenetic tree across the Amaranthaceae s.s., facilitating future studies on systematics, diversification, and genome evolution within this economically important lineage.
Journal Article
OrthoFinder: phylogenetic orthology inference for comparative genomics
2019
Here, we present a major advance of the OrthoFinder method. This extends OrthoFinder’s high accuracy orthogroup inference to provide phylogenetic inference of orthologs, rooted gene trees, gene duplication events, the rooted species tree, and comparative genomics statistics. Each output is benchmarked on appropriate real or simulated datasets, and where comparable methods exist, OrthoFinder is equivalent to or outperforms these methods. Furthermore, OrthoFinder is the most accurate ortholog inference method on the Quest for Orthologs benchmark test. Finally, OrthoFinder’s comprehensive phylogenetic analysis is achieved with equivalent speed and scalability to the fastest, score-based heuristic methods. OrthoFinder is available at https://github.com/davidemms/OrthoFinder.
Journal Article
ASTRAL-Pro: Quartet-Based Species-Tree Inference despite Paralogy
2020
Phylogenetic inference from genome-wide data (phylogenomics) has revolutionized the study of evolution because it enables accounting for discordance among evolutionary histories across the genome. To this end, summary methods have been developed to allow accurate and scalable inference of species trees from gene trees. However, most of these methods, including the widely used ASTRAL, can only handle single-copy gene trees and do not attempt to model gene duplication and gene loss. As a result, most phylogenomic studies have focused on single-copy genes and have discarded large parts of the data. Here, we first propose a measure of quartet similarity between single-copy and multicopy trees that accounts for orthology and paralogy. We then introduce a method called ASTRAL-Pro (ASTRAL for PaRalogs and Orthologs) to find the species tree that optimizes our quartet similarity measure using dynamic programming. By studying its performance on an extensive collection of simulated data sets and on real data sets, we show that ASTRAL-Pro is more accurate than alternative methods.
Journal Article
Comprehensive Species Sampling and Sophisticated Algorithmic Approaches Refute the Monophyly of Arachnida
by Ballesteros, Jesús A; Benavides, Ligia R; Wheeler, Ward C
in Animals, Arachnida, Arachnida - genetics
2022
Deciphering the evolutionary relationships of Chelicerata (arachnids, horseshoe crabs, and allied taxa) has proven notoriously difficult, due to their ancient rapid radiation and the incidence of elevated evolutionary rates in several lineages. Although conflicting hypotheses prevail in morphological and molecular data sets alike, the monophyly of Arachnida is nearly universally accepted, despite historical lack of support in molecular data sets. Some phylotranscriptomic analyses have recovered arachnid monophyly, but these did not sample all living orders, whereas analyses including all orders have failed to recover Arachnida. To understand this conflict, we assembled a data set of 506 high-quality genomes and transcriptomes, sampling all living orders of Chelicerata with high occupancy and rigorous approaches to orthology inference. Our analyses consistently recovered the nested placement of horseshoe crabs within a paraphyletic Arachnida. This result was insensitive to variation in evolutionary rates of genes, complexity of the substitution models, and alternative algorithmic approaches to species tree inference. Investigation of sources of systematic bias showed that genes and sites that recover arachnid monophyly are enriched in noise and exhibit low information content. To test the impact of morphological data, we generated a 514-taxon morphological data matrix of extant and fossil Chelicerata, analyzed in tandem with the molecular matrix. Combined analyses recovered the clade Merostomata (the marine orders Xiphosura, Eurypterida, and Chasmataspidida), but merostomates appeared nested within Arachnida. Our results suggest that morphological convergence resulting from adaptations to life in terrestrial habitats has driven the historical perception of arachnid monophyly, paralleling the history of numerous other invertebrate terrestrial groups.
Journal Article
Orthology Inference in Nonmodel Organisms Using Transcriptomes and Low-Coverage Genomes: Improving Accuracy and Matrix Occupancy for Phylogenomics
2014
Orthology inference is central to phylogenomic analyses. Phylogenomic data sets commonly include transcriptomes and low-coverage genomes that are incomplete and contain errors and isoforms. These properties can severely violate the underlying assumptions of orthology inference with existing heuristics. We present a procedure that uses phylogenies for both homology and orthology assignment. The procedure first uses similarity scores to infer putative homologs that are then aligned, constructed into phylogenies, and pruned of spurious branches caused by deep paralogs, misassembly, frameshifts, or recombination. These final homologs are then used to identify orthologs. We explore four alternative tree-based orthology inference approaches, of which two are new. These accommodate gene and genome duplications as well as gene tree discordance. We demonstrate these methods in three published data sets including the grape family, Hymenoptera, and millipedes with divergence times ranging from approximately 100 to over 400 Ma. The procedure significantly increased the completeness and accuracy of the inferred homologs and orthologs. We also found that data sets that are more recently diverged and/or include more high-coverage genomes had more complete sets of orthologs. To explicitly evaluate sources of conflicting phylogenetic signals, we applied serial jackknife analyses of gene regions keeping each locus intact. The methods described here can scale to over 100 taxa. They have been implemented in Python with independent scripts for each step, making it easy to modify or incorporate them into existing pipelines. All scripts are available from https://bitbucket.org/yangya/phylogenomic_dataset_construction.
Journal Article