Catalogue Search | MBRL

Impacts of the Cretaceous Terrestrial Revolution and KPg Extinction on Mammal Diversification

by Ingram, Colleen M. , Westerman, Michael , Meredith, Robert W. in Amino acids , Animals , Biological Evolution

2011

Previous analyses of relations, divergence times, and diversification patterns among extant mammalian families have relied on supertree methods and local molecular clocks. We constructed a molecular supermatrix for mammalian families and analyzed these data with likelihood-based methods and relaxed molecular clocks. Phylogenetic analyses resulted in a robust phylogeny with better resolution than phylogenies from supertree methods. Relaxed clock analyses support the long-fuse model of diversification and highlight the importance of including multiple fossil calibrations that are spread across the tree. Molecular time trees and diversification analyses suggest important roles for the Cretaceous Terrestrial Revolution and Cretaceous-Paleogene (KPg) mass extinction in opening up ecospace that promoted interordinal and intraordinal diversification, respectively. By contrast, diversification analyses provide no support for the hypothesis concerning the delayed rise of present-day mammals during the Eocene Period.

Journal Article

Fast algorithms for computing phylogenetic divergence time

by Williams, Tiffani L. , Crosby, Ralph W. in Algorithms , Animals , Bioinformatics

2017

Background: The inference of species divergence time is a key step in most phylogenetic studies. Methods have been available for the last ten years to perform the inference, but their performance does not yet scale well to studies with hundreds of taxa and thousands of DNA base pairs. For example, a study of 349 primate taxa was estimated to require over 9 months of processing time. In this work, we present a new algorithm, AncestralAge, that significantly improves the performance of the divergence time process. Results: As part of AncestralAge, we demonstrate a new method for the computation of phylogenetic likelihood, and our experiments show a 90% improvement in likelihood computation time on the aforementioned dataset of 349 primate taxa with over 60,000 DNA base pairs. Additionally, we show that our new method for the computation of the Bayesian prior on node ages reduces the running time for this computation on the 349-taxon dataset by 99%. Conclusion: Through the use of these new algorithms, we open up the ability to perform divergence time inference on large phylogenetic studies.

Journal Article

MrsRF: an efficient MapReduce algorithm for analyzing large collections of evolutionary trees

by Matthews, Suzanne J , Williams, Tiffani L in Algorithms , Bioinformatics , Biomedical and Life Sciences

2010

Background: MapReduce is a parallel framework that has been used effectively to design large-scale parallel applications for large computing clusters. In this paper, we evaluate the viability of the MapReduce framework for designing phylogenetic applications. The problem of interest is generating the all-to-all Robinson-Foulds distance matrix, which has many applications for visualizing and clustering large collections of evolutionary trees. We introduce MrsRF (MapReduce Speeds up RF), a multi-core algorithm to generate a t × t Robinson-Foulds distance matrix between t trees using the MapReduce paradigm. Results: We studied the performance of our MrsRF algorithm on two large biological tree sets consisting of 20,000 trees of 150 taxa each and 33,306 trees of 567 taxa each. Our experiments show that MrsRF is a scalable approach, reaching a speedup of over 18 on 32 total cores. Our results also show that achieving top speedup on a multi-core cluster requires different cluster configurations. Finally, we show how to use an RF matrix to summarize collections of phylogenetic trees visually. Conclusion: Our results show that MapReduce is a promising paradigm for developing multi-core phylogenetic applications. The results also demonstrate that different multi-core configurations must be tested in order to obtain optimum performance. We conclude that RF matrices play a critical role in developing techniques to summarize large collections of trees.

Journal Article

Tree House Explorer: A Novel Genome Browser for Phylogenomics

by Foley, Nicole M , Harris, Andrew J , Williams, Tiffani L in Analysis , Annotations , Chromosomes

2022

Tree House Explorer (THEx) is a genome browser that integrates phylogenomic data and genomic annotations into a single interactive platform for combined analysis. THEx allows users to visualize genome-wide variation in evolutionary histories and genetic divergence on a chromosome-by-chromosome basis, with continuous sliding window comparisons to gene annotations, recombination rates, and other user-specified, highly customizable feature annotations. THEx provides a new platform for interactive phylogenomic data visualization to analyze and interpret the diverse evolutionary histories woven throughout genomes. Hosted on Conda, THEx integrates seamlessly into new or pre-existing workflows.

Journal Article

An efficient and extensible approach for compressing phylogenetic trees

by Matthews, Suzanne J , Williams, Tiffani L in Algorithms , Animals , Bioinformatics

2011

Background: Biologists require new algorithms to efficiently compress and store their large collections of phylogenetic trees. Our previous work showed that TreeZip is a promising approach for compressing phylogenetic trees. In this paper, we extend our TreeZip algorithm by handling trees with weighted branches. Furthermore, by using the compressed TreeZip file as input, we have designed an extensible decompressor that can extract subcollections of trees, compute majority and strict consensus trees, and merge tree collections using set operations such as union, intersection, and set difference. Results: On unweighted phylogenetic trees, TreeZip is able to compress Newick files in excess of 98%. On weighted phylogenetic trees, TreeZip is able to compress a Newick file by at least 73%. TreeZip can be combined with 7zip with little overhead, allowing space savings in excess of 99% (unweighted) and 92% (weighted). Unlike TreeZip, 7zip is not immune to branch rotations, and performs worse as the level of variability in the Newick string representation increases. Finally, since the TreeZip compressed text (TRZ) file contains all the semantic information in a collection of trees, we can easily filter and decompress a subset of trees of interest (such as the set of unique trees), or build the resulting consensus tree in a matter of seconds. We also show the ease with which set operations can be performed on TRZ files, at speeds quicker than those performed on Newick or 7zip compressed Newick files, and without loss of space savings. Conclusions: TreeZip is an efficient approach for compressing large collections of phylogenetic trees. The semantic and compact nature of the TRZ file allows it to be operated upon directly and quickly, without a need to decompress the original Newick file. We believe that TreeZip will be vital for compressing and archiving trees in the biological community.
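
The abstract's point about branch rotations and set operations can be illustrated with a small sketch. This is not the TreeZip algorithm or the TRZ format; it is a minimal Python illustration that assumes rooted trees written as nested tuples of leaf names, and the helper names `canonical` and `load_collection` are hypothetical. Two trees that differ only in the order of their children reduce to the same canonical form, so ordinary set union, intersection, and difference treat them as one tree.

```python
def canonical(tree):
    """Return a rotation-invariant form of a rooted tree.

    A tree is a leaf name (str) or a tuple of subtrees. Children are
    sorted recursively, so rotating branches does not change the result.
    """
    if isinstance(tree, str):
        return tree
    # repr() gives a total order over mixed leaves and subtrees.
    return tuple(sorted((canonical(child) for child in tree), key=repr))


def load_collection(trees):
    """Build a set of canonical trees (hypothetical helper for this sketch)."""
    return {canonical(t) for t in trees}


# Two illustrative collections; the first tree of run_b is a rotated
# copy of the first tree of run_a.
run_a = load_collection([(("A", "B"), ("C", "D")), (("A", "C"), ("B", "D"))])
run_b = load_collection([(("D", "C"), ("B", "A")), (("A", "D"), ("B", "C"))])

shared = run_a & run_b   # trees found in both collections
merged = run_a | run_b   # unique trees across both collections
only_a = run_a - run_b   # trees found only in run_a
```

A byte-level compressor sees the rotated Newick strings as different text, whereas a representation keyed on tree semantics, as in this sketch, does not.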

Journal Article

A New Support Measure to Quantify the Impact of Local Optima in Phylogenetic Analyses

by Tiffani L. Williams , Grant Brammer , Seung-Jin Sul in Analysis , Cladistic analysis , Original Research

2011

Phylogenetic analyses are often incorrectly assumed to have stabilized to a single optimum. However, a set of trees from a phylogenetic analysis may contain multiple distinct local optima, with each optimum providing different levels of support for each clade. For situations with multiple local optima, we propose p-support, a clade support measure that shows the impact optima have on a final consensus tree. Our p-support measure is implemented in our PeakMapper software package. We study our approach on two published, large-scale biological tree collections. PeakMapper shows that each dataset contains multiple local optima, and p-support shows that both datasets contain clades in the majority consensus tree that are only supported by a subset of the local optima. Clades with low p-support are most likely to benefit from further investigation. These tools provide researchers with new information regarding phylogenetic analyses beyond what is provided by other support measures alone.

Journal Article

Using tree diversity to compare phylogenetic heuristics

by Sul, Seung-Jin , Matthews, Suzanne , Williams, Tiffani L in Algorithms , Bioinformatics , Biological Evolution

2009

Background: Evolutionary trees are family trees that represent the relationships between a group of organisms. Phylogenetic heuristics are used to search stochastically for the best-scoring trees in tree space. Given that better tree scores are believed to be better approximations of the true phylogeny, traditional evaluation techniques have used tree scores to determine the heuristics that find the best scores in the fastest time. We develop new techniques to evaluate phylogenetic heuristics based on both tree scores and topologies to compare Pauprat and Rec-I-DCM3, two popular Maximum Parsimony search algorithms. Results: Our results show that although Pauprat and Rec-I-DCM3 find trees with the same best scores, topologically these trees are quite different. Furthermore, the Rec-I-DCM3 trees cluster distinctly from the Pauprat trees. In addition to our heatmap visualizations, which use parsimony scores and the Robinson-Foulds distance to compare best-scoring trees found by the two heuristics, we also develop entropy-based methods to show the diversity of the trees found. Overall, Pauprat identifies more diverse trees than Rec-I-DCM3. Conclusion: Overall, our work shows that there is value in comparing heuristics beyond the parsimony scores that they find. Pauprat is a slower heuristic than Rec-I-DCM3. However, our work shows that there is tremendous value in using Pauprat to reconstruct trees, especially since it finds identical scoring but topologically distinct trees. Hence, instead of discounting Pauprat, effort should go into improving its implementation. Ultimately, improved performance measures lead to better phylogenetic heuristics and will result in better approximations of the true evolutionary history of the organisms of interest.

Journal Article

Tree House Explorer: A Novel Genome Browser for Phylogenomics

by Foley, Nicole M , Harris, Andrew J , Williams, Tiffani L in Bioinformatics , Chromosomes , Genetic divergence

2022

Tree House Explorer (THEx) is a genome browser that integrates phylogenomic data and genomic annotations into a single interactive platform for combined analysis. THEx allows users to visualize genome-wide variation in evolutionary histories and genetic divergence on a chromosome-by-chromosome basis, with continuous sliding window comparisons to gene annotations, recombination rates, and other user-specified, highly customizable feature annotations. THEx provides a new platform for interactive phylogenomic data visualization to analyze and interpret the diverse evolutionary histories woven throughout genomes. Hosted on Conda, THEx integrates seamlessly into new or pre-existing workflows. Footnote: https://github.com/harris-2374/THEx

Paper

An efficient and extensible approach for compressing phylogenetic trees

by Matthews, Suzanne J , Williams, Tiffani L in Proceedings

2011

Journal Article

A Randomized Algorithm for Comparing Sets of Phylogenetic Trees

by Williams, Tiffani L. , Sul, Seung-Jin in Contributed Papers

2007

Phylogenetic analyses often produce a large number of candidate evolutionary trees, each a hypothesis of the "true" tree. Post-processing techniques such as strict consensus trees are widely used to summarize the evolutionary relationships into a single tree. However, valuable information is lost during the summarization process. A more elementary step is to produce estimates of the topological differences that exist among all pairs of trees. We design a new randomized algorithm, called Hash-RF, that computes the all-to-all Robinson-Foulds (RF) distance, the most common distance metric for comparing two phylogenetic trees. Our approach uses a hash table to organize the bipartitions of a tree, and a universal hashing function makes our algorithm randomized. We compare the performance of our Hash-RF algorithm to PAUP*'s implementation of computing the all-to-all RF distance matrix. Our experiments focus on the algorithmic performance of comparing sets of biological trees, where the size of each tree ranged from 500 to 2,000 taxa and the collection of trees varied from 200 to 1,000 trees. Our experimental results clearly show that our Hash-RF algorithm is up to 500 times faster than PAUP*'s approach. Thus, Hash-RF provides an efficient alternative to a single tree summary of a collection of trees and potentially gives researchers the ability to explore their data in new and interesting ways.
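
The idea of organizing bipartitions in a hash table to obtain an all-to-all RF matrix can be sketched in a few lines of Python. This is an illustration only, not the published Hash-RF implementation: it assumes trees on the same leaf set written as nested tuples of leaf names, it relies on Python's built-in `dict` hashing rather than a universal hash function, and it reports RF as the size of the symmetric difference of the two bipartition sets (some tools report half that value). The function names are hypothetical.

```python
from itertools import combinations


def leaf_sets(tree, acc):
    """Return the leaves under `tree`, appending each internal clade to `acc`."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, acc) for child in tree))
    acc.append(leaves)
    return leaves


def bipartitions(tree):
    """Nontrivial bipartitions of a tree, each stored as the side lacking an anchor leaf."""
    clades = []
    all_leaves = leaf_sets(tree, clades)
    anchor = min(all_leaves)
    splits = set()
    for clade in clades:
        # Canonical side: the half of the split that excludes the anchor.
        side = all_leaves - clade if anchor in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
    return splits


def rf_matrix(trees):
    """All-to-all RF distances via a table from bipartition to owning trees."""
    split_sets = [bipartitions(t) for t in trees]
    table = {}
    for i, splits in enumerate(split_sets):
        for s in splits:
            table.setdefault(s, []).append(i)
    n = len(trees)
    shared = [[0] * n for _ in range(n)]
    # Every pair of trees in the same bucket shares that bipartition.
    for owners in table.values():
        for i, j in combinations(owners, 2):
            shared[i][j] += 1
            shared[j][i] += 1
    return [
        [0 if i == j else len(split_sets[i]) + len(split_sets[j]) - 2 * shared[i][j]
         for j in range(n)]
        for i in range(n)
    ]


# Four illustrative 5-taxon trees; the last is a rotated copy of the first.
trees = [
    (("A", "B"), ("C", ("D", "E"))),
    (("A", "C"), ("B", ("D", "E"))),
    (("A", "D"), ("B", ("C", "E"))),
    ((("E", "D"), "C"), ("B", "A")),
]
matrix = rf_matrix(trees)
```

Because each bipartition is visited once per bucket rather than once per pair of trees, the pairwise comparison never rescans a whole tree, which is the intuition behind the table-based approach described in the abstract.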

Book Chapter
