Catalogue Search | MBRL
35 result(s) for "deep coalescence"
Patterns and causes of incongruence between plastid and nuclear Senecioneae (Asteraceae) phylogenies
by Nordenstam, Bertil; Tepe, Eric J.; Kadereit, Joachim W.
in ancient hybridization, Asteraceae, Biological evolution
2010
One of the longstanding questions in phylogenetic systematics is how to address incongruence among phylogenies obtained from multiple markers and how to determine the causes. This study presents a detailed analysis of incongruent patterns between plastid and ITS/ETS phylogenies of Tribe Senecioneae (Asteraceae). This approach revealed widespread and strongly supported incongruence, which complicates conclusions about evolutionary relationships at all taxonomic levels. The patterns of incongruence that were resolved suggest that incomplete lineage sorting (ILS) and/or ancient hybridization are the most likely explanations. These phenomena are, however, extremely difficult to distinguish because they may result in similar phylogenetic patterns. We present a novel approach to evaluate whether ILS can be excluded as an explanation for incongruent patterns. This coalescence-based method uses molecular dating estimates of the duration of the putative ILS events to determine if invoking ILS as an explanation for incongruence would require unrealistically high effective population sizes. For four of the incongruent patterns identified within the Senecioneae, this approach indicates that ILS cannot be invoked to explain the observed incongruence. Alternatively, these patterns are more realistically explained by ancient hybridization events.
Journal Article
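As a rough illustration of the coalescence-based reasoning in the Senecioneae abstract above, the following Python sketch asks how large the effective population size would need to be for two lineages to remain uncoalesced across a dated interval. It assumes the standard neutral coalescent for a diploid nuclear locus, where the probability of no coalescence after t generations is exp(-t / (2 Ne)). The function name, the 5% threshold, the generation time, and the interval length are all hypothetical and are not taken from the study, which may use a different formulation.

```python
import math


def min_ne_for_ils(duration_years, generation_time_years, prob_threshold=0.05):
    """Smallest diploid Ne at which two lineages stay uncoalesced for the
    whole interval with probability >= prob_threshold.

    Assumes P(no coalescence after t generations) = exp(-t / (2 * Ne)).
    """
    if not 0.0 < prob_threshold < 1.0:
        raise ValueError("prob_threshold must be strictly between 0 and 1")
    generations = duration_years / generation_time_years
    # Solve exp(-t / (2 * Ne)) = prob_threshold for Ne.
    return generations / (-2.0 * math.log(prob_threshold))


# Hypothetical inputs: a 2-million-year interval and a 2-year generation time.
ne = min_ne_for_ils(2_000_000, 2.0)
print(f"Ne needed for ILS to remain plausible: {ne:,.0f}")
```

If the resulting Ne is far larger than anything plausible for the group, incomplete lineage sorting becomes an unlikely explanation for that interval under these assumptions.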
Species Tree Discordance Traces to Phylogeographic Clade Boundaries in North American Fence Lizards (Sceloporus)
2009
I investigated the impacts of phylogeographic sampling decisions on species tree estimation in the Sceloporus undulatus species group, a recent radiation of small, insectivorous lizards connected by parapatric and peripatric distribution across North America, using a variety of species tree inference methods (Bayesian estimation of species trees, Bayesian untangling of concordance knots, and minimize deep coalescences). Phylogenetic analyses of 16 specimens representing 4 putative species within S. “undulatus” using complete (8 loci, >5.5 kb) and incomplete (29 loci, >23.6 kb) nuclear data sets result in species trees that share features with the mitochondrial DNA (mtDNA) genealogy at the phylogeographic level but provide new insights into the evolutionary history of the species group. The concatenated nuclear data and mtDNA data both recover 4 major clades connecting populations across North America; however, instances of discordance are localized at the contact zones between adjacent phylogeographic groups. A random sub-sampling experiment designed to vary the phylogeographic samples included across hundreds of replicate species tree inferences suggests that inaccurate species assignments can result in inferred phylogenetic relationships that are dependent upon which particular populations are used as exemplars to represent species and can lead to increased estimates of effective population size (θ). For the phylogeographic data presented here, reassigning specimens with introgressed mtDNA genomes to their prospective species, or excluding them from the analysis altogether, produces species tree topologies that are distinctly different from analyses that utilize mtDNA-based species assignments. 
Evolutionary biologists working at the interface of phylogeography and phylogenetics are likely to encounter multiple processes influencing gene tree congruence, which increases the relevance of estimating species trees with multilocus nuclear data and models that accommodate deep coalescence.
Journal Article
Assessing Approaches for Inferring Species Trees from Multi-Copy Genes
by Burleigh, J. Gordon; Chaudhary, Ruchi; Boussau, Bastien
in Biodiversity, Classification - methods, Coalescence
2015
With the availability of genomic sequence data, there is increasing interest in using genes with a possible history of duplication and loss for species tree inference. Here we assess the performance of both nonprobabilistic and probabilistic species tree inference approaches using gene duplication and loss and coalescence simulations. We evaluated the performance of gene tree parsimony (GTP) based on duplication (Only-dup), duplication and loss (Dup-loss), and deep coalescence (Deep-c) costs, the NJst distance method, the MulRF supertree method, and PHYLDOG, which jointly estimates gene trees and species tree using a hierarchical probabilistic model. We examined the effects of gene tree and species sampling, gene tree error, and duplication and loss rates on the accuracy of phylogenetic estimates. In the 10-taxon duplication and loss simulation experiments, MulRF is more accurate than the other methods when the duplication and loss rates are low, and Dup-loss is generally the most accurate when the duplication and loss rates are high. PHYLDOG performs well in 10-taxon duplication and loss simulations, but its run time is prohibitively long on larger data sets. In the larger duplication and loss simulation experiments, MulRF outperforms all other methods in experiments with at most 100 taxa; however, in the larger simulation, Dup-loss generally performs best. In all duplication and loss simulation experiments with more than 10 taxa, all methods perform better with more gene trees and fewer missing sequences, and they are all affected by gene tree error. Our results also highlight high levels of error in estimates of duplications and losses from GTP methods and demonstrate the usefulness of methods based on generic tree distances for large analyses.
Journal Article
Exact median-tree inference for unrooted reconciliation costs
by Markin, Alexey; Górecki, Paweł; Eulenstein, Oliver
in Costs, Dynamic programming, Exact solutions
2020
Background
Solving median tree problems under tree reconciliation costs is a classic and well-studied approach for inferring species trees from collections of discordant gene trees. These problems are NP-hard, and therefore are, in practice, typically addressed by local search heuristics. So far, however, such heuristics lack any provable correctness or precision. Further, even for small phylogenetic studies, it has been demonstrated that local search heuristics may only provide sub-optimal solutions. Obviating such heuristic uncertainties are exact dynamic programming solutions that allow solving tree reconciliation problems for smaller phylogenetic studies. Despite these promises, such exact solutions are only suitable for credibly rooted input gene trees, which constitute only a tiny fraction of the readily available gene trees. Standard gene tree inference approaches provide only unrooted gene trees and accurately rooting such trees is often difficult, if not impossible.
Results
Here, we describe complex dynamic programming solutions that represent the first nonnaïve exact solutions for solving the tree reconciliation problems for unrooted input gene trees. Further, we show that the asymptotic runtime of the proposed solutions does not increase when compared to the most time-efficient dynamic programming solutions for rooted input trees.
Conclusions
In an experimental evaluation, we demonstrate that the described solutions for unrooted gene trees are, like the solutions for rooted input gene trees, suitable for smaller phylogenetic studies. Finally, for the first time, we study the accuracy of classic local search heuristics for unrooted tree reconciliation problems.
Journal Article
Embedding gene trees into phylogenetic networks by conflict resolution algorithms
by Górecki, Paweł; Wawerka, Marcin; Dąbkowski, Dawid
in Algorithms, Bioinformatics, Biomedical and Life Sciences
2022
Background
Phylogenetic networks are mathematical models of evolutionary processes involving reticulate events such as hybridization, recombination, or horizontal gene transfer. One of the crucial notions in phylogenetic network modelling is displayed tree, which is obtained from a network by removing a set of reticulation edges. Displayed trees may represent an evolutionary history of a gene family if the evolution is shaped by reticulation events.
Results
We address the problem of inferring an optimal tree displayed by a network, given a gene tree $G$ and a tree-child network $N$, under the deep coalescence and duplication costs. We propose an $O(mn)$-time dynamic programming algorithm (DP) to compute a lower bound of the optimal displayed tree cost, where $m$ and $n$ are the sizes of $G$ and $N$, respectively. In addition, our algorithm can verify whether the solution is exact. Moreover, it provides a set of reticulation edges corresponding to the obtained cost. If the cost is exact, the set induces an optimal displayed tree. Otherwise, the set contains pairs of conflicting edges, i.e., edges sharing a reticulation node. Next, we show a conflict resolution algorithm that requires $2^{r+1}-1$ invocations of DP in the worst case, where $r$ is the number of reticulations. We propose a similar $O(2^{k}mn)$-time algorithm for level-$k$ tree-child networks and a branch and bound solution to compute lower and upper bounds of optimal costs. We also extend the algorithms to a broader class of phylogenetic networks. Based on simulated data, the average runtime is $\Theta(2^{0.543k}mn)$ under the deep-coalescence cost and $\Theta(2^{0.355k}mn)$ under the duplication cost.
Conclusions
Despite exponential complexity in the worst case, our algorithms perform well on empirical and simulated datasets, due to the strategy of resolving internal dissimilarities between gene trees and networks. Therefore, the algorithms are efficient alternatives to enumeration strategies commonly proposed in the literature and enable analyses of complex networks with dozens of reticulations.
Journal Article
Phylogenetic and Coalescent Strategies of Species Delimitation in Snubnose Darters (Percidae: Etheostoma)
2012
The rapid accumulation of multilocus data sets has led to dramatic advances in methodologies for estimating evolutionary relationships among closely related species, but relatively less advancement has been made in methods for discriminating between competing species delimitation hypotheses. Multilocus data sets provide an advantage in testing species delimitation scenarios because they offer a direct test of species monophyly and aid in the biological interpretation of such phenomena as allele-sharing and deep coalescent events. Most species tree estimation methods that are designed to analyze multilocus data sets require the a priori assignment of individuals to species categories and therefore do not provide a strategy to directly test competing species delimitation scenarios. An approach was recently proposed that utilizes a coalescent-based species tree estimation method to inform species delimitation decisions by comparing likelihood scores that measure the fit of gene trees within a given species tree. We use a multilocus nuclear and mitochondrial DNA sequence data set to both reexamine a recently proposed species delimitation scenario in the Etheostoma simoterum species complex and test the utility of species tree estimation methods in testing species delimitation hypotheses. Descriptions of species in the E. simoterum species complex of snubnose darters, a group of six teleost freshwater fish species, are based largely on male nuptial coloration. Most of the putative species are nonmonophyletic at every examined locus. Using a novel combination of Bayesian-estimated gene tree topologies, Bayesian phylogenetic species tree inferences, coalescent simulations, and examination of phenotypic variation, we assess the occurrence of shared alleles among species, and we propose that results from our analyses support a three-species rather than a six-species delimitation scenario in the E. simoterum complex. 
We found that comparing likelihood scores from the species tree estimation approach used across many potential delimitation scenarios resulted in a systematic bias toward over-splitting in the E. simoterum complex and failed to support a species delimitation scenario that was consistent with geography, phenotype, or any previous species delimitation hypothesis. Despite common expectations, we demonstrate that application of molecular approaches to species delimitation can result in the recognition of fewer, instead of a larger number of species. In addition, our analyses highlight the importance of phenotypic character information in providing an independent assessment of alternative species delimitation hypotheses in the E. simoterum species complex.
Journal Article
Gene tree parsimony for incomplete gene trees: addressing true biological loss
by Bayzid, Md Shamsuzzoha; Warnow, Tandy
in Algorithms, Bioinformatics, Biomedical and Life Sciences
2018
Motivation
Species tree estimation from gene trees can be complicated by gene duplication and loss, and “gene tree parsimony” (GTP) is one approach for estimating species trees from multiple gene trees. In its standard formulation, the objective is to find a species tree that minimizes the total number of gene duplications and losses with respect to the input set of gene trees. Although much is known about GTP, little is known about how to treat inputs containing some incomplete gene trees (i.e., gene trees lacking one or more of the species).
Results
We present new theory for GTP considering whether the incompleteness is due to gene birth and death (i.e., true biological loss) or taxon sampling, and present dynamic programming algorithms that can be used for an exact but exponential time solution for small numbers of taxa, or as a heuristic for larger numbers of taxa. We also prove that the “standard” calculations for duplications and losses exactly solve GTP when incompleteness results from taxon sampling, although they can be incorrect when incompleteness results from true biological loss. The software for the DP algorithm is freely available as open source code at https://github.com/smirarab/DynaDup.
Journal Article
Refining discordant gene trees
2014
Background
Evolutionary studies are complicated by discordance between gene trees and the species tree in which they evolved. Dealing with discordant trees often relies on comparison costs between gene and species trees, including the well-established Robinson-Foulds, gene duplication, and deep coalescence costs. While these costs have provided credible results for binary rooted gene trees, corresponding cost definitions for non-binary unrooted gene trees, which are frequently occurring in practice, are challenged by biological realism.
Result
We propose a natural extension of the well-established costs for comparing unrooted and non-binary gene trees with rooted binary species trees using a binary refinement model. For the duplication cost we describe an efficient algorithm that is based on a linear time reduction and also computes an optimal rooted binary refinement of the given gene tree. Finally, we show that similar reductions lead to solutions for computing the deep coalescence and the Robinson-Foulds costs.
Conclusion
Our binary refinement of Robinson-Foulds, gene duplication, and deep coalescence costs for unrooted and non-binary gene trees together with the linear time reductions provided here for computing these costs significantly extends the range of trees that can be incorporated into approaches dealing with discordance.
Journal Article
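The deep coalescence cost named in the abstract above can be illustrated for the simplest case it mentions, binary rooted gene trees. The Python sketch below counts, for each species tree branch, the gene lineages that pass through it and sums the extra lineages (count minus one). It assumes rooted trees written as nested tuples of taxon names, identical leaf sets in both trees, and one gene copy per species; the function names are hypothetical, and this is not the refinement algorithm of the paper, which targets unrooted and non-binary gene trees.

```python
def leaf_set(tree):
    """Return the set of taxon names below a nested-tuple tree node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def edge_clusters(tree):
    """List (child cluster, parent cluster) for every non-root node."""
    if isinstance(tree, str):
        return []
    parent = leaf_set(tree)
    pairs = []
    for child in tree:
        pairs.append((leaf_set(child), parent))
        pairs.extend(edge_clusters(child))
    return pairs


def deep_coalescence_cost(gene_tree, species_tree):
    """Sum of extra gene lineages over all species tree branches."""
    gene_edges = edge_clusters(gene_tree)
    total = 0
    for species_cluster, _ in edge_clusters(species_tree):
        # A gene lineage crosses this branch if it starts inside the
        # species cluster and its parent node lies outside it.
        lineages = sum(
            1
            for child, parent in gene_edges
            if child <= species_cluster and not parent <= species_cluster
        )
        total += lineages - 1
    return total


species = (("A", "B"), "C")
print(deep_coalescence_cost((("A", "B"), "C"), species))  # concordant tree
print(deep_coalescence_cost((("A", "C"), "B"), species))  # discordant tree
```

The quadratic scan over branches keeps the sketch short; methods that search over many candidate species trees would typically use an LCA-based mapping instead.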
Identifying Hybridization Events in the Presence of Coalescence via Model Selection
As DNA sequences have become more readily available, it has become increasingly desirable to infer species phylogenies from multigene data sets. Much recent work has centered around the recognition that substantial incongruence in single-gene phylogenies necessitates the development of statistical procedures to estimate species phylogenies that appropriately model the process of evolution at the level of the individual genes. One process that gives rise to variation in the histories of individual genes is incomplete lineage sorting, which is commonly modeled by the coalescent, and thus much current work is focused on proper estimation of species phylogenies under the coalescent model. A second common source of discord in single-gene phylogenies is hybridization, a process that is ubiquitous in many groups of plants and animals. Although methods to incorporate hybridization into phylogenetic estimation have also been developed, only a handful of methods that address both coalescence and hybridization have been proposed. Here, I propose an extension of an existing model that incorporates both of these processes simultaneously by utilizing gene trees for inference in a likelihood framework. The model allows examination of the evidence for hybridization in the presence of incomplete lineage sorting due to deep coalescence via model selection using standard information criteria (e.g., Akaike information criterion and Bayesian information criterion). The potential of the method is evaluated using simulated data.
Journal Article
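The model selection step mentioned in the abstract above relies on standard information criteria. A minimal Python sketch of that comparison follows, using the textbook definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The log-likelihoods, parameter counts, model labels, and the choice of the number of gene trees as the sample size n are hypothetical placeholders, not values or conventions from the paper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits: (maximized log-likelihood, number of free parameters).
models = {
    "tree (no hybridization)": (-1250.0, 4),
    "network (one hybridization)": (-1243.5, 5),
}
n_gene_trees = 100  # hypothetical sample size used for BIC

for name, (lnl, k) in models.items():
    print(f"{name}: AIC={aic(lnl, k):.2f}, BIC={bic(lnl, k, n_gene_trees):.2f}")

# Lower values indicate the preferred model under each criterion.
best_by_aic = min(models, key=lambda m: aic(*models[m]))
best_by_bic = min(models, key=lambda m: bic(*models[m], n_gene_trees))
print("Preferred by AIC:", best_by_aic)
print("Preferred by BIC:", best_by_bic)
```

Because BIC penalizes each extra parameter by ln n rather than 2, it favors the simpler tree model more strongly than AIC once n exceeds about 7.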
IDXL: Species Tree Inference Using Internode Distance and Excess Gene Leaf Count
by Mukherjee, Jayanta; Bhattacharyya, Sourya
in Algorithms, Animal Genetics and Genomics, Animals
2017
We propose an extension of the distance matrix methods NJst and ASTRID to infer species trees from incongruent gene trees having Incomplete Lineage Sorting. Both approaches consider the average internode distance (ID) between individual taxa pairs as the distance measure. The measure ID does not use the root of a tree, and thus may not always infer the relative position of a taxon with respect to the root. We define a novel distance measure excess gene leaf count (XL) between individual couplets. The XL measure is computed using the root of a tree. It is proved to be additive, and is shown to infer the relative order of divergence among individual couplets better. We propose a novel method IDXL which uses both the XL and ID measures for species tree construction. IDXL is shown to perform better than NJst and other distance matrix approaches for most of the biological and simulated datasets. Having the same computational complexity as NJst, IDXL can be applied for species tree inference on large-scale biological datasets.
Journal Article
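The average internode distance (ID) described in the IDXL abstract above can be sketched as follows. This Python example treats each gene tree as an unrooted tree given by an edge list, counts the internal nodes on the path between two taxa, and averages that count over gene trees. The function names and the toy quartet trees are hypothetical, every gene tree is assumed to contain all taxa, and published tools may define the distance with a different constant offset. The XL measure is not shown because the abstract does not define it in enough detail.

```python
from collections import defaultdict, deque


def internode_distances(edges, taxa):
    """Internal nodes on the path between each pair of taxa in one tree."""
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    result = {}
    for source in taxa:
        # Breadth-first search gives the edge count from source to each node.
        depth = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in depth:
                    depth[neighbour] = depth[node] + 1
                    queue.append(neighbour)
        for target in taxa:
            if source < target:
                # A path with e edges passes through e - 1 internal nodes.
                result[(source, target)] = depth[target] - 1
    return result


def average_internode_distance(gene_trees, taxa):
    """Average the per-tree internode distances over all gene trees."""
    totals = defaultdict(float)
    for edges in gene_trees:
        for pair, dist in internode_distances(edges, taxa).items():
            totals[pair] += dist
    return {pair: total / len(gene_trees) for pair, total in totals.items()}


# Two hypothetical quartet gene trees: AB|CD and AC|BD.
tree_ab_cd = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("D", "y")]
tree_ac_bd = [("A", "x"), ("C", "x"), ("x", "y"), ("B", "y"), ("D", "y")]
distances = average_internode_distance([tree_ab_cd, tree_ac_bd], ["A", "B", "C", "D"])
for pair, value in sorted(distances.items()):
    print(pair, value)
```

The resulting pairwise matrix is the kind of input a distance-based method such as neighbor joining would take to build a species tree.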