Catalogue Search | MBRL
123 result(s) for "Alignment-free"
Improved metagenomic analysis with Kraken 2
by Lu, Jennifer; Wood, Derrick E.; Langmead, Ben
in Algorithms, Alignment-free methods, Animal Genetics and Genomics
2019
Although Kraken’s k-mer-based approach provides a fast taxonomic classification of metagenomic sequence data, its large memory requirements can be limiting for some applications. Kraken 2 improves upon Kraken 1 by reducing memory usage by 85%, allowing greater amounts of reference genomic data to be used, while maintaining high accuracy and increasing speed fivefold. Kraken 2 also introduces a translated search mode, providing increased sensitivity in viral metagenomics analysis.
Journal Article
Syncmers are more sensitive than minimizers for selecting conserved k-mers in biological sequences
2021
Minimizers are widely used to select subsets of fixed-length substrings (k-mers) from biological sequences in applications ranging from read mapping to taxonomy prediction and indexing of large datasets. The minimizer of a string of w consecutive k-mers is the k-mer with smallest value according to an ordering of all k-mers. Syncmers are defined here as a family of alternative methods which select k-mers by inspecting the position of the smallest-valued substring of length s < k within the k-mer. For example, a closed syncmer is selected if its smallest s-mer is at the start or end of the k-mer. At least one closed syncmer must be found in every window of length (k − s) k-mers. Unlike a minimizer, a syncmer is identified by its sequence alone, and is therefore synchronized in the following sense: if a given k-mer is selected from one sequence, it will also be selected from any other sequence. Also, minimizers can be deleted by mutations in flanking sequence, which cannot happen with syncmers. Experiments on minimizers with parameters used in the minimap2 read mapper and Kraken taxonomy prediction algorithm respectively show that syncmers can simultaneously achieve both lower density and higher conservation compared to minimizers.
Journal Article
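The closed-syncmer rule described in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function name `closed_syncmers` is hypothetical, s-mers are ordered lexicographically rather than by a hash, and ties are resolved by accepting a k-mer if any copy of its smallest s-mer sits at the first or last position. All three choices are assumptions made for illustration.

```python
def closed_syncmers(seq, k, s):
    """Return (position, k-mer) pairs for the closed syncmers of seq.

    Assumptions for this sketch: s-mers are compared lexicographically,
    and a tie for the smallest s-mer counts if it occurs at either end.
    """
    if not 0 < s < k:
        raise ValueError("require 0 < s < k")
    selected = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # All k - s + 1 substrings of length s inside this k-mer.
        smers = [kmer[j:j + s] for j in range(k - s + 1)]
        smallest = min(smers)
        # Closed syncmer: the smallest s-mer is at the start or the end.
        if smers[0] == smallest or smers[-1] == smallest:
            selected.append((i, kmer))
    return selected


if __name__ == "__main__":
    # "AC" is the smallest 2-mer of ACGTA and sits at its start.
    print(closed_syncmers("ACGTA", 5, 2))  # [(0, 'ACGTA')]
```

Because the decision uses only the k-mer's own letters, the same k-mer is selected no matter which sequence surrounds it, which is the "synchronized" property the abstract refers to.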
Alignment-Free Wireless Charging of Smart Garments with Embroidered Coils
by Riehl, Patrick; Lin, Jenshan; Chang, Chin-Wei
in alignment-free wireless charging, Design, Digitization
2021
Wireless power transfer (WPT) technologies have been adopted by many products. The capability of charging multiple devices and the design flexibility of charging coils make WPT a good solution for charging smart garments. The use of an embroidered receiver (RX) coil makes the smart garment more breathable and comfortable than using a flexible printed circuit board (FPCB). In order to charge smart garments as part of normal daily routines, two types of wireless-charging systems operating at 400 kHz have been designed. The one-to-one hanger system is desired to have a constant charging current despite misalignment so that users do not need to pay much attention when they hang the garment. For the one-to-multiple-drawer system, the power delivery ability must not change with multiple garments. Additionally, the system should be able to charge folded garments in most of the folding scenarios. This paper analyses the two WPT systems for charging smart garments and provides design approaches to meet the abovementioned goals. The wireless-charging hanger is able to charge a smart garment over a coupling variance k_max/k_min = 2 with only 21% charging current variation. The wireless-charging drawer is able to charge a smart garment with at least 20 mA under most folding scenarios and three garments with stable power delivery ability.
Journal Article
A genome Tree of Life for the Fungi kingdom
2017
Fungi belong to one of the largest and most diverse kingdoms of living organisms. The evolutionary kinship within a fungal population has so far been inferred mostly from the gene-information-based trees (“gene trees”), constructed commonly based on the degree of differences of proteins or DNA sequences of a small number of highly conserved genes common among the population by a multiple sequence alignment (MSA) method. Since each gene evolves under different evolutionary pressure and time scale, it has been known that one gene tree for a population may differ from other gene trees for the same population depending on the subjective selection of the genes. Within the last decade, a large number of whole-genome sequences of fungi have become publicly available, which represent, at present, the most fundamental and complete information about each fungal organism. This presents an opportunity to infer kinship among fungi using a whole-genome information-based tree (“genome tree”). The method we used allows comparison of whole-genome information without MSA, and is a variation of a computational algorithm developed to find semantic similarities or plagiarism in two books, where we represent whole-genomic information of an organism as a book of words without spaces. The genome tree reveals several significant and notable differences from the gene trees, and these differences invoke new discussions about alternative narratives for the evolution of some of the currently accepted fungal groups.
Journal Article
Benchmarking of alignment-free sequence comparison methods
by Zielezinski, Andrzej; Bernard, Guillaume; Kim, Sung-Hou
in Algorithms, Alignment-free, Amino acid sequence
2019
Background
Alignment-free (AF) sequence comparison is attracting persistent interest driven by data-intensive applications. Hence, many AF procedures have been proposed in recent years, but a lack of a clearly defined benchmarking consensus hampers their performance assessment.
Results
Here, we present a community resource (http://afproject.org) to establish standards for comparing alignment-free approaches across different areas of sequence-based research. We characterize 74 AF methods available in 24 software tools for five research applications, namely, protein sequence classification, gene tree inference, regulatory element detection, genome-based phylogenetic inference, and reconstruction of species trees under horizontal gene transfer and recombination events.
Conclusion
The interactive web service allows researchers to explore the performance of alignment-free tools relevant to their data types and analytical goals. It also allows method developers to assess their own algorithms and compare them with current state-of-the-art tools, accelerating the development of new, more accurate AF solutions.
Journal Article
Reference-Free Variant Calling with Local Graph Construction with ska lo (SKA)
by Rodríguez-Bouza, Víctor; Lalvani, Ajit; Madon, Kieran
in Algorithms, Polymorphism, Single Nucleotide; Software
2025
The study of genomic variants is increasingly important for public health surveillance of pathogens. Traditional variant-calling methods from whole-genome sequencing data rely on reference-based alignment, which can introduce biases and require significant computational resources. Alignment- and reference-free approaches offer an alternative by leveraging k-mer-based methods, but existing implementations often suffer from sensitivity limitations, particularly in high mutation density genomic regions. Here, we present ska lo, a graph-based algorithm that aims to identify within-strain variants in pathogen whole-genome sequencing data by traversing a colored De Bruijn graph and building variant groups (i.e. sets of variant combinations). Through in silico benchmarking and real-world dataset analyses, we demonstrate that ska lo achieves high sensitivity in single-nucleotide polymorphism (SNP) calls while also enabling the detection of insertions and deletions, as well as SNP positioning on a reference genome for recombination analyses. These findings highlight ska lo as a simple, fast, and effective tool for pathogen genomic epidemiology, extending the range of reference-free variant-calling approaches. ska lo is freely available as part of the SKA program (https://github.com/bacpop/ska.rust).
Journal Article
Skmer: assembly-free and alignment-free sample identification using genome skims
by Bafna, Vineet; Sarmashghi, Shahab; P. Gilbert, M. Thomas
in Accuracy, Alignment-free, Animal Genetics and Genomics
2019
The ability to inexpensively describe taxonomic diversity is critical in this era of rapid climate and biodiversity changes. The recent genome-skimming approach extends current barcoding practices beyond short markers by applying low-pass sequencing and recovering whole organelle genomes computationally. This approach discards the nuclear DNA, which constitutes the vast majority of the data. In contrast, we suggest using all unassembled reads. We introduce an assembly-free and alignment-free tool, Skmer, to compute genomic distances between the query and reference genome skims. Skmer shows excellent accuracy in estimating distances and identifying the closest match in reference datasets.
Journal Article
MeShClust v3.0: high-quality clustering of DNA sequences using the mean shift algorithm and alignment-free identity scores
2022
Background
Tools for accurately clustering biological sequences are among the most important tools in computational biology. Two pioneering tools for clustering sequences are CD-HIT and UCLUST, both of which are fast and consume reasonable amounts of memory; however, there is considerable room for improvement in terms of cluster quality. Motivated by this opportunity for improving cluster quality, we applied the mean shift algorithm in MeShClust v1.0. The mean shift algorithm is an instance of unsupervised learning. Its strong theoretical foundation guarantees the convergence to the true cluster centers. Our implementation of the mean shift algorithm in MeShClust v1.0 was a step forward. In this work, we scale up the algorithm by adapting an out-of-core strategy while utilizing alignment-free identity scores in a new tool: MeShClust v3.0.
Results
We evaluated CD-HIT, MeShClust v1.0, MeShClust v3.0, and UCLUST on 22 synthetic sets and five real sets. These data sets were designed or selected for testing the tools in terms of scalability and different similarity levels among sequences comprising clusters. On the synthetic data sets, MeShClust v3.0 outperformed the related tools on all sets in terms of cluster quality. On two real data sets obtained from human microbiome and maize transposons, MeShClust v3.0 outperformed the related tools by wide margins, achieving 55%–300% improvement in cluster quality. On another set that includes degenerate viral sequences, MeShClust v3.0 came third. On two bacterial sets, MeShClust v3.0 was the only applicable tool because of the long sequences in these sets. MeShClust v3.0 requires more time and memory than the related tools; almost all personal computers at the time of this writing can accommodate such requirements. MeShClust v3.0 can estimate an important parameter that controls cluster membership with high accuracy.
Conclusions
These results demonstrate the high quality of clusters produced by MeShClust v3.0 and its ability to apply the mean shift algorithm to large data sets and long sequences. Because clustering tools are utilized in many studies, providing high-quality clusters will help with deriving accurate biological knowledge.
Journal Article
An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data
by Cannon, Charles H.; Fan, Huan; Ives, Anthony R.
in Algorithms, Analysis, Animal Genetics and Genomics
2015
Background
Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.
Results
To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.
Conclusion
Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.
Journal Article
Hierarchical Interleaved Bloom Filter: enabling ultrafast, approximate sequence queries
by Seiler, Enrico; Droop, Felix; Vingron, Martin
in Algorithms, Alignment free analysis, Animal Genetics and Genomics
2023
We present a novel data structure for searching sequences in large databases: the Hierarchical Interleaved Bloom Filter (HIBF). It is extremely fast and space efficient, yet so general that it could serve as the underlying engine for many applications. We show that the HIBF is superior in build time, index size, and search time while achieving a comparable or better accuracy compared to other state-of-the-art tools. The HIBF builds an index up to 211 times faster, using up to 14 times less space, and can answer approximate membership queries faster by a factor of up to 129.
Journal Article