Catalogue Search | MBRL

Finding the Number of Spanning Trees in Specific Graph Sequences Generated by a Johnson Skeleton Graph

by Daoud, Salama Nagy , Asiri, Ahmad in Analysis , Difference equations , electrically equivalent transformations

2025

Using equivalent transformations, complicated circuits in physics that need numerous mathematical operations to analyze can be broken down into simpler equivalent circuits. These transformations can also be used to determine the number of spanning trees of graphs, and of graph families in particular, in combination with difference equations, electrically equivalent transformations, and weighted generating function rules. In this paper, we derive exact formulas for the number of spanning trees of sequences of new graph families created by a Johnson skeleton graph 63 and a few of its related graphs. Lastly, the entropy of our graphs is compared with that of other graphs of average degree four.
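As a point of reference for what "number of spanning trees" means computationally, the sketch below counts spanning trees of a small graph with Kirchhoff's matrix-tree theorem. This is a swapped-in, standard technique, not the electrically equivalent transformation method the abstract describes; the function name `count_spanning_trees` and the edge-list input format are assumptions made for illustration.

```python
from fractions import Fraction


def count_spanning_trees(n, edges):
    """Count spanning trees of an undirected graph on vertices 0..n-1.

    Uses the matrix-tree theorem: the count equals any cofactor of the
    graph Laplacian. Exact rational arithmetic avoids rounding errors.
    """
    if n == 1:
        return 1
    # Build the Laplacian L = D - A (parallel edges are allowed).
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    # Delete the last row and column, then take the determinant
    # by Gaussian elimination.
    m = [row[:-1] for row in lap[:-1]]
    size = n - 1
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0  # singular minor: the graph is disconnected
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)


# Complete graph K4: Cayley's formula gives 4**(4 - 2) spanning trees.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(count_spanning_trees(4, k4_edges))
```

For graph sequences such as those in the article, a direct determinant grows expensive as the graphs grow, which is presumably why closed-form formulas are of interest.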

Journal Article

Paragraph: a graph-based structural variant genotyper for short-read sequence data

by Chen, Sai , Sedlazeck, Fritz J. , Krusche, Peter in Accuracy , Algorithms , ancestry

2019

Accurate detection and genotyping of structural variations (SVs) from short-read data is a long-standing area of development in genomics research and clinical sequencing pipelines. We introduce Paragraph, an accurate genotyper that models SVs using sequence graphs and SV annotations. We demonstrate the accuracy of Paragraph on whole-genome sequence data from three samples using long-read SV calls as the truth set, and then apply Paragraph at scale to a cohort of 100 short-read sequenced samples of diverse ancestry. Our analysis shows that Paragraph has better accuracy than other existing genotypers and can be applied to population-scale studies.

Journal Article

LCSkPOA: enabling banded semi-global partial order alignments via efficient and accurate backbone generation through extended LCSk

by Weerakoon, Minindu , Saunders, Christopher T. , Heaton, Haynes in Algorithms , Bioinformatics , Biomedical and Life Sciences

2025

Background Most multiple sequence alignment and string-graph alignment algorithms focus on global alignment, but many applications exist for semi-global and local string-graph alignment. Long reads require enormous amounts of memory and runtime to fill out large dynamic programming tables. Effective algorithms for finding the backbone, and thus defining a band of an alignment, such as the longest common subsequence with kmer matches (LCSk++), exist but do not work with graphs. This study introduces an adaptation of the LCSk++ algorithm tailored for graph structures, particularly focusing on Partial Order Alignment (POA) graphs. POA graphs, which are directed acyclic graphs, represent multiple sequence alignments and effectively capture the relationships between sequences. State-of-the-art methods like ABPOA and SPOA improve upon POA. ABPOA incorporates banding, whereas SPOA does not; neither utilizes parallel processing, although both leverage SIMD for faster matrix calculations. Our approach addresses these limitations by extending the LCSk++ algorithm to handle the complexities of graph-based alignment while incorporating SIMD, banding, and parallel processing for enhanced efficiency. Results Our extended LCSk++ algorithm integrates dynamic programming and graph traversal techniques to detect conserved regions within POA graphs, termed the LCSk++ backbone. This backbone enables precise banding of the POA matrix for all alignment modes (global, semi-global, and local). Unlike ABPOA, which only allows banded global alignment, our approach enables broader flexibility and significantly improves consensus sequence construction. While supporting more alignment modes than ABPOA, it also outperforms SPOA’s global alignment, with substantial memory savings (up to 98%) and significant run-time reductions (up to 25x), particularly for long sequences (> 30,000 bp). 
Our method maintains high alignment accuracy and proves effective across various string lengths and datasets, including synthetic and PacBio HiFi reads. Parallel processing further enhances runtime efficiency, achieving up to 150x speed improvements on conventional PCs. Conclusion The extended LCSk++ algorithm for graph structures offers a substantial advancement in sequence alignment technology. It effectively reduces memory consumption and optimizes run times without compromising alignment quality, thus providing a robust solution for all alignment modes (global, local, and semi-global) in POA. This method enhances the utility of POA in critical applications such as multiple sequence alignment for phylogeny construction and graph-based reference alignment.
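To make the backbone idea concrete, here is a minimal sketch of plain LCSk on two strings: the largest number of non-overlapping, order-preserving matches of length-k substrings. It is the simpler string-to-string precursor, not the LCSk++ variant and not the graph extension the article introduces; the function name `lcsk` and the quadratic table are assumptions chosen for clarity rather than efficiency.

```python
def lcsk(a, b, k):
    """Return the LCSk score of strings a and b.

    dp[i][j] is the best number of non-overlapping k-length matches
    between the prefixes a[:i] and b[:j].
    """
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Skip one character from either string.
            best = max(dp[i - 1][j], dp[i][j - 1])
            # Or end a k-length match exactly at positions i and j.
            if i >= k and j >= k and a[i - k:i] == b[j - k:j]:
                best = max(best, dp[i - k][j - k] + 1)
            dp[i][j] = best
    return dp[n][m]


# "AB" and "DE" are shared 2-mers that appear in the same order.
print(lcsk("ABCDE", "ABXDE", 2))
```

With k = 1 the recurrence reduces to the classic longest common subsequence. The matched k-mers trace a diagonal-like path through the table, which is the intuition behind using such a chain to define a band for alignment.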

Journal Article

SGDAN—A Spatio-Temporal Graph Dual-Attention Neural Network for Quantified Flight Delay Prediction

by Pan, Li , Bian, Lei , Liu, Shijun in air traffic data , attention , flight delay

2020

There has been a lot of research on flight delays. However, it is both more useful and more difficult to estimate the departure delay time three hours before the scheduled time of departure, so that passengers can plan their travel time and airline and airport staff can schedule flights more reasonably. In this paper, we develop a Spatio-temporal Graph Dual-Attention Neural Network (SGDAN) to learn the departure delay time for each flight from real-time conditions three hours before the scheduled time of departure. Specifically, it first models the air traffic network as graph sequences, that is, it uses a heterogeneous graph to model a flight and its adjacent flights with the same departure or arrival airport in a given time interval, and a sequence to model the flight and its previous flights that share the same aircraft. The main contributions of this paper are the use of heterogeneous graph-level attention to learn the influence between the flight and its adjacent flights, together with sequence-level attention to learn the influence between the flight and its previous flights in the flight sequence. By aggregating features from the influence learned through both graph-level and sequence-level attention, SGDAN can generate node embeddings to estimate the departure delay time. Experiments on a real-world large-scale data set show that SGDAN produces better results than state-of-the-art models in the task of accurately estimating flight delay time.

Journal Article

Read mapping on de Bruijn graphs

by Limasset, Antoine , Cazaux, Bastien , Rivals, Eric in Algorithms , Bioinformatics , Biomedical and Life Sciences

2016

Background Next Generation Sequencing (NGS) has dramatically enhanced our ability to sequence genomes, but not to assemble them. In practice, many published genome sequences remain in the state of a large set of contigs. Each contig describes the sequence found along some path of the assembly graph; however, the set of contigs does not record all the sequence information contained in that graph. Although many subsequent analyses can be performed with the set of contigs, one may ask whether mapping reads on the contigs is as informative as mapping them on the paths of the assembly graph. Currently, practical tools to perform mapping on such graphs are lacking. Results Here, we propose a formal definition of mapping on a de Bruijn graph, analyse the complexity of the problem, which turns out to be NP-complete, and provide a practical solution. We propose a pipeline called GGMAP (Greedy Graph MAPping). Its novelty is a procedure to map reads on branching paths of the graph, for which we designed a heuristic algorithm called BGREAT (de Bruijn Graph REAd mapping Tool). For the sake of efficiency, BGREAT rewrites a read sequence as a succession of unitig sequences. GGMAP can map millions of reads per CPU hour on a de Bruijn graph built from a large set of human genomic reads. Surprisingly, results show that up to 22% more reads can be mapped on the graph but not on the contig set. Conclusions Although mapping reads on a de Bruijn graph is a complex task, our proposal offers a practical solution combining efficiency with an improved mapping capacity compared to assembly-based mapping, even for complex eukaryotic data.
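The difference between mapping on contigs and mapping on graph paths can be illustrated with a toy de Bruijn graph. The sketch below is an assumption-laden illustration, not BGREAT or GGMAP: it builds an edge-centric graph from reads and only tests exact, error-free mapping, whereas the article's tools use heuristics on real data. The names `build_de_bruijn` and `read_maps_to_path` are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer links its (k-1)-prefix to its (k-1)-suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def read_maps_to_path(graph, read, k):
    """Return True if every k-mer of the read is an edge of the graph,
    i.e. the read spells out a walk through the graph (exact match only)."""
    if len(read) < k:
        return False
    return all(
        read[i + 1:i + k] in graph.get(read[i:i + k - 1], ())
        for i in range(len(read) - k + 1)
    )


graph = build_de_bruijn(["ACGTA", "CGTAC"], k=3)
# "ACGTAC" is contained in neither input read, yet it follows a graph path.
print(read_maps_to_path(graph, "ACGTAC", 3))
print(read_maps_to_path(graph, "ACGTT", 3))
```

The first query succeeds even though the sequence spans two input reads, which is a small-scale version of why some reads map to the graph but not to any single contig.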

Journal Article

Phased epigenomics and methylation inheritance in a historical Vitis vinifera hybrid

by Cochetel, Noé , Liou, Joel , Vondras, Amanda M. in Animal Genetics and Genomics , asexual reproduction , Bioinformatics

2025

Background Epigenetic modifications, such as DNA methylation, regulate transcription and influence key biological traits. While many efforts were made to understand their stability in annual crops, their long-term persistence in clonally propagated plants remains poorly understood. Grapevine ( Vitis vinifera ) provides a unique model, with cultivars vegetatively propagated for centuries. Results Here, we assemble the phased genomes of Cabernet Sauvignon and its parental lineages, Cabernet Franc and Sauvignon Blanc, using HiFi long-reads and a gene map tenfold denser than existing maps. Using three clones per cultivar, we quantify methylation with very consistent short- and long-read sequencing and ensure both varietal representativeness and assessment of clonal variability. We leverage the parent-progeny sequence graph to highlight allele-specific methylation and conserved transcriptomic patterns for genes and small RNA. Such a format is essential to integrate multi-omics data and reveals that, despite less clonal conservation than genetic polymorphisms, methylation marks are remarkably inherited. By further demonstrating the linear-reference limitations, we determine that the correct representation of genetic variants by the sequence graph is crucial for the accurate allelic quantification of the methylome. Conclusions These findings reveal the remarkable stability of epigenetic marks in a model propagated by asexual reproduction. Using a phased sequence graph, we introduce a scalable framework that accounts for genomic variation, accurately quantifies allele-specific methylation, and supports multi-omics integration such as our evaluation of the transcriptional impact of epigenetic inheritance. This approach has broad implications for perennial crops, where epigenetic variation could influence traits relevant to breeding, adaptation, and long-term agricultural sustainability.

Journal Article

Restriction of the Global IgM Repertoire in Antiphospholipid Syndrome

by Pashova, Shina , Shivarov, Velizar , Pashov, Anastas in Amino acids , Antibodies , Antibodies, Antiphospholipid

2022

The typical anti-phospholipid antibodies (APLA) in the anti-phospholipid syndrome (APS) are reactive with the phospholipid-binding protein β2GPI as well as a growing list of other protein targets. The relation of APLA to natural antibodies and the fuzzy set of autoantigens involved provoked us to study the changes in the IgM repertoire in APS. To this end, peptides selected by serum IgM from a 7-residue linear peptide phage display library (PDL) were deep sequenced. The analysis was aided by a novel formal representation of the Igome (the mimotope set reflecting the IgM specificities) in the form of a sequence graph. The study involved women with APLA and habitual abortions (n=24) compared to age-matched clinically healthy pregnant women (n=20). Their pooled Igomes (297 028 mimotope sequences) were compared also to the global public repertoire Igome of pooled donor plasma IgM (n=2 796 484) and a set of 7-mer sequences found in the J regions of human immunoglobulins (n=4 433 252). The pooled Igome was represented as a graph connecting the sequences as similar as the mimotopes of the same monoclonal antibody. The criterion was based on previously published data. In the resulting graph, identifiable clusters of vertices were considered related to the footprints of overlapping antibody cross-reactivities. A subgraph based on the clusters with a significant differential expression of APS patients’ mimotopes contained predominantly specificities underrepresented in APS. The differentially expressed IgM footprints showed also an increased cross-reactivity with immunoglobulin J regions. The specificities underexpressed in APS had a higher correlation with public specificities than those overexpressed. The APS associated specificities were strongly related also to the human peptidome with 1 072 mimotope sequences found in 7 519 human proteins. These regions were characterized by low complexity. 
Thus, the IgM repertoire of the APS patients was found to be characterized by a significant reduction of certain public specificities found in the healthy controls with targets representing low complexity linear self-epitopes homologous to human antibody J regions.

Journal Article

A Study On The Number Of Edges Of Some Families Of Graphs And Generalized Mersenne Numbers

by Ramesh, Kumar P , Sreekumar, K G , Manilal, K in Balancing , Computers , Cryptography

2022

This study demonstrates the relationship between the Nandu sequences of the SM family of graphs and the generalized Mersenne numbers. Nandu sequences are the sequences obtained from the peculiar number of edges of the SM family of graphs, and they are related to its two families, the SM sum graphs and the SM balancing graphs. The SM sum graphs are established from the inherent relationship between powers of 2 and natural numbers, whereas the SM balancing graphs are linked to the balanced ternary number system. In addition, some unusual prime numbers are discovered in this paper. These prime numbers are well suited as an alternative to the Mersenne primes in public key cryptosystems.

Journal Article

Pebble Traversal-Based Fault Detection and Advanced Reconfiguration Technique for Digital Microfluidic Biochips

by Majumder, Mukta , Saha, Basudev , Shukla, Vineeta in Air monitoring , Air quality , Analyzers

2024

Digital Microfluidic Biochips (DMFBs) are rapidly replacing conventional biomedical analyzers by incorporating diverse bioassay operations with better throughput and precision at a negligible cost. In the last decade, these microfluidic devices have been widely adopted in healthcare applications such as DNA sequencing, drug discovery, drug screening, and clinical diagnosis, and in other safety-critical fields such as air quality monitoring and food safety testing. Given these application areas, the devices must be reliable, accurate, and robust. The correctness of a microfluidic device must be ensured through a superior testing technique before it is accepted for use in various applications. In this paper, an optimized fault modelling strategy to detect multiple faults in a digital microfluidic biochip is introduced by embedding clockwise and anticlockwise movements of droplets using Pebble Traversal (based on Pebble Motion of Graph Theory). The suggested method also calculates traversal time for a fault-free biochip. In addition, this work presents an Advanced Module Sequence Graph-based reconfiguration technique to reinstate the microfluidic device for regular bioassays.

Journal Article

Pan-genome de Bruijn graph using the bidirectional FM-index

by Renders, Luca , Abeel, Thomas , Depuydt, Lore in Algorithms , Approximate pattern matching , Bioinformatics

2023

Background Pan-genome graphs are gaining importance in the field of bioinformatics as data structures to represent and jointly analyze multiple genomes. Compacted de Bruijn graphs are inherently suited for this purpose, as their graph topology naturally reveals similarity and divergence within the pan-genome. Most state-of-the-art pan-genome graphs are represented explicitly in terms of nodes and edges. Recently, an alternative, implicit graph representation was proposed that builds directly upon the unidirectional FM-index. As such, a memory-efficient graph data structure is obtained that inherits the FM-index’ backward search functionality. However, this representation suffers from a number of shortcomings in terms of functionality and algorithmic performance. Results We present a data structure for a pan-genome, compacted de Bruijn graph that aims to address these shortcomings. It is built on the bidirectional FM-index, extending the ability of its unidirectional counterpart to navigate and search the graph in both directions. All basic graph navigation steps can be performed in constant time. Based on these features, we implement subgraph visualization as well as lossless approximate pattern matching to the graph using search schemes. We demonstrate that we can retrieve all occurrences corresponding to a read within a certain edit distance in a very efficient manner. Through a case study, we show the potential of exploiting the information embedded in the graph’s topology through visualization and sequence alignment. Conclusions We propose a memory-efficient representation of the pan-genome graph that supports subgraph visualization and lossless approximate pattern matching of reads against the graph using search schemes. The C++ source code of our software, called Nexus, is available at https://github.com/biointec/nexus under AGPL-3.0 license.
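For readers unfamiliar with the backward search that the abstract refers to, the following is a minimal sketch of exact pattern counting with a unidirectional FM-index over a plain string. It is not the bidirectional index, the graph representation, or the Nexus implementation; the function names, the naive suffix sorting, and the assumption that the text does not contain the sentinel `$` are all illustrative choices.

```python
def build_fm_index(text):
    """Build a toy FM-index: the C array and occurrence table of the BWT."""
    text += "$"  # sentinel, assumed absent from the input and smallest in order
    # Naive suffix array construction; fine for short illustrative inputs.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # counts[c] = number of characters in the text strictly smaller than c.
    counts, total = {}, 0
    for ch in alphabet:
        counts[ch] = total
        total += text.count(ch)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return counts, occ, len(bwt)


def count_occurrences(index, pattern):
    """Count exact occurrences of pattern by backward search."""
    counts, occ, n = index
    lo, hi = 0, n  # half-open suffix array interval [lo, hi)
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in counts:
            return 0
        lo = counts[ch] + occ[ch][lo]
        hi = counts[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("banana")
print(count_occurrences(index, "ana"))
```

Backward search can only extend a match to the left; a bidirectional index, as described in the abstract, adds extension to the right as well.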

Journal Article
