Catalogue Search | MBRL

Resolving repeat families with long reads

by Bongartz, Philipp in Algorithms , Assemblies , Bioinformatics

2019

Background: Draft-quality genomes for a multitude of organisms have become common due to the advancement of genome assemblers using long-read technologies with high error rates. Although current assemblies are substantially more contiguous than assemblies based on short reads, complete chromosomal assemblies are still challenging. Interspersed repeat families with multiple copy versions dominate the contig and scaffold ends of current long-read assemblies for complex genomes. These repeat families generally remain unresolved, as existing algorithmic solutions either do not scale to large copy numbers or cannot handle the current high read error rates. Results: We propose novel repeat resolution methods for large interspersed repeat families and assess their accuracy on simulated data sets with various distinct repeat structures and on Drosophila melanogaster transposons. Additionally, we compare our methods to an existing long-read repeat resolution tool and show the improved accuracy of our method. Conclusions: Our results demonstrate the applicability of our methods for the improvement of the contiguity of genome assemblies.

Journal Article

RResolver: efficient short-read repeat resolution within ABySS

by Wong, Johnathan , Coombe, Lauren , Birol, Inanç in Algorithms , Assembly , Bioinformatics

2022

Background: De novo genome assembly is essential to modern genomics studies. As it is not biased by a reference, it is also a useful method for studying genomes with high variation, such as cancer genomes. De novo short-read assemblers commonly use de Bruijn graphs, where nodes are sequences of equal length k, also known as k-mers. Edges in this graph are established between nodes that overlap by k - 1 bases, and nodes along unambiguous walks in the graph are subsequently merged. The selection of k is influenced by multiple factors, and optimizing this value results in a trade-off between graph connectivity and sequence contiguity. Ideally, multiple k sizes should be used, so lower values can provide good connectivity in less-covered regions and higher values can increase contiguity in well-covered regions. However, current approaches that use multiple k values do not address the scalability issues inherent to the assembly of large genomes. Results: Here we present RResolver, a scalable algorithm that takes a short-read de Bruijn graph assembly with a starting k as input and uses a k value closer to that of the read length to resolve repeats. RResolver builds a Bloom filter of sequencing reads which is used to evaluate the assembly graph path support at branching points and removes paths with insufficient support. RResolver runs efficiently, taking only 26 min on average for an ABySS human assembly with 48 threads and 60 GiB memory. Across all experiments, compared to a baseline assembly, RResolver improves scaffold contiguity (NGA50) by up to 15% and reduces misassemblies by up to 12%. Conclusions: RResolver adds a missing component to scalable de Bruijn graph genome assembly. By improving the initial and fundamental graph traversal outcome, all downstream ABySS algorithms greatly benefit by working with a more accurate and less complex representation of the genome. The RResolver code is integrated into ABySS and is available at https://github.com/bcgsc/abyss/tree/master/RResolver .
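
The de Bruijn graph construction described in this abstract can be illustrated with a minimal sketch. It is not RResolver or ABySS code: the function names `build_de_bruijn` and `branching_nodes` and the two toy reads are assumptions made for illustration only. The sketch builds a graph whose nodes are k-mers joined by edges when they overlap by k - 1 bases, then lists the branching points where an assembler would have to choose between paths.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each k-mer to the set of k-mers that follow it in some read.

    Consecutive k-mers in a read overlap by k - 1 bases, which is the
    edge condition described in the abstract.
    """
    successors = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k):
            left = read[i:i + k]
            right = read[i + 1:i + k + 1]
            successors[left].add(right)
    return successors


def branching_nodes(successors):
    """Return the k-mers with more than one outgoing edge, sorted."""
    return sorted(node for node, nxt in successors.items() if len(nxt) > 1)


# Hypothetical toy reads that share the short repeat "CGT".
reads = ["ACGTT", "GCGTA"]

for k in (3, 4):
    graph = build_de_bruijn(reads, k)
    print(k, branching_nodes(graph))
```

With these toy reads, k = 3 leaves the shared 3-mer as a branching node, while k = 4 spans the repeat and no node branches. This is the trade-off the abstract describes: a larger k resolves short repeats but needs enough coverage to keep the graph connected.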

Journal Article

SpLitteR: diploid genome assembly using TELL-Seq linked-reads and assembly graphs

by Pevzner, Pavel , Tolstoganov, Ivan , Korobeynikov, Anton in Animals , Assembly graph , Bar codes

2024

Recent advances in long-read sequencing technologies enabled accurate and contiguous assemblies of large genomes and metagenomes. However, even long and accurate high-fidelity (HiFi) reads do not resolve repeats that are longer than the read lengths. This limitation negatively affects the contiguity of diploid genome assemblies since two haplomes share many long identical regions. To generate the telomere-to-telomere assemblies of diploid genomes, biologists now construct their HiFi-based phased assemblies and use additional experimental technologies to transform them into more contiguous diploid assemblies. The barcoded linked-reads, generated using an inexpensive TELL-Seq technology, provide an attractive way to bridge unresolved repeats in phased assemblies of diploid genomes. We developed the SpLitteR tool for diploid genome assembly using linked-reads and assembly graphs and benchmarked it against state-of-the-art linked-read scaffolders ARKS and SLR-superscaffolder using human HG002 genome and sheep gut microbiome datasets. The benchmark showed that SpLitteR scaffolding results in 1.5-fold increase in NGA50 compared to the baseline LJA assembly and other scaffolders while introducing no additional misassemblies on the human dataset. We developed the SpLitteR tool for assembly graph phasing and scaffolding using barcoded linked-reads. We benchmarked SpLitteR on assembly graphs produced by various long-read assemblers and have demonstrated that TELL-Seq reads facilitate phasing and scaffolding in these graphs. This benchmarking demonstrates that SpLitteR improves upon the state-of-the-art linked-read scaffolders in the accuracy and contiguity metrics. SpLitteR is implemented in C++ as a part of the freely available SPAdes package and is available at https://github.com/ablab/spades/releases/tag/splitter-preprint.

Journal Article

DACCOR–Detection, characterization, and reconstruction of repetitive regions in bacterial genomes

by Nieselt, Kay , Seitz, Alexander , Hanssen, Friederike in Algorithms , Analysis , Bacterial genetics

2018

The reconstruction of genomes using mapping-based approaches with short reads experiences difficulties when resolving repetitive regions. These repetitive regions in genomes result in low mapping qualities of the respective reads, which in turn lead to many unresolved bases. Currently, the reconstruction of these regions is often based on modified references in which the repetitive regions are masked. However, for many references, such masked genomes are not available or are based on repetitive regions of other genomes. Our idea is to identify repetitive regions in the reference genome de novo. These regions can then be used to reconstruct them separately using short read sequencing data. Afterward, the reconstructed repetitive sequence can be inserted into the reconstructed genome. We present the program detection, characterization, and reconstruction of repetitive regions, which performs these steps automatically. Our results show an increased base pair resolution of the repetitive regions in the reconstruction of Treponema pallidum samples, resulting in fewer unresolved bases.

Journal Article

A CPC-shelterin-BTR axis regulates mitotic telomere deprotection

by Lamm, Noa , Hayashi, Makoto T. , Romero-Zamora, Diana in 13/1 , 13/106 , 13/89

2025

Telomeres prevent ATM activation by sequestering chromosome termini within telomere loops (t-loops). Mitotic arrest promotes telomere linearity and a localized ATM-dependent telomere DNA damage response (DDR) through an unknown mechanism. Using unbiased interactomics, biochemical screening, molecular biology, and super-resolution imaging, we found that mitotic arrest-dependent (MAD) telomere deprotection requires the combined activities of the Chromosome passenger complex (CPC) on shelterin, and the BLM-TOP3A-RMI1/2 (BTR) complex on t-loops. During mitotic arrest, the CPC component Aurora Kinase B (AURKB) phosphorylated both the TRF1 hinge and TRF2 basic domains. Phosphorylation of the TRF1 hinge domain enhances CPC and TRF1 interaction through the CPC Survivin subunit. Meanwhile, phosphorylation of the TRF2 basic domain promotes telomere linearity, activates a telomere DDR dependent on BTR-mediated double Holliday junction dissolution, and leads to mitotic death. We identify that the TRF2 basic domain functions in mitosis-specific telomere protection and reveal a regulatory role for TRF1 in controlling a physiological ATM-dependent telomere DDR. The data demonstrate that MAD telomere deprotection is a sophisticated active mechanism that exposes telomere ends to signal mitotic stress. Here the authors reveal how telomeres signal mitotic stress. A key protein network alters their structure exposing telomere ends to signal mitotic stress, ultimately triggering a controlled DNA damage response to remove faulty cells.

Journal Article

Local enrichment of HP1alpha at telomeres alters their structure and regulation of telomere protection

by Chow, Tracy T. , Wei, Jen-Hsuan , Huang, Bo in 13/1 , 13/106 , 13/109

2018

Enhanced telomere maintenance is evident in malignant cancers. While telomeres are thought to be inherently heterochromatic, detailed mechanisms of how epigenetic modifications impact telomere protection and structures are largely unknown in human cancers. Here we develop a molecular tethering approach to experimentally enrich heterochromatin protein HP1α specifically at telomeres. This results in increased deposition of H3K9me3 at cancer cell telomeres. Telomere extension by telomerase is attenuated, and damage-induced foci at telomeres are reduced, indicating augmentation of telomere stability. Super-resolution STORM imaging shows an unexpected increase in irregularity of telomeric structure. Telomere-tethered chromo shadow domain (CSD) mutant I165A of HP1α abrogates both the inhibition of telomere extension and the irregularity of telomeric structure, suggesting the involvement of at least one HP1α-ligand in mediating these effects. This work presents an approach to specifically manipulate the epigenetic status locally at telomeres to uncover insights into molecular mechanisms underlying telomere structural dynamics. Chromatin dynamics is thought to play an important role in the maintenance of telomeres, yet how has remained poorly understood. Here the authors locally enrich heterochromatin protein 1α (HP1α) at human telomeres to provide insights into the crosstalk between epigenetic regulations and structural dynamics at the telomeres.

Journal Article

Islands of retroelements are major components of Drosophila centromeres

by Erceg, Jelena , Chavan, Ankita , Chang, Ching-Ho in Animals , Assembly , Bioinformatics

2019

Centromeres are essential chromosomal regions that mediate kinetochore assembly and spindle attachments during cell division. Despite their functional conservation, centromeres are among the most rapidly evolving genomic regions and can shape karyotype evolution and speciation across taxa. Although significant progress has been made in identifying centromere-associated proteins, the highly repetitive centromeres of metazoans have been refractory to DNA sequencing and assembly, leaving large gaps in our understanding of their functional organization and evolution. Here, we identify the sequence composition and organization of the centromeres of Drosophila melanogaster by combining long-read sequencing, chromatin immunoprecipitation for the centromeric histone CENP-A, and high-resolution chromatin fiber imaging. Contrary to previous models that heralded satellite repeats as the major functional components, we demonstrate that functional centromeres form on islands of complex DNA sequences enriched in retroelements that are flanked by large arrays of satellite repeats. Each centromere displays distinct size and arrangement of its DNA elements but is similar in composition overall. We discover that a specific retroelement, G2/Jockey-3, is the most highly enriched sequence in CENP-A chromatin and is the only element shared among all centromeres. G2/Jockey-3 is also associated with CENP-A in the sister species D. simulans, revealing an unexpected conservation despite the reported turnover of centromeric satellite DNA. Our work reveals the DNA sequence identity of the active centromeres of a premier model organism and implicates retroelements as conserved features of centromeric DNA.

Journal Article

Longitudinal MRI and 1H-MRS study of SCA7 mouse forebrain reveals progressive multiregional atrophy and early brain metabolite changes indicating early neuronal and glial dysfunction

by Trottier, Yvon , Brouillet, Emmanuel , Pérot, Jean-Baptiste in Anterior commissure , Ataxia , Atrophy

2024

SpinoCerebellar Ataxia type 7 (SCA7) is an inherited disorder caused by CAG triplet repeats encoding polyglutamine expansion in the ATXN7 protein, which is part of the transcriptional coactivator complex SAGA. The mutation primarily causes neurodegeneration in the cerebellum and retina, as well as several forebrain structures. The SCA7 140Q/5Q knock-in mouse model recapitulates key disease features, including loss of vision and motor performance. To characterize the temporal progression of brain degeneration of this model, we performed a longitudinal study spanning from early to late symptomatic stages using high-resolution magnetic resonance imaging (MRI) and in vivo 1H-magnetic resonance spectroscopy (1H-MRS). Compared to wild-type mouse littermates, MRI analysis of SCA7 mice shows progressive atrophy of defined brain structures, with the striatum, thalamus and cortex being the first and most severely affected. The volume loss of these structures coincided with increased motor impairments in SCA7 mice, suggesting an alteration of the sensory-motor network, as observed in SCA7 patients. MRI also reveals atrophy of the hippocampus and anterior commissure at mid-symptomatic stage and the midbrain and brain stem at late stage. 1H-MRS of hippocampus, a brain region previously shown to be dysfunctional in patients, reveals early and progressive metabolic alterations in SCA7 mice. Interestingly, abnormal glutamine accumulation precedes the hippocampal atrophy and the reduction in myo-inositol and total N-acetyl-aspartate concentrations, two markers of glial and neuronal damage, respectively. Together, our results indicate that non-cerebellar alterations and glial and neuronal metabolic impairments may play a crucial role in the development of SCA7 mouse pathology, particularly at early stages of the disease. Degenerative features of forebrain structures in SCA7 mice correspond to current observations made in patients. Our study thus provides potential biomarkers that could be used for the evaluation of future therapeutic trials using the SCA7 140Q/5Q model.

Journal Article

Anti-CRISPR-mediated control of gene editing and synthetic circuits in eukaryotic cells

by Xu, Xiaoshu , Chavez, Michael , Carter, Matthew A. in 13/106 , 13/31 , 14/63

2019

Repurposed CRISPR-Cas molecules provide a useful tool set for broad applications of genomic editing and regulation of gene expression in prokaryotes and eukaryotes. Recent discovery of phage-derived proteins, anti-CRISPRs, which serve to abrogate natural CRISPR anti-phage activity, potentially expands the ability to build synthetic CRISPR-mediated circuits. Here, we characterize a panel of anti-CRISPR molecules for expanded applications to counteract CRISPR-mediated gene activation and repression of reporter and endogenous genes in various cell types. We demonstrate that cells pre-engineered with anti-CRISPR molecules become resistant to gene editing, thus providing a means to generate “write-protected” cells that prevent future gene editing. We further show that anti-CRISPRs can be used to control CRISPR-based gene regulation circuits, including implementation of a pulse generator circuit in mammalian cells. Our work suggests that anti-CRISPR proteins should serve as widely applicable tools for synthetic systems regulating the behavior of eukaryotic cells. Anti-CRISPR proteins derived from phage can abrogate CRISPR activity. The authors repurpose these molecules for demonstrating genomic write-protection and pre-programmed gene expression circuits.

Journal Article

Transcriptome sequencing, de novo assembly, characterisation of wild accession of blackgram (Vigna mungo var. silvestris) as a rich resource for development of molecular markers and validation of SNPs by high resolution melting (HRM) analysis

by Souframanien, J. , Raizada, Avi in Agriculture , Asia , Biomedical and Life Sciences

2019

Background: Blackgram [Vigna mungo (L.) Hepper] is an important legume crop of Asia with limited genomic resources. We report a comprehensive set of genic simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers using Illumina MiSeq sequencing of transcriptome and its application in genetic variation analysis and mapping. Results: Transcriptome sequencing of immature seeds of wild blackgram, V. mungo var. silvestris, by Illumina MiSeq technology generated 1.9 × 10^7 reads, which were assembled into 40,178 transcripts (TCS) with an average length of 446 bp covering 2.97 GB of the genome. A total of 38,753 CDS (coding sequences) were predicted from 40,178 TCS, and 28,984 CDS were annotated through BLASTX and mapped to GO and KEGG databases, resulting in 140 unique pathways. The tri-nucleotides were most abundant (39.9%), followed by di-nucleotides (30.2%). About 60.3% and 37.6% of SSR motifs were present in the coding sequences (CDS) and untranslated regions (UTRs), respectively. Among SNPs, the most abundant substitution type was transitions (Ts) (61%), followed by transversions (Tv) (39%), with a Ts/Tv ratio of 1.58. A total of 2306 DEGs were identified by RNA-Seq between wild and cultivar, and validation was done by quantitative reverse transcription polymerase chain reaction. In this study, we genotyped SNPs with a validation rate of 78.87% by High Resolution Melting (HRM) assay. Conclusion: In the present study, 1621 genic-SSR and 1844 SNP markers were developed from immature seed transcriptome sequence of blackgram, and 31 genic-SSR markers were used to study genetic variations among different blackgram accessions. The markers developed above contribute towards enriching available genomic resources for blackgram and aid in breeding programmes.

Journal Article
