123 result(s) for "direct RNA sequencing"
LAFITE Reveals the Complexity of Transcript Isoforms in Subcellular Fractions
Characterization of the subcellular distribution of RNA is essential for understanding the molecular basis of biological processes. Here, the subcellular nanopore direct RNA‐sequencing (DRS) of four lung cancer cell lines (A549, H1975, H358, and HCC4006) is performed, coupled with a computational pipeline, Low‐abundance Aware Full‐length Isoform clusTEr (LAFITE), to comprehensively analyze the full‐length cytoplasmic and nuclear transcriptome. Using additional DRS and orthogonal data sets, it is shown that LAFITE outperforms current methods for detecting full‐length transcripts, particularly for low‐abundance isoforms that are usually overlooked due to poor read coverage. Experimental validation of six novel isoforms exclusively identified by LAFITE further confirms the reliability of this pipeline. By applying LAFITE to subcellular DRS data, the complexity of the nuclear transcriptome is revealed in terms of isoform diversity, 3'‐UTR usage, m6A modification patterns, and intron retention. Overall, LAFITE provides enhanced full‐length isoform identification and enables a high‐resolution view of the RNA landscape at the isoform level. By performing subcellular nanopore direct RNA sequencing coupled with a full‐length isoform detection pipeline, LAFITE, the authors highlight the divergence between the cytoplasmic and nuclear transcriptomes in terms of isoform diversity, 3'‐UTR usage, m6A modification patterns, and alternative splicing.
Sequencing accuracy and systematic errors of nanopore direct RNA sequencing
Background: Direct RNA sequencing (dRNA-seq) on the Oxford Nanopore Technologies (ONT) platforms can produce reads covering up to full-length gene transcripts, while containing decipherable information about RNA base modifications and poly-A tail lengths. Although many published studies have been expanding the potential of dRNA-seq, its sequencing accuracy and error patterns remain understudied. Results: We present the first comprehensive evaluation of sequencing accuracy and characterisation of systematic errors in dRNA-seq data from diverse organisms and synthetic in vitro transcribed RNAs. We found that for sequencing kits SQK-RNA001 and SQK-RNA002, the median read accuracy ranged from 87% to 92% across species, and deletions significantly outnumbered mismatches and insertions. Due to their high abundance in the transcriptome, heteropolymers and short homopolymers were the major contributors to the overall sequencing errors. We also observed systematic biases across all species at the levels of single nucleotides and motifs. In general, cytosine/uracil-rich regions were more likely to be erroneous than guanines and adenines. By examining raw signal data, we identified the underlying signal-level features potentially associated with the error patterns and their dependency on sequence contexts. While read quality scores can be used to approximate error rates at base and read levels, failure to detect DNA adapters may be a source of errors and data loss. By comparing distinct basecallers, we reason that some sequencing errors are attributable to signal insufficiency rather than algorithmic (basecalling) artefacts. Lastly, we generated dRNA-seq data using the latest SQK-RNA004 sequencing kit released at the end of 2023 and found that although the overall read accuracy increased, the systematic errors remain largely identical compared to the previous kits. Conclusions: As the first systematic investigation of dRNA-seq errors, this study offers a comprehensive overview of reproducible error patterns across diverse datasets, identifies potential signal-level insufficiency, and lays the foundation for error correction methods.
Third-Generation Sequencing: The Spearhead towards the Radical Transformation of Modern Genomics
Although next-generation sequencing (NGS) technology revolutionized sequencing, offering a tremendous sequencing capacity with groundbreaking depth and accuracy, it continues to demonstrate serious limitations. In the early 2010s, the introduction of a novel set of sequencing methodologies, presented by two platforms, Pacific Biosciences (PacBio) and Oxford Nanopore Sequencing (ONT), gave birth to third-generation sequencing (TGS). The innovative long-read technologies turn genome sequencing into an ease-of-handle procedure by greatly reducing the average time of library construction workflows and simplifying the process of de novo genome assembly due to the generation of long reads. Long sequencing reads produced by both TGS methodologies have already facilitated the decipherment of transcriptional profiling since they enable the identification of full-length transcripts without the need for assembly or the use of sophisticated bioinformatics tools. Long-read technologies have also provided new insights into the field of epitranscriptomics, by allowing the direct detection of RNA modifications on native RNA molecules. This review highlights the advantageous features of the newly introduced TGS technologies, discusses their limitations and provides an in-depth comparison regarding their scientific background and available protocols as well as their potential utility in research and clinical applications.
DeepEdit: single-molecule detection and phasing of A-to-I RNA editing events using nanopore direct RNA sequencing
Single-molecule detection and phasing of A-to-I RNA editing events remain an unresolved problem. Long-read and PCR-free nanopore native RNA sequencing offers a great opportunity for direct RNA editing detection. Here, we develop a neural network model, DeepEdit, that not only recognizes A-to-I editing events in single reads of Oxford Nanopore direct RNA sequencing, but also resolves the phasing of RNA editing events on transcripts. We illustrate the robustness of DeepEdit by applying it to Schizosaccharomyces pombe and Homo sapiens transcriptome data. We anticipate DeepEdit to be a powerful tool for the study of RNA editing from a new perspective.
Poly(A) selection introduces bias and undue noise in direct RNA-sequencing
Background: Genome-wide RNA-sequencing technologies are increasingly critical to a wide variety of diagnostic and research applications. RNA-seq users often first enrich for mRNA, with the most popular enrichment method being poly(A) selection. In many applications it is well-known that poly(A) selection biases the view of the transcriptome by selecting for longer tailed mRNA species. Results: Here, we show that poly(A) selection biases Oxford Nanopore direct RNA sequencing. As expected, poly(A) selection skews sequenced mRNAs toward longer poly(A) tail lengths. Interestingly, we identify a population of mRNAs (> 10% of genes’ mRNAs) that are inconsistently captured by poly(A) selection due to highly variable poly(A) tails, and demonstrate this phenomenon in our hands and in published data. Importantly, we show poly(A) selection is dispensable for Oxford Nanopore’s direct RNA-seq technique, and demonstrate successful library construction without poly(A) selection, with decreased input, and without loss of quality. Conclusions: Our work expands the utility of direct RNA-seq by validating the use of total RNA as input, and demonstrates important technical artifacts from poly(A) selection that inconsistently skew mRNA expression and poly(A) tail length measurements.
Novel insights into the Leishmania infantum transcriptome diversity of protein-coding and non-coding sequences in both stages of parasite development using nanopore direct RNA sequencing
Background: Leishmania relies on posttranscriptional control to regulate gene expression. Protein-coding genes are synthesised as polycistronic precursors that are processed into individual mRNAs by trans-splicing adding the spliced leader (SL) RNA to the 5’-end and 3’ cleavage-polyadenylation. Here, we employ Nanopore direct RNA sequencing (DRS) combined with Illumina RNA-Seq to comprehensively interrogate the transcriptomes of Leishmania infantum developmental stages at single-molecule resolution. Results: Analysis of DRS full-length reads of poly(A)+-enriched RNA from L. infantum developmental stages enabled us to precisely determine the primary SL and poly(A) sites for 52% of the protein-coding transcripts and to accurately define their 5’- and 3’-end and the length of UTRs. In addition, our analysis confirmed the motifs ‘[C/A/T] A|G’ being associated with 94.8% of the SL cleavage sites and better defined the genomic context for cleavage and polyadenylation. Overall, we observed more diversity for poly(A) than SL sites per transcript. The frequency of the primary SL and poly(A) sites was 64.2% and 24%, respectively, with most transcripts having additional poly(A) sites nearby. Alternative polyadenylation was detected in 11-13% of transcripts with ~20% of these having different primary poly(A) sites between promastigote and amastigote developmental stages. Furthermore, DRS uncovered multiple processing events occurring mostly within 3’UTRs, leading to the formation of long non-coding RNAs (lncRNAs). The L. infantum transcriptome expresses a rich repertoire of 1,825 lncRNAs, of which 98% were not previously annotated in L. infantum and only 21.5% were found in L. major. These lncRNAs exhibit generally distinct expression patterns from the 3’UTRs from which they are derived and several are developmentally regulated, representing ~27% of the L. infantum stage-regulated transcriptome. Their expression was generally higher in amastigotes than in promastigotes, highlighting their importance in parasite intracellular development. Protein prediction tools combined with mass spectrometry revealed that 7.6% of these lncRNAs have a limited protein-coding potential. Conclusions: This is the first comprehensive transcriptomic analysis of L. infantum developmental stages using single-molecule Nanopore DRS. Our findings advance knowledge on existing Leishmania expression datasets and provide new insights into the transcriptome complexity and dynamics of both protein-coding and non-coding sequences throughout the parasite development.
Utilizing Nanopore direct RNA sequencing of blood from patients with sepsis for discovery of co- and post-transcriptional disease biomarkers
Background: RNA sequencing of whole blood has been increasingly employed to find transcriptomic signatures of disease states. These studies traditionally utilize short-read sequencing of cDNA, missing important aspects of RNA expression such as differential isoform abundance and poly(A) tail length variation. Methods: We used Oxford Nanopore Technologies sequencing to sequence native mRNA extracted from whole blood from 12 patients with definite bacterial and viral sepsis and compared with results from matching Illumina short-read cDNA sequencing data. Additionally, we explored poly(A) tail length variation, novel transcript identification, and differential transcript usage. Results: The correlation of gene count data between Illumina cDNA- and Nanopore RNA-sequencing strongly depended on the choice of analysis pipeline; NanoCount for Nanopore and Kallisto for Illumina data yielded the highest mean Pearson’s correlation of 0.927 at the gene level and 0.736 at the transcript isoform level. We identified 2 genes with differential polyadenylation, 9 genes with differential expression and 4 genes with differential transcript usage between bacterial and viral infection. Gene ontology gene set enrichment analysis of poly(A) tail length revealed enrichment of long tails in mRNA of genes involved in signaling and short tails in oxidoreductase molecular functions. Additionally, we detected 240 non-artifactual novel transcript isoforms. Conclusions: Nanopore RNA- and Illumina cDNA-gene counts are strongly correlated, indicating that both platforms are suitable for discovery and validation of gene count biomarkers. Nanopore direct RNA-seq provides additional advantages by uncovering additional post- and co-transcriptional biomarkers, such as poly(A) tail length variation and transcript isoform usage.
DEMINERS enables clinical metagenomics and comparative transcriptomic analysis by increasing throughput and accuracy of nanopore direct RNA sequencing
Nanopore direct RNA sequencing (DRS) is a powerful tool for RNA biology but suffers from low basecalling accuracy, low throughput, and high input requirements. We present DEMINERS, a novel DRS toolkit combining an RNA multiplexing workflow, a Random Forest-based barcode classifier, and an optimized convolutional neural network basecaller with species-specific training. DEMINERS enables accurate demultiplexing of up to 24 samples, reducing RNA input and runtime. Applications include clinical metagenomics, cancer transcriptomics, and parallel transcriptomic comparisons, uncovering microbial diversity in COVID-19 and m6A’s role in malaria and glioma. DEMINERS offers a robust, high-throughput solution for precise transcript and RNA modification analysis.
Template-switching artifacts resemble alternative polyadenylation
Background: Alternative polyadenylation is commonly examined using cDNA sequencing, which is known to be affected by template-switching artifacts. However, the effects of such template-switching artifacts on alternative polyadenylation are generally disregarded, while alternative polyadenylation artifacts are attributed to internal priming. Results: Here, we analyzed both long-read cDNA sequencing and direct RNA sequencing data of two organisms, generated by different sequencing platforms. We developed a filtering algorithm which takes into consideration that template-switching can be a source of artifactual polyadenylation when filtering out spurious polyadenylation sites. The algorithm outperformed the conventional internal priming filters based on comparison to direct RNA sequencing data. We also showed that the polyadenylation artifacts arise in cDNA sequencing at consecutive stretches of as few as three adenines. There was no substantial difference between the lengths of poly(A) tails at the artifactual and the true transcriptional end sites even though it is expected that internal priming artifacts have shorter poly(A) tails than genuine polyadenylated reads. Conclusions: Our findings suggest that template switching plays an important role in the generation of spurious polyadenylation and support the need for more rigorous filtering of artifactual polyadenylation sites in cDNA data, or that alternative polyadenylation should be annotated using native RNA sequencing.
Pervasive generation of non-canonical subgenomic RNAs by SARS-CoV-2
Background: SARS-CoV-2, a positive-sense RNA virus in the family Coronaviridae, has caused a worldwide pandemic of coronavirus disease 2019 or COVID-19. Coronaviruses generate a tiered series of subgenomic RNAs (sgRNAs) through a process involving homology between transcriptional regulatory sequences (TRS) located after the leader sequence in the 5′ UTR (the TRS-L) and TRS located near the start of ORFs encoding structural and accessory proteins (TRS-B) near the 3′ end of the genome. In addition to the canonical sgRNAs generated by SARS-CoV-2, non-canonical sgRNAs (nc-sgRNAs) have been reported. However, the consistency of these nc-sgRNAs across viral isolates and infection conditions is unknown. The comprehensive definition of SARS-CoV-2 RNA products is a key step in understanding SARS-CoV-2 pathogenesis. Methods: Here, we report an integrative analysis of eight independent SARS-CoV-2 transcriptomes generated using three sequencing strategies, five host systems, and seven viral isolates. Read-mapping to the SARS-CoV-2 genome was used to determine the 5′ and 3′ coordinates of all junctions in viral RNAs identified in these samples. Results: Using junctional abundances, we show nc-sgRNAs make up as much as 33% of total sgRNAs in cell culture models of infection, are largely consistent in abundance across independent transcriptomes, and increase in abundance over time during infection. By assessing the homology between sequences flanking the 5′ and 3′ junction points, we show that nc-sgRNAs are not associated with TRS-like homology. By incorporating read coverage information, we find strong evidence for subgenomic RNAs that contain only 5′ regions of ORF1a. Finally, we show that non-canonical junctions change the landscape of viral open reading frames. Conclusions: We identify canonical and non-canonical junctions in SARS-CoV-2 sgRNAs and show that these RNA products are consistently generated by many independent viral isolates and sequencing approaches. These analyses highlight the diverse transcriptional activity of SARS-CoV-2 and offer important insights into SARS-CoV-2 biology.