341 result(s) for "Splice junctions"
A high-resolution single-molecule sequencing-based Arabidopsis transcriptome using novel methods of Iso-seq analysis
Background Accurate and comprehensive annotation of transcript sequences is essential for transcript quantification and differential gene and transcript expression analysis. Single-molecule long-read sequencing technologies provide improved integrity of transcript structures including alternative splicing, and transcription start and polyadenylation sites. However, accuracy is significantly affected by sequencing errors, mRNA degradation, or incomplete cDNA synthesis. Results We present a new and comprehensive Arabidopsis thaliana Reference Transcript Dataset 3 (AtRTD3). AtRTD3 contains over 169,000 transcripts—twice that of the best current Arabidopsis transcriptome and including over 1500 novel genes. Seventy-eight percent of transcripts are from Iso-seq with accurately defined splice junctions and transcription start and end sites. We develop novel methods to determine splice junctions and transcription start and end sites accurately. Mismatch profiles around splice junctions provide a powerful feature to distinguish correct splice junctions and remove false splice junctions. Stratified approaches identify high-confidence transcription start and end sites and remove fragmentary transcripts due to degradation. AtRTD3 is a major improvement over existing transcriptomes as demonstrated by analysis of an Arabidopsis cold response RNA-seq time-series. AtRTD3 provides higher resolution of transcript expression profiling and identifies cold-induced differential transcription start and polyadenylation site usage. Conclusions AtRTD3 is the most comprehensive Arabidopsis transcriptome currently. It improves the precision of differential gene and transcript expression, differential alternative splicing, and transcription start/end site usage analysis from RNA-seq data. The novel methods for identifying accurate splice junctions and transcription start/end sites are widely applicable and will improve single-molecule sequencing analysis from any species.
Synonymous and non-synonymous variants at splice junctions can disrupt splicing and are frequently linked to disease associated loss of function genes
Background RNA splicing, facilitated by the spliceosome complex, relies on specific motifs at exon-intron junctions of pre-mRNAs to generate mature mRNAs. Mutations in splice junctions can disrupt splicing, potentially leading to premature protein truncation. Nucleotides within the exonic component of the junction are also essential for splicing. Evaluation of silent and missense variants in the exonic splice junction on RNA splicing is essential to investigate the significance of such variants in disease pathogenesis. Methods We analyzed cancer-associated silent and missense variants reported in the COSMIC database that are located within three nucleotides of splice donor and acceptor sites. We examined the prevalence of these variants in genes for which loss of function is a known mechanism of disease development. We also studied the performance of splicing impact prediction tools and evaluated their clinical relevance as well as their alignment with experimentally validated splicing outcomes. Results Nucleotide composition analysis revealed a high preference for the nucleotide G at the donor 1 (d1) and acceptor (a1) positions, 87% and 69%, respectively. We observed a high prevalence of G > A and G > T variants at d1 and a1 positions. Interestingly, 66% to 86% of the identified variants at these positions are missense mutations, with G > T variants being specific for this type of mutation. Evolutionary conservation analysis indicates high nucleotide conservation for these positions at donor and acceptor sites, highlighting their importance at the nucleotide level. The frequently occurring variants are associated with tumor suppressor genes, and 58 of the top 100 genes have LOEUF scores below 1, indicating low tolerance to protein truncation. In contrast, such genes are rarely observed among population variants. Conclusions Our data-driven computational study emphasizes the importance of evaluating silent and missense variants at splice junctions to understand their impact on RNA splicing. These variants may have a neutral effect on protein function. However, evaluating their effect at the RNA level is essential to understanding the significance of these variants in disease pathogenesis. This is particularly important for genes in which loss of function is the mechanism of disease development.
Discerning novel splice junctions derived from RNA-seq alignment: a deep learning approach
Background Exon splicing is a regulated cellular process in the transcription of protein-coding genes. Technological advancements and cost reductions in RNA sequencing have made quantitative and qualitative assessments of the transcriptome both possible and widely available. RNA-seq provides unprecedented resolution to identify gene structures and resolve the diversity of splicing variants. However, currently available ab initio aligners are vulnerable to spurious alignments due to random sequence matches and sample-reference genome discordance. As a consequence, a significant set of false positive exon junction predictions would be introduced, which will further confuse downstream analyses of splice variant discovery and abundance estimation. Results In this work, we present a deep learning based splice junction sequence classifier, named DeepSplice, which employs convolutional neural networks to classify candidate splice junctions. We show (I) DeepSplice outperforms state-of-the-art methods for splice site classification when applied to the popular benchmark dataset HS3D, (II) DeepSplice shows high accuracy for splice junction classification with GENCODE annotation, and (III) the application of DeepSplice to classify putative splice junctions generated by Rail-RNA alignment of 21,504 human RNA-seq data significantly reduces 43 million candidates into around 3 million highly confident novel splice junctions. Conclusions A model inferred from the sequences of annotated exon junctions that can then classify splice junctions derived from primary RNA-seq data has been implemented. The performance of the model was evaluated and compared through comprehensive benchmarking and testing, indicating a reliable performance and gross usability for classifying novel splice junctions derived from RNA-seq alignment.
Identification of Novel Circular RNAs of the Human Protein Arginine Methyltransferase 1 (PRMT1) Gene, Expressed in Breast Cancer Cells
Circular RNAs (circRNAs) constitute a type of RNA formed through back-splicing. In breast cancer, circRNAs are implicated in tumor onset and progression. Although histone methylation by PRMT1 is largely involved in breast cancer development and metastasis, the effect of circular transcripts deriving from this gene has not been examined. In this study, total RNA was extracted from four breast cancer cell lines and reversely transcribed using random hexamer primers. Next, first- and second-round PCRs were performed using gene-specific divergent primers. Sanger sequencing followed for the determination of the sequence of each novel PRMT1 circRNA. Lastly, bioinformatics analysis was conducted to predict the functions of the novel circRNAs. In total, nine novel circRNAs were identified, comprising both complete and truncated exons of the PRMT1 gene. Interestingly, we demonstrated that the back-splice junctions consist of novel splice sites of the PRMT1 exons. Moreover, the circRNA expression pattern differed among these four breast cancer cell lines. All the novel circRNAs are predicted to act as miRNA and/or protein sponges, while five circRNAs also possess an open reading frame. In summary, we described the complete sequence of nine novel circRNAs of the PRMT1 gene, comprising distinct back-splice junctions and probably having different molecular properties.
The Long Read Transcriptome of Rice (Oryza sativa ssp. japonica var. Nipponbare) Reveals Novel Transcripts
Background High-throughput next-generation sequencing technologies offer a powerful approach to characterizing the transcriptomes of plants. Long read sequencing has been shown to support the discovery of novel isoforms of transcripts. This approach enables the generation of full-length sequences revealing splice variants that may be important in regulating gene action. Investigation of the diversity of transcripts in the rice transcriptome including splice variants was conducted using PacBio long-read sequence data to improve the annotation of the rice genome. Results A cDNA library was prepared from RNA extracted from leaves, roots, seeds, inflorescences, and panicles of O. sativa ssp. japonica var Nipponbare and sequenced on a PacBio Sequel platform. This produced 346,190 non-redundant full-length non-chimeric reads (FLNC) resulting in 33,504 high-quality transcripts. Half of the transcripts were multi-exonic and entirely matched with the reference transcripts. However, 14,874 novel isoforms were also identified resulting predominantly from intron retention and at least one novel splice site. Intron retention was the prevalent alternative splicing event and exon skipping was the least observed. Of 73,659 splice junctions, 12,755 (17%) represented novel splice junctions with canonical and non-canonical intron boundaries. The complexity of the transcriptome was examined in detail for 19 starch synthesis-related genes, defining 276 spliced isoforms of which 94 splice variants were novel. Conclusion The data reveal the great complexity of the rice transcriptome. The novel transcripts provide new insights that may be a key input in future research to improve the annotation of the rice genome.
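The distinction between canonical and non-canonical intron boundaries in the rice abstract above can be illustrated with a minimal Python sketch. It assumes the common convention that a canonical intron begins with GT and ends with AG on the sense strand; the abstract does not state which definition the authors used, and the function names `classify_intron` and `novel_fraction` are hypothetical, chosen only for illustration.

```python
def classify_intron(intron_seq: str) -> str:
    """Label an intron by its terminal dinucleotides (sense strand).

    Assumption: only GT...AG counts as canonical; other definitions
    (for example, also accepting GC...AG) are not modelled here.
    """
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to hold donor and acceptor dinucleotides")
    if seq.startswith("GT") and seq.endswith("AG"):
        return "canonical"
    return "non-canonical"


def novel_fraction(novel: int, total: int) -> float:
    """Fraction of splice junctions that are novel."""
    return novel / total


# Counts quoted in the abstract: 12,755 novel out of 73,659 junctions.
print(round(novel_fraction(12755, 73659), 2))
```

With the counts quoted in the abstract, the fraction rounds to 0.17, which agrees with the reported 17%.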
The U1 spliceosomal RNA is recurrently mutated in multiple cancers
Cancers are caused by genomic alterations known as drivers. Hundreds of drivers in coding genes are known but, to date, only a handful of noncoding drivers have been discovered, despite intensive searching [1, 2]. Attention has recently shifted to the role of altered RNA splicing in cancer; driver mutations that lead to transcriptome-wide aberrant splicing have been identified in multiple types of cancer, although these mutations have only been found in protein-coding splicing factors such as splicing factor 3b subunit 1 (SF3B1) [3–6]. By contrast, cancer-related alterations in the noncoding component of the spliceosome, a series of small nuclear RNAs (snRNAs), have barely been studied, owing to the combined challenges of characterizing noncoding cancer drivers and the repetitive nature of snRNA genes [1, 7, 8]. Here we report a highly recurrent A>C somatic mutation at the third base of U1 snRNA in several types of tumour. The primary function of U1 snRNA is to recognize the 5′ splice site via base-pairing. This mutation changes the preferential A–U base-pairing between U1 snRNA and the 5′ splice site to C–G base-pairing, and thus creates novel splice junctions and alters the splicing pattern of multiple genes, including known drivers of cancer. Clinically, the A>C mutation is associated with heavy alcohol use in patients with hepatocellular carcinoma, and with the aggressive subtype of chronic lymphocytic leukaemia with unmutated immunoglobulin heavy-chain variable regions. The mutation in U1 snRNA also independently confers an adverse prognosis to patients with chronic lymphocytic leukaemia. Our study demonstrates a noncoding driver in spliceosomal RNAs, reveals a mechanism of aberrant splicing in cancer and may represent a new target for treatment. Our findings also suggest that driver discovery should be extended to a wider range of genomic regions.
A highly recurrent A>C somatic mutation in U1 small nuclear RNA, which alters the splicing pattern of genes that include known drivers of cancer, is identified in several types of tumour.
In Silico Identification and Characterization of circRNAs as Potential Virulence-Related miRNA/siRNA Sponges from Entamoeba histolytica and Encystment-Related circRNAs from Entamoeba invadens
Ubiquitous eukaryotic non-coding circular RNAs regulate transcription and translation. We have reported full-length intronic circular RNAs (flicRNAs) in Entamoeba histolytica with esterified 3′ss and 5′ss. Their 5′ss GU-rich elements are essential for their biogenesis and their suggested role in transcription regulation. Here, we explored whether exonic, exonic-intronic, and intergenic circular RNAs are also part of the E. histolytica and E. invadens ncRNA RNAome and investigated their possible functions. Available RNA-Seq libraries were analyzed with the CIRI-full software in search of circular exonic RNAs (circRNAs). The robustness of the analyses was validated using synthetic decoy sequences with bona fide back splice junctions. Differentially expressed (DE) circRNAs, between the virulent HM1:IMSS and the nonvirulent Rahman E. histolytica strains, were identified, and their miRNA sponging potential was analyzed using the intaRNA software. Respectively, 188 and 605 reverse overlapped circRNAs from E. invadens and E. histolytica were identified. The sequence composition of the circRNAs was mostly exonic although different to human circRNAs in other attributes. 416 circRNAs from E. histolytica were virulent-specific and 267 were nonvirulent-specific. Out of the common circRNAs, 32 were DE between strains. Finally, we predicted that 8 of the DE circRNAs could function as sponges of the bioinformatically reported miRNAs in E. histolytica, whose functions are still unknown. Our results extend the E. histolytica RNAome and allow us to devise a hypothesis to test circRNAs/miRNAs/siRNAs interactions in determining the virulent/nonvirulent phenotypes and to explore other regulatory mechanisms during amoebic encystment.
Integrated analysis of genomic and transcriptomic data for the discovery of splice-associated variants in cancer
Somatic mutations within non-coding regions and even exons may have unidentified regulatory consequences that are often overlooked in analysis workflows. Here we present RegTools (www.regtools.org), a computationally efficient, free, and open-source software package designed to integrate somatic variants from genomic data with splice junctions from bulk or single cell transcriptomic data to identify variants that may cause aberrant splicing. We apply RegTools to over 9000 tumor samples with both tumor DNA and RNA sequence data. RegTools discovers 235,778 events where a splice-associated variant significantly increases the splicing of a particular junction, across 158,200 unique variants and 131,212 unique junctions. To characterize these somatic variants and their associated splice isoforms, we annotate them with the Variant Effect Predictor, SpliceAI, and Genotype-Tissue Expression junction counts and compare our results to other tools that integrate genomic and transcriptomic data. While many events are corroborated by the aforementioned tools, the flexibility of RegTools also allows us to identify splice-associated variants in known cancer drivers, such as TP53, CDKN2A, and B2M, and other genes. Analysing the regulatory consequences of mutations and splice variants at large scale in cancer requires efficient computational tools. Here, the authors develop RegTools, a software package that can identify splice-associated variants from large-scale genomics and transcriptomics data with efficiency and flexibility.
Conservation of gene architecture and domains amidst sequence divergence in the hsrω lncRNA gene across the Drosophila genus: an in silico analysis
The developmentally active and cell-stress responsive hsrω locus in Drosophila melanogaster carries two exons, one omega intron, one short translatable open reading frame (ORFω), long stretch of unique tandem repeats and an overlapping mir-4951 near its 3′ end. It produces multiple long noncoding RNAs (lncRNAs) using two transcription start and four termination sites. Earlier cytogenetic studies revealed functional conservation of hsrω in several Drosophila species. However, sequence analysis in three species showed poor conservation for ORFω, tandem repeat and other regions while the 16 nt at 5′ and 60 nt at 3′ splice junctions of the omega intron, respectively, were found to be ultra-conserved. The present bioinformatic study using the splice-junction landmarks in D. melanogaster hsrω identified orthologues in publicly available 34 Drosophila species genomes. Each orthologue carries a short ORFω, ultra-conserved splice junctions of omega intron, repeat region, conserved 3′-end located at mir-4951, and syntenic neighbours. Multiple copies of conserved nonamer motifs are seen in the tandem repeat region, despite a high variability in the repeat sequences. Intriguingly, only the omega intron sequences in different species show evolutionary relationships matching the general phylogenetic history in the genus. Search in other known insect genomes did not reveal sequence homology although a locus with similar functional properties is suggested in Chironomus and Ceratitis genera. Amidst the high sequence divergence, the conserved organization of exons, ORFω and omega intron in this gene’s proximal part and tandem repeats in distal part across the Drosophila genus is remarkable and possibly reflects functional importance of higher order structure of hsrω lncRNAs and the small omega peptide.
Alternative splicing modulation by G-quadruplexes
Alternative splicing is central to metazoan gene regulation, but the regulatory mechanisms are incompletely understood. Here, we show that G-quadruplex (G4) motifs are enriched ~3-fold near splice junctions. The importance of G4s in RNA is emphasised by a higher enrichment for the non-template strand. RNA-seq data from mouse and human neurons reveals an enrichment of G4s at exons that were skipped following depolarisation induced by potassium chloride. We validate the formation of stable RNA G4s for three candidate splice sites by circular dichroism spectroscopy, UV-melting and fluorescence measurements. Moreover, we find that sQTLs are enriched at G4s, and a minigene experiment provides further support for their role in promoting exon inclusion. Analysis of >1,800 high-throughput experiments reveals multiple RNA binding proteins associated with G4s. Finally, exploration of G4 motifs across eleven species shows strong enrichment at splice sites in mammals and birds, suggesting an evolutionarily conserved splice regulatory mechanism. Here the authors show that G-quadruplexes, non-canonical DNA/RNA structures, can have a direct impact on alternative splicing and that binding of splicing regulators is affected by their presence.
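As a rough illustration of what scanning a sequence for G4 motifs can look like, the Python sketch below uses a widely used textbook pattern: four runs of at least three guanines separated by loops of one to seven nucleotides. This pattern is an assumption made for illustration; the abstract above does not state which motif definition or software the authors used, and `G4_PATTERN` and `find_g4_motifs` are hypothetical names.

```python
import re

# Assumed motif: G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}
# The loop alphabet includes U so the pattern works on DNA or RNA strings.
G4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_g4_motifs(seq: str) -> list[tuple[int, int]]:
    """Return (start, end) spans of non-overlapping putative G4 motifs.

    Spans are 0-based and end-exclusive, as returned by Python's re module.
    Only the given strand is scanned; the reverse complement is not checked.
    """
    return [(m.start(), m.end()) for m in G4_PATTERN.finditer(seq.upper())]


# A telomere-like repeat flanked by two adenines on each side.
print(find_g4_motifs("AAGGGTTAGGGTTAGGGTTAGGGAA"))
```

Because `finditer` reports non-overlapping matches, overlapping candidate motifs are collapsed into the first match found; an enrichment analysis near splice junctions would also need junction coordinates and a background model, which this sketch does not attempt.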