Catalogue Search | MBRL

Extracellular vesicles, RNA sequencing, and bioinformatic analyses: Challenges, solutions, and recommendations

by Fullard, John F. , Heyliger, Simon O. , Saulsbury, Marilyn D. in Bioinformatics , Biomarkers , Biopsy

2024

Extracellular vesicles (EVs) are heterogeneous entities secreted by cells into their microenvironment and systemic circulation. Circulating EVs carry functional small RNAs and other molecular footprints from their cell of origin, and thus have evident applications in liquid biopsy, therapeutics, and intercellular communication. Yet, the complete transcriptomic landscape of EVs is poorly characterized due to critical limitations including variable protocols used for EV‐RNA extraction, quality control, cDNA library preparation, sequencing technologies, and bioinformatic analyses. Consequently, there is a gap in knowledge and a need for a standardized approach in delineating EV‐RNAs. Here, we address these gaps by (1) focusing on the large canopy of EVs and particles (EVPs), which includes, but is not limited to, exosomes and other large and small EVs, lipoproteins, exomeres/supermeres, mitochondrial‐derived vesicles, RNA binding proteins, and cell‐free DNA/RNA/proteins; (2) examining the potential functional roles and biogenesis of EVPs; (3) discussing various transcriptomic methods and technologies used in uncovering the cargoes of EVPs; (4) presenting a comprehensive list of RNA subtypes reported in EVPs; (5) describing different EV‐RNA databases and resources specific to EV‐RNA species; (6) reviewing established bioinformatics pipelines and novel strategies for reproducible EV transcriptomics analyses; (7) emphasizing the significant need for a gold standard approach in identifying EV‐RNAs across studies; and (8) highlighting current challenges, discussing possible solutions, and presenting recommendations for robust and reproducible analyses of EVP‐associated small RNAs. Overall, we seek to provide clarity on the transcriptomics landscape, sequencing technologies, and bioinformatic analyses of EVP‐RNAs. A detailed portrayal of the current state of EVP transcriptomics will lead to a better understanding of how the RNA cargo of EVPs can be used in modern and targeted diagnostics and therapeutics. For the inclusion of different particles discussed in this article, we use the terms large/small EVs, non‐vesicular extracellular particles (NVEPs), EPs and EVPs as defined in the MISEV guidelines by the International Society for Extracellular Vesicles (ISEV).

Overview of the RNA landscape of EVPs. Most commonly studied small EVP subtypes, ranging in diameter from ~1 nm to >200 nm (top), and the most common types of RNA found within EVPs: miRNA, piRNA, tRNA, snRNA, YRNA, circRNA, snoRNA, lncRNA, mRNA, and rRNA (bottom).

Journal Article

Target capture and genome skimming for plant diversity studies

by Pezzini, Flávia Fonseca , Kidner, Catherine A. , Nishii, Kanae in barcoding , bioinformatics , Chromosomes

2023

Recent technological advances in long‐read high‐throughput sequencing and assembly methods have facilitated the generation of annotated chromosome‐scale whole‐genome sequence data for evolutionary studies; however, generating such data can still be difficult for many plant species. For example, obtaining high‐molecular‐weight DNA is typically impossible for samples in historical herbarium collections, which often have degraded DNA. The need to fast‐freeze newly collected living samples to conserve high‐quality DNA can be complicated when plants are only found in remote areas. Therefore, short‐read reduced‐genome representations, such as target capture and genome skimming, remain important for evolutionary studies. Here, we review the pros and cons of each technique for non‐model plant taxa. We provide guidance related to logistics, budget, the genomic resources previously available for the target clade, and the nature of the study. Furthermore, we assess the available bioinformatic analyses, detailing best practices and pitfalls, and suggest pathways to combine newly generated data with legacy data. Finally, we explore the possible downstream analyses allowed by the type of data generated using each technique. We provide a practical guide to help researchers make the best‐informed choice regarding reduced genome representation for evolutionary studies of non‐model plants in cases where whole‐genome sequencing remains impractical.

Journal Article

Investigating structural variant, indel and single nucleotide polymorphism differentiation between locally adapted Atlantic salmon populations

by Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval [Québec] (ULaval) ; Ecosystèmes, biodiversité, évolution [Rennes] (ECOBIO), Université de Rennes (UR), Institut Ecologie et Environnement (INEE-CNRS), Centre National de la Recherche Scientifique (CNRS), Observatoire des sciences de l'environnement de Rennes (OSERen), Institut national des sciences de l'Univers (INSU - CNRS), Université de Rennes 2 (UR2), Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE) ; Mérot, Claire in Adaptation , Biodiversity and Ecology , Chromosomes

2024

Genomic structural variants (SVs) are now recognized as an integral component of intraspecific polymorphism and are known to contribute to evolutionary processes in various organisms. However, they are inherently difficult to detect and genotype from readily available short-read sequencing data, and therefore remain poorly documented in wild populations. Salmonid species displaying strong interpopulation variability in both life history traits and habitat characteristics, such as Atlantic salmon (Salmo salar), offer a prime context for studying adaptive polymorphism, but the contribution of SVs to fine-scale local adaptation has yet to be explored. Here, we performed a comparative analysis of SVs, single nucleotide polymorphisms (SNPs) and small indels (<50 bp) segregating in the Romaine and Puyjalon salmon, two putatively locally adapted populations inhabiting neighboring rivers (Québec, Canada) and showing pronounced variation in life history traits, namely growth, fecundity, and age at maturity and smoltification. We first catalogued polymorphism using a hybrid SV characterization approach pairing both short- (16X) and long-read sequencing (20X) for variant discovery with graph-based genotyping of SVs across 60 salmon genomes, along with characterization of SNPs and small indels from short reads. We thus identified 115,907 SVs, 8,777,832 SNPs and 1,089,321 short indels, with SVs covering 4.8 times more base pairs than SNPs. All three variant types revealed a highly congruent population structure and similar patterns of F_ST and density variation along the genome. Finally, we performed outlier detection and redundancy analysis (RDA) to identify variants of interest in the putative local adaptation of Romaine and Puyjalon salmon. Genes located near these variants were enriched for biological processes related to nervous system function, suggesting that observed variation in traits such as age at smoltification could arise from differences in neural development. This study therefore demonstrates the feasibility of large-scale SV characterization and highlights its relevance for salmonid population genomics.

Journal Article

Sequencing the CYP2D6 Gene: From Variant Allele Discovery to Clinical Pharmacogenetic Testing

by Yang, Yao , Scott, Erick R , Scott, Stuart A in Alleles , Consortia , Copy number

2017

CYP2D6 is one of the most studied enzymes in the field of pharmacogenetics. The CYP2D6 gene is highly polymorphic with over 100 catalogued star (*) alleles, and clinical CYP2D6 testing is increasingly accessible and supported by practice guidelines. However, the degree of variation at the CYP2D6 locus and homology with its pseudogenes make interrogating CYP2D6 by short-read sequencing challenging. Moreover, accurate prediction of CYP2D6 metabolizer status necessitates analysis of duplicated alleles when an increased copy number is detected. These challenges have recently been overcome by long-read CYP2D6 sequencing; however, such platforms are not widely available. This review highlights the genomic complexities of CYP2D6, current sequencing methods and the evolution of CYP2D6 from allele discovery to clinical pharmacogenetic testing.

Journal Article

A novel FAME1 repeat configuration in a European family identified using a combined genomics approach

by Tsai, Meng‐Han , Klein, Karl Martin , Kaya, Sabine in Adult , Convulsions & seizures , CRISPR

2023

Familial adult myoclonic epilepsy (FAME) is an adult‐onset neurological disease characterized by cortical tremor, myoclonus, and seizures due to a pentanucleotide repeat expansion: a combination of pathogenic TTTCA expansion associated with a TTTTA repeat in introns of six different genes. Repeat‐primed PCR (RP‐PCR) is an inexpensive test for expansions at known loci. The analysis of the SAMD12 locus revealed that the repeats differ in size, configuration, and composition. The TTTCA repeats can be very long (>1000 repeats) but also very short (14 being the shortest identified). Here, we report siblings of European descent with the clinical diagnosis of FAME yet a negative RP‐PCR test. Using short‐read genome sequencing, we identified the pentanucleotide expansion in intron 4 of SAMD12, which was confirmed by CRISPR‐Cas9‐mediated enrichment and long‐read sequencing to be of (TTTTA)~879(TTTCA)3(TTTTA)7(TTTCA)7 configuration. Our finding is the first to associate the SAMD12 locus in European patients with FAME and currently represents the shortest identified TTTCA expansion. Our results suggest that the SAMD12 locus should be tested in patients with suspected FAME independent of ethnicity. Furthermore, RP‐PCR may miss the underlying mutation, and genome sequencing may be needed to confirm the pathogenic repeat.

Journal Article

Progress in Methods for Copy Number Variation Profiling

by Gordeeva, Veronika , Sharova, Elena , Arapidi, Georgij in Arrays , Artificial chromosomes , Cell division

2022

Copy number variations (CNVs) are the predominant class of structural genomic variations involved in the processes of evolutionary adaptation, genomic disorders, and disease progression. Compared with single-nucleotide variants, there have been challenges associated with the detection of CNVs owing to their diverse sizes. However, the field has seen significant progress in the past 20–30 years. This has been made possible due to the rapid development of molecular diagnostic methods which ensure a more detailed view of the genome structure, further complemented by recent advances in computational methods. Here, we review the major approaches that have been used to routinely detect CNVs, ranging from cytogenetics to the latest sequencing technologies, and then cover their specific features.

Journal Article

Are short‐read amplicons suitable for the prediction of microbiome functional potential? A critical perspective

by Beule, Lukas , Heidrich, Vitor in Algorithms , compositional data , Datasets

2022

Taxonomic marker gene analysis allows uncovering taxonomic profiles of microbial communities at low cost, making it omnipresent in microbiome research. There is an ever‐expanding set of tools to extract further biological information from this kind of data. In this perspective, we enunciate several concerns regarding the biological validity of predicting functional potential from taxonomic profiles, especially when they are generated by short‐read sequencing. The taxonomic resolution of marker genes, intragenomic variability of marker genes, and the compositional nature of microbiome data are discussed. Combining actual measurements of microbiome functions with predicted functional potentials is proposed as a powerful approach to better understand microbiome functioning. In this context, the significance of predicted functional potentials for generating and testing hypotheses is highlighted. We argue that functions of microbiomes predicted from microbiome DNA read count data generated by short‐read amplicon sequencing should not serve as the only basis to draw biological inferences.

We raise concerns regarding the biological validity of predicting functional potential from microbiome taxonomic profiles generated by short‐read amplicon sequencing. We reason that predicted functional potential profiles can be improved by employing long‐read sequencing technologies, which in combination with independent measurements of actual functions can constitute a powerful approach to understanding microbiome functioning.

Highlights: Concerns regarding the biological validity of predicting functional potential from taxonomic profiles of microbiome data sets are enunciated. Taxonomic resolution of marker genes, intragenomic variability of marker genes, and the compositional nature of microbiome data are discussed. Combining measurements of actual functions and predicted functional potential profiles is a powerful approach to understanding microbial functioning.

Journal Article

Recent advances in the detection of repeat expansions with short-read next-generation sequencing

by Tankard, Rick M , Lockhart, Paul J , Bennett, Mark F in Review

2018

Short tandem repeats (STRs), also known as microsatellites, are commonly defined as consisting of tandemly repeated nucleotide motifs of 2–6 base pairs in length. STRs appear throughout the human genome, and about 239,000 are documented in the Simple Repeats Track available from the UCSC (University of California, Santa Cruz) genome browser. STRs vary in size, producing highly polymorphic markers commonly used as genetic markers. A small fraction of STRs (about 30 loci) have been associated with human disease whereby one or both alleles exceed an STR-specific threshold in size, leading to disease. Detection of repeat expansions is currently performed with polymerase chain reaction–based assays or with Southern blots for large expansions. The tests are expensive and time-consuming and are not always conclusive, leading to lengthy diagnostic journeys for patients, potentially including missed diagnoses. The advent of whole exome and whole genome sequencing has identified the genetic cause of many genetic disorders; however, analysis pipelines are focused primarily on the detection of short nucleotide variations and short insertions and deletions (indels). Until recently, repeat expansions, with the exception of the smallest expansion (SCA6), were not detectable in next-generation short-read sequencing datasets and would have been ignored in most analyses. In the last two years, four analysis methods with accompanying software (ExpansionHunter, exSTRa, STRetch, and TREDPARSE) have been released. Although a comprehensive comparative analysis of the performance of these methods across all known repeat expansions is still lacking, it is clear that these methods are a valuable addition to any existing analysis pipeline. Here, we detail how to assess short-read data for evidence of expansions, reviewing all four methods and outlining their strengths and weaknesses. Implementation of these methods should lead to increased diagnostic yield of repeat expansion disorders for known STR loci and has the potential to detect novel repeat expansions.

Journal Article

Methodological Comparison of Short-Read and Long-Read Sequencing Methods on Colorectal Cancer Samples

by Molnár, Béla , Kalmár, Alexandra , Rada, Kristóf Róbert in Accuracy , Analysis , Colorectal cancer

2025

Colorectal cancer (CRC) is driven by a complex spectrum of somatic mutations and structural variants that contribute to tumor heterogeneity and therapy resistance. In this study, we performed a comparative analysis of short-read Illumina and long-read Nanopore sequencing technologies across multiple CRC sample groups, encompassing diverse tissue morphologies. Our evaluation included general base-level metrics—such as nucleotide ratios, sequence match rates, and coverage—as well as variant calling performance, including variant allele frequency (VAF) distributions and pathogenic mutation detection rates. Focusing on clinically relevant genes (KRAS, BRAF, TP53, APC, PIK3CA, and others), we characterized platform-specific detection profiles and completed the ground truth validation of somatic KRAS and BRAF mutations. Structural variant (SV) analysis revealed Nanopore’s enhanced ability to resolve large and complex rearrangements, with consistently high precision across SV types, though recall varied by variant class and size. To enable direct comparison with the Illumina exome panel, we applied an exonic position reference file. To assess the impact of depth and PCR amplification, we completed an additional high-coverage Nanopore sequencing run. This analysis confirmed that PCR-free protocols preserve methylation signals more accurately, reinforcing Nanopore’s utility for integrated genomic and epigenomic profiling. Together, these findings underscore the complementary strengths of short- and long-read sequencing platforms in high-resolution cancer genomics, and we highlight the importance of coverage normalization, epigenetic fidelity, and rigorous benchmarking in variant discovery.

Journal Article

Investigating the Performance of Oxford Nanopore Long-Read Sequencing with Respect to Illumina Microarrays and Short-Read Sequencing

by Williams, Alexander , Baffour-Kyei, Anastasia , Breen, Gerome in Accuracy , Analysis , Bar codes

2025

Oxford Nanopore Technologies (ONT) long-read sequencing (LRS) has emerged as a promising genomic analysis tool, yet comprehensive benchmarks with established platforms across diverse datasets remain limited. This study aimed to benchmark LRS performance against Illumina short-read sequencing (SRS) and microarrays for variant detection across different genomic contexts and to evaluate the impact of experimental factors. We sequenced 14 human genomes using the three platforms and evaluated single nucleotide variants (SNVs), insertions/deletions (indels), and structural variants (SVs) detection, stratifying by high-complexity, low-complexity, and dark genome regions while assessing effects of multiplexing, depth, and read length. LRS SNV accuracy was slightly lower than that of SRS in high-complexity regions (F-measure: 0.954 vs. 0.967) but showed comparable sensitivity in low-complexity regions. LRS showed robust performance for small (1–5 bp) indels in high-complexity regions (F-measure: 0.869), but SRS agreement decreased significantly in low-complexity regions and for larger indel sizes. Within dark regions, LRS identified more indels than SRS, but showed lower base-level accuracy. LRS identified 2.86 times more SVs than SRS, excelling at detecting large variants (>6 kb), with SV detection improving with sequencing depth. Sequencing depth strongly influenced variant calling performance, whereas multiplexing effects were minimal. Our findings provide valuable insights for optimising LRS applications in genomic research and diagnostics.

Journal Article
