Catalogue Search | MBRL

Nanopore direct RNA sequencing maps the complexity of Arabidopsis mRNA processing and m6A modification

by Gould, Peter D , Barton, Geoffrey J , Sherwood, Anna V in 3' Untranslated regions , Annotations , Arabidopsis

2020

Understanding genome organization and gene regulation requires insight into RNA transcription, processing and modification. We adapted nanopore direct RNA sequencing to examine RNA from a wild-type accession of the model plant Arabidopsis thaliana and a mutant defective in mRNA methylation (m6A). Here we show that m6A can be mapped in full-length mRNAs transcriptome-wide and reveal the combinatorial diversity of cap-associated transcription start sites, splicing events, poly(A) site choice and poly(A) tail length. Loss of m6A from 3’ untranslated regions is associated with decreased relative transcript abundance and defective RNA 3′ end formation. A functional consequence of disrupted m6A is a lengthening of the circadian period. We conclude that nanopore direct RNA sequencing can reveal the complexity of mRNA processing and modification in full-length single molecule reads. These findings can refine Arabidopsis genome annotation. Further, applying this approach to less well-studied species could transform our understanding of what their genomes encode.

Journal Article

Missense variants in human ACE2 strongly affect binding to SARS-CoV-2 Spike providing a mechanism for ACE2 mediated genetic risk in Covid-19: A case study in affinity predictions of interface variants

by van der Merwe, P. Anton , Kutuzov, Mikhail , Barton, Geoffrey J. in ACE2 , Affinity , Angiotensin

2022

SARS-CoV-2 Spike (Spike) binds to human angiotensin-converting enzyme 2 (ACE2) and the strength of this interaction could influence parameters relating to virulence. To explore whether population variants in ACE2 influence Spike binding and hence infection, we selected 10 ACE2 variants based on affinity predictions and prevalence in gnomAD and measured their affinities and kinetics for Spike receptor binding domain through surface plasmon resonance (SPR) at 37°C. We discovered variants that reduce and enhance binding, including three ACE2 variants that strongly inhibited (p.Glu37Lys, ΔΔG = –1.33 ± 0.15 kcal mol⁻¹ and p.Gly352Val, predicted ΔΔG = –1.17 kcal mol⁻¹) or abolished (p.Asp355Asn) binding. We also identified two variants with distinct population distributions that enhanced affinity for Spike. ACE2 p.Ser19Pro (ΔΔG = 0.59 ± 0.08 kcal mol⁻¹) is predominant in the gnomAD African cohort (AF = 0.003) whilst p.Lys26Arg (ΔΔG = 0.26 ± 0.09 kcal mol⁻¹) is predominant in the Ashkenazi Jewish (AF = 0.01) and European non-Finnish (AF = 0.006) cohorts. We compared ACE2 variant affinities to published SARS-CoV-2 pseudotype infectivity data and confirmed that ACE2 variants with reduced affinity for Spike can protect cells from infection. The effect of variants with enhanced Spike affinity remains unclear, but we propose a mechanism whereby these alleles could cause greater viral spreading across tissues and cell types, as is consistent with emerging understanding regarding the interplay between receptor affinity and cell-surface abundance. Finally, we compared mCSM-PPI2 ΔΔG predictions against our SPR data to assess the utility of predictions in this system. We found that predictions of decreased binding were well-correlated with experiment and could be improved by calibration, but disappointingly, predictions of highly enhanced binding were unreliable. Recalibrated predictions for all possible ACE2 missense variants at the Spike interface were calculated and used to estimate the overall burden of ACE2 variants on Covid-19.
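
Reader's note: the ΔΔG values quoted in this abstract can be related to dissociation constants with the standard relation ΔΔG = RT ln(KD_wt / KD_variant). The sketch below is illustrative only and is not taken from the paper. It assumes the sign convention implied by the abstract (negative ΔΔG means weaker binding than wild type) and the stated measurement temperature of 37°C. The function names are hypothetical.

```python
import math

# Gas constant in kcal mol^-1 K^-1 (standard physical constant).
R_KCAL = 1.987204e-3


def ddg_from_kd(kd_wt, kd_variant, temp_c=37.0):
    """Return ΔΔG in kcal/mol from two dissociation constants.

    Assumed convention: ΔΔG = RT ln(KD_wt / KD_variant), so a variant
    that binds more weakly (larger KD) gets a negative ΔΔG.
    """
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(kd_wt / kd_variant)


def kd_fold_change(ddg, temp_c=37.0):
    """Return KD_variant / KD_wt implied by a ΔΔG under the same convention."""
    temp_k = temp_c + 273.15
    return math.exp(-ddg / (R_KCAL * temp_k))


if __name__ == "__main__":
    # ΔΔG values as quoted in the abstract; the fold changes are derived here.
    for name, ddg in [("p.Glu37Lys", -1.33), ("p.Ser19Pro", 0.59)]:
        print(name, round(kd_fold_change(ddg), 2))
```

Under these assumptions, a ΔΔG of –1.33 kcal mol⁻¹ corresponds to a dissociation constant roughly 8 to 9 times higher than wild type, and a ΔΔG of 0.59 kcal mol⁻¹ to one about 2.6 times lower.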

Journal Article

Identifying plant genes shaping microbiota composition in the barley rhizosphere

by Alegria Terrazas, Rodrigo , Abbott, James , Escudero-Martinez, Carmen in 101/58 , 38/77 , 38/91

2022

A prerequisite to exploiting soil microbes for sustainable crop production is the identification of the plant genes shaping microbiota composition in the rhizosphere, the interface between roots and soil. Here, we use metagenomics information as an external quantitative phenotype to map the host genetic determinants of the rhizosphere microbiota in wild and domesticated genotypes of barley, the fourth most cultivated cereal globally. We identify a small number of loci with a major effect on the composition of rhizosphere communities. One of those, designated the QRMC-3HS, emerges as a major determinant of microbiota composition. We subject soil-grown sibling lines harbouring contrasting alleles at QRMC-3HS and hosting contrasting microbiotas to comparative root RNA-seq profiling. This allows us to identify three primary candidate genes, including a Nucleotide-Binding-Leucine-Rich-Repeat (NLR) gene in a region of structural variation of the barley genome. Our results provide insights into the footprint of crop improvement on the plant’s capacity of shaping rhizosphere microbes. A prerequisite to exploiting soil microbes for sustainable crop production is the identification of the plant genes shaping microbiota composition in the rhizosphere. Here, the authors report QTLs and the associated candidate genes underlying rhizosphere microbiome composition in barley.

Journal Article

NoD: a Nucleolar localization sequence detector for eukaryotic and viral proteins

by Troshin, Peter V , Scott, Michelle S , Barton, Geoffrey J in Algorithms , Amino Acid Motifs , Amino Acid Sequence

2011

Background: Nucleolar localization sequences (NoLSs) are short targeting sequences responsible for the localization of proteins to the nucleolus. Given the large number of proteins experimentally detected in the nucleolus and the central role of this subnuclear compartment in the cell, NoLSs are likely to be important regulatory elements controlling cellular traffic. Although many proteins have been reported to contain NoLSs, the systematic characterization of this group of targeting motifs has only recently been carried out. Results: Here, we describe NoD, a web server and a command line program that predicts the presence of NoLSs in proteins. Using the web server, users can submit protein sequences through the NoD input form and are provided with a graphical output of the NoLS score as a function of protein position. While the web server is most convenient for making predictions for just a few proteins, the command line version of NoD can return predictions for complete proteomes. NoD is based on our recently described human-trained artificial neural network predictor. Through stringent independent testing of the predictor using available experimentally validated NoLS-containing eukaryotic and viral proteins, the NoD sensitivity and positive predictive value were estimated to be 71% and 79% respectively. Conclusions: NoD is the first tool to provide predictions of nucleolar localization sequences in diverse eukaryotes and viruses. NoD can be run interactively online at http://www.compbio.dundee.ac.uk/nod or downloaded to use locally.
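
Reader's note: sensitivity and positive predictive value, the two figures quoted in this abstract, are computed from counts of true positives (TP), false negatives (FN) and false positives (FP). The sketch below shows the standard definitions. The counts are hypothetical and chosen only so that the results match the reported percentages; they are not the paper's test-set counts.

```python
def sensitivity(tp, fn):
    """Fraction of real positives that the predictor recovers: TP / (TP + FN)."""
    return tp / (tp + fn)


def positive_predictive_value(tp, fp):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(sensitivity(tp=71, fn=29))                # 71 of 100 true sites found
    print(positive_predictive_value(tp=79, fp=21))  # 79 of 100 predictions correct
```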

Journal Article

Comparative evaluation of methods for the prediction of protein–ligand binding sites

by Utgés, Javier S. , Barton, Geoffrey J. in Aggregates , Amino acids , Benchmark

2024

The accurate identification of protein–ligand binding sites is of critical importance in understanding and modulating protein function. Accordingly, ligand binding site prediction has remained a research focus for over three decades with over 50 methods developed and a change of paradigm from geometry-based to machine learning. In this work, we collate 13 ligand binding site predictors, spanning 30 years, focusing on the latest machine learning-based methods such as VN-EGNN, IF-SitePred, GrASP, PUResNet, and DeepPocket and compare them to the established P2Rank, PRANK and fpocket and earlier methods like PocketFinder, Ligsite and Surfnet. We benchmark the methods against the human subset of our new curated reference dataset, LIGYSIS. LIGYSIS is a comprehensive protein–ligand complex dataset comprising 30,000 proteins with bound ligands which aggregates biologically relevant unique protein–ligand interfaces across biological units of multiple structures from the same protein. LIGYSIS is an improvement for testing methods over earlier datasets like sc-PDB, PDBbind, binding MOAD, COACH420 and HOLO4K which either include 1:1 protein–ligand complexes or consider asymmetric units. Re-scoring of fpocket predictions by PRANK and DeepPocket displays the highest recall (60%) whilst IF-SitePred presents the lowest recall (39%). We demonstrate the detrimental effect that redundant prediction of binding sites has on performance as well as the beneficial impact of stronger pocket scoring schemes, with improvements up to 14% in recall (IF-SitePred) and 30% in precision (Surfnet). Finally, we propose top-N+2 recall as the universal benchmark metric for ligand binding site prediction and urge authors to share not only the source code of their methods, but also of their benchmark. Scientific contributions: This study conducts the largest benchmark of ligand binding site prediction methods to date, comparing 13 original methods and 15 variants using 10 informative metrics. The LIGYSIS dataset is introduced, which aggregates biologically relevant protein–ligand interfaces across multiple structures of the same protein. The study highlights the detrimental effect of redundant binding site prediction and demonstrates significant improvement in recall and precision through stronger scoring schemes. Finally, top-N+2 recall is proposed as a universal benchmark metric for ligand binding site prediction, with a recommendation for open-source sharing of both methods and benchmarks.
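
Reader's note: the abstract does not define top-N+2 recall, so the sketch below is one plausible reading rather than the authors' implementation. It assumes that, for a protein with N observed binding sites, only the N+2 highest-ranked predictions are considered, and that an observed site counts as recovered when any of those predictions lies within a distance threshold of it. The centre-to-centre criterion, the 12 Å default and all names are assumptions made for illustration.

```python
import math


def top_n_plus_2_recall(proteins, threshold=12.0):
    """Recall over observed sites, using only the top N+2 predictions per protein.

    `proteins` is an iterable of (observed_centres, ranked_predicted_centres),
    where each centre is an (x, y, z) tuple and predictions are sorted best first.
    The distance-based match and the threshold are illustrative assumptions.
    """
    recovered = 0
    total = 0
    for observed, predicted in proteins:
        # N is the number of observed sites for this protein.
        top = predicted[: len(observed) + 2]
        for site in observed:
            total += 1
            if any(math.dist(site, pocket) <= threshold for pocket in top):
                recovered += 1
    return recovered / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    example = [
        # One observed site; the matching pocket is ranked third, inside the top N+2 = 3.
        ([(0.0, 0.0, 0.0)], [(50.0, 0.0, 0.0), (60.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
        # Two observed sites; the single prediction recovers only the first.
        ([(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)], [(0.0, 0.0, 5.0)]),
    ]
    print(top_n_plus_2_recall(example))
```

Capping the list at N+2 rather than N tolerates a couple of extra high-ranked pockets, while still penalising methods that only find the true site far down their ranking.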

Journal Article

Chromosome evolution and the genetic basis of agronomically important traits in greater yam

by Kariba, Robert , Bredeson, Jessen V. , Nwadili, Christian O. in 14/63 , 45/22 , 45/23

2022

The nutrient-rich tubers of the greater yam, Dioscorea alata L., provide food and income security for millions of people around the world. Despite its global importance, however, greater yam remains an orphan crop. Here, we address this resource gap by presenting a highly contiguous chromosome-scale genome assembly of D. alata combined with a dense genetic map derived from African breeding populations. The genome sequence reveals an ancient allotetraploidization in the Dioscorea lineage, followed by extensive genome-wide reorganization. Using the genomic tools, we find quantitative trait loci for resistance to anthracnose, a damaging fungal pathogen of yam, and several tuber quality traits. Genomic analysis of breeding lines reveals both extensive inbreeding as well as regions of extensive heterozygosity that may represent interspecific introgression during domestication. These tools and insights will enable yam breeders to unlock the potential of this staple crop and take full advantage of its adaptability to varied environments. While greater yam provides food and income security for millions of people around the world, there are limited genomic resources available. Here, the authors report a chromosome-scale assembly of the greater yam genome as well as quantitative trait loci associated with anthracnose resistance and tuber traits.

Journal Article

2passtools: two-pass alignment using machine-learning-filtered splice junctions increases the accuracy of intron detection in long-read RNA sequencing

by Knop, Katarzyna , Parker, Matthew T. , Barton, Geoffrey J. in Accuracy , Algorithms , Animal Genetics and Genomics

2021

Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long reads reveals the true complexity of processing. However, the relatively high error rates of long-read sequencing technologies can reduce the accuracy of intron identification. Here we apply alignment metrics and machine-learning-derived sequence information to filter spurious splice junctions from long-read alignments and use the remaining junctions to guide realignment in a two-pass approach. This method, available in the software package 2passtools (https://github.com/bartongroup/2passtools), improves the accuracy of spliced alignment and transcriptome assembly for species both with and without existing high-quality annotations.

Journal Article

m6A modification of U6 snRNA modulates usage of two major classes of pre-mRNA 5’ splice site

by Breidenbach, Friedrich , Fica, Sebastian M , Davies, Brendan H in ambient temperature , Chromosomes and Gene Expression , epitranscriptome

2022

Alternative splicing of messenger RNAs is associated with the evolution of developmentally complex eukaryotes. Splicing is mediated by the spliceosome, and docking of the pre-mRNA 5’ splice site into the spliceosome active site depends upon pairing with the conserved ACAGA sequence of U6 snRNA. In some species, including humans, the central adenosine of the ACAGA box is modified by N6 methylation, but the role of this m6A modification is poorly understood. Here, we show that m6A modified U6 snRNA determines the accuracy and efficiency of splicing. We reveal that the conserved methyltransferase, FIONA1, is required for Arabidopsis U6 snRNA m6A modification. Arabidopsis fio1 mutants show disrupted patterns of splicing that can be explained by the sequence composition of 5’ splice sites and cooperative roles for U5 and U6 snRNA in splice site selection. U6 snRNA m6A influences 3’ splice site usage. We generalise these findings to reveal two major classes of 5’ splice site in diverse eukaryotes, which display anti-correlated interaction potential with U5 snRNA loop 1 and the U6 snRNA ACAGA box. We conclude that U6 snRNA m6A modification contributes to the selection of degenerate 5’ splice sites crucial to alternative splicing. All the information necessary to build the proteins that perform the biological processes required for life is encoded in the DNA of an organism. Making these proteins requires the DNA sequence of a gene to be transcribed into a ‘messenger RNA’ (mRNA), which is then processed into a final, mature form. This blueprint is then translated to assemble the corresponding protein. When an mRNA is processed, segments of the sequence that do not code for protein are removed and the remaining coding sequences are joined together in the right order. An intricate molecular machine known as the spliceosome controls this mechanism by recognising the ‘splice sites’ where coding and non-coding sequences meet. Depending on external conditions, the spliceosome can ‘pick-and-mix’ the coding sequences to create different processed mRNAs (and therefore proteins) from a single gene. This alternative splicing mechanism is often used to regulate when certain biological processes take place based on environmental cues; for example, the splicing of genes which control the timing of plant flowering is sensitive to ambient temperatures. To investigate this mechanism, Parker et al. focused on Arabidopsis thaliana, a plant that blooms later when temperatures are low. This precise timing partly relies on a gene whose mRNA is efficiently spliced in the cold, resulting in an active form of its protein that blocks blooming. Parker et al. grew and screened many A. thaliana plants to find individuals that could flower early in the cold, in which splicing of this gene was disrupted. A mutant fitting these criteria was identified and subjected to further investigation, which revealed that it could not produce FIONA1. In non-mutant plants, this enzyme chemically modifies one of the components of the spliceosome, a small nuclear RNA known as U6. Parker et al. found that there are two types of splice site – one more likely to interact with U6 and another that preferentially interacts with another small nuclear RNA, U5. When FIONA1 is inactive (such as in the mutant identified by Parker et al.), splice sites that tend to strongly interact with U5 are selected. However, when the enzyme is active, splice sites that tend to bind with the chemically modified U6 are used instead. Further work by Parker et al. showed that these two types of splice sites (‘preferring’ either U5 or U6) are found in equal proportions in the genomes of many species, including humans. This suggests that Parker et al. have uncovered an essential feature of how genomes are organised and splicing is controlled.

Journal Article

Disruption of the mRNA m6A writer complex triggers autoimmunity in Arabidopsis

by Metheringham, Carey L. , Maji, Ankita , Parker, Matthew T. in Arabidopsis - genetics , Arabidopsis - immunology , Arabidopsis Proteins - genetics

2025

Distinguishing self from non-self is crucial to direct immune responses against pathogens. Unmodified RNAs stimulate human innate immunity, but RNA modifications suppress this response. mRNA m6A modification is essential for Arabidopsis thaliana viability. However, the molecular basis of the impact of mRNA m6A depletion is poorly understood. Here, we show that disruption of the Arabidopsis mRNA m6A writer complex triggers autoimmunity. Most gene expression changes in m6A writer complex vir-1 mutants grown at 17°C are explained by defence gene activation and are suppressed at 27°C, consistent with the frequent temperature sensitivity of Arabidopsis immunity. Accordingly, we found enhanced pathogen resistance and increased premature cell death in vir-1 mutants at 17°C but not 27°C. Global temperature-sensitive mRNA poly(A) tail length changes accompany these phenotypes. Our results demonstrate that autoimmunity is a major phenotype of mRNA m6A writer complex mutants, with important implications for interpreting the role of this modification. Furthermore, we open the broader question of whether unmodified RNA triggers immune signalling in plants.

Journal Article

Transcription Termination and Chimeric RNA Formation Controlled by Arabidopsis thaliana FPA

by Sherstnev, Alexander , Cole, Christian , Barton, Geoffrey J. in Alternative Splicing - genetics , Arabidopsis - genetics , Arabidopsis Proteins - biosynthesis

2013

Alternative cleavage and polyadenylation influence the coding and regulatory potential of mRNAs and where transcription termination occurs. Although widespread, few regulators of this process are known. The Arabidopsis thaliana protein FPA is a rare example of a trans-acting regulator of poly(A) site choice. Analysing fpa mutants therefore provides an opportunity to reveal generic consequences of disrupting this process. We used direct RNA sequencing to quantify shifts in RNA 3' formation in fpa mutants. Here we show that specific chimeric RNAs formed between the exons of otherwise separate genes are a striking consequence of loss of FPA function. We define intergenic read-through transcripts resulting from defective RNA 3' end formation in fpa mutants and detail cryptic splicing and antisense transcription associated with these read-through RNAs. We identify alternative polyadenylation within introns that is sensitive to FPA and show FPA-dependent shifts in IBM1 poly(A) site selection that differ from those recently defined in mutants defective in intragenic heterochromatin and DNA methylation. Finally, we show that defective termination at specific loci in fpa mutants is shared with dicer-like 1 (dcl1) or dcl4 mutants, leading us to develop alternative explanations for some silencing roles of these proteins. We relate our findings to the impact that altered patterns of 3' end formation can have on gene and genome organisation.

Journal Article
