Catalogue Search | MBRL
16 result(s) for "Krishnakumar, Vivek"
ePlant
2017
A big challenge in current systems biology research arises when different types of data must be accessed from separate sources and visualized using separate tools. The high cognitive load required to navigate such a workflow is detrimental to hypothesis generation. Accordingly, there is a need for a robust research platform that incorporates all data and provides integrated search, analysis, and visualization features through a single portal. Here, we present ePlant (http://bar.utoronto.ca/eplant), a visual analytic tool for exploring multiple levels of Arabidopsis thaliana data through a zoomable user interface. ePlant connects to several publicly available web services to download genome, proteome, interactome, transcriptome, and 3D molecular structure data for one or more genes or gene products of interest. Data are displayed with a set of visualization tools that are presented using a conceptual hierarchy from big to small, and many of the tools combine information from more than one data type. We describe the development of ePlant in this article and present several examples illustrating its integrative features for hypothesis generation. We also describe the process of deploying ePlant as an “app” on Araport. Building on readily available web services, the code for ePlant is freely available for any other biological species research.
Journal Article
JCVI: A versatile toolkit for comparative genomics analysis
2024
The life cycle of genome builds spans interlocking pillars of assembly, annotation, and comparative genomics to drive biological insights. While tools exist to address each pillar separately, there is a growing need for tools to integrate different pillars of a genome project holistically. For example, comparative approaches can provide quality control of assembly or annotation; genome assembly, in turn, can help to identify artifacts that may complicate the interpretation of genome comparisons. The JCVI library is a versatile Python-based library that offers a suite of tools that excel across these pillars. Featuring a modular design, the JCVI library provides high-level utilities for tasks such as format parsing, graphics generation, and manipulation of genome assemblies and annotations. Supporting genomics algorithms like MCscan and ALLMAPS are widely employed in building genome releases, producing publication-ready figures for quality assessment and evolutionary inference. Developed and maintained collaboratively, the JCVI library emphasizes quality and reusability.
The JCVI library contains a set of computational tools that are often used in tasks covering genome assembly, annotation, and comparative genomics. Engineered with a focus on versatility, the library incorporates modules for algorithms, format parsing, and graphics generation, enabling seamless integration into diverse research workflows.
Highlights: JCVI is a Python-based library that enables genomic workflows through a collection of simple reusable tools. The JCVI library is modular with basic functionalities separated into bioinformatics format parsing, assembly and annotation-related tools, comparative genomics, and graphics generation. Embedded algorithms like MCscan, ALLMAPS, and other tools within JCVI are now widely used in the community and power a wide array of use cases.
Journal Article
Hidden genomic evolution in a morphospecies—The landscape of rapidly evolving genes in Tetrahymena
2019
A morphospecies is defined as a taxonomic species based wholly on morphology, but often morphospecies consist of clusters of cryptic species that can be identified genetically or molecularly. The nature of the evolutionary novelty that accompanies speciation in a morphospecies is an intriguing question. Morphospecies are particularly common among ciliates, a group of unicellular eukaryotes that separates 2 kinds of nuclei, the silenced germline nucleus (micronucleus [MIC]) and the actively expressed somatic nucleus (macronucleus [MAC]), within a common cytoplasm. Because of their very similar morphologies, members of the Tetrahymena genus are considered a morphospecies. We explored the hidden genomic evolution within this genus by performing a comprehensive comparative analysis of the somatic genomes of 10 species and the germline genomes of 2 species of Tetrahymena. These species show high genetic divergence; phylogenomic analysis suggests that the genus originated about 300 million years ago (Mya). Seven universal protein domains are preferentially included among the species-specific (i.e., the youngest) Tetrahymena genes. In particular, leucine-rich repeat (LRR) genes make the largest contribution to the high level of genome divergence of the 10 species. LRR genes can be sorted into 3 different age groups. Parallel evolutionary trajectories have independently occurred among LRR genes in the different Tetrahymena species. Thousands of young LRR genes contain tandem arrays of exactly 90-bp exons. The introns separating these exons show a unique, extreme phase 2 bias, suggesting a clonal origin and successive expansions of 90-bp-exon LRR genes. Identifying LRR gene age groups allowed us to document a Tetrahymena intron length cycle. The youngest 90-bp exon LRR genes in T. thermophila are concentrated in pericentromeric and subtelomeric regions of the 5 micronuclear chromosomes, suggesting that these regions act as genome innovation centers. Copies of a Tetrahymena long interspersed element (LINE)-like retrotransposon are very frequently found physically adjacent to 90-bp exon/intron repeat units of the youngest LRR genes. We propose that Tetrahymena species have used a massive exon-shuffling mechanism, involving unequal crossing over possibly in concert with retrotransposition, to create the unique 90-bp exon array LRR genes.
Journal Article
Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome
by Zeng, Qiandong; Pritham, Ellen J; Feschotte, Cédric
in Adaptation, centromere, chromosome breakage
2016
The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome has shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena’s germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum.
Journal Article
Polyribosomal RNA-Seq Reveals the Decreased Complexity and Diversity of the Arabidopsis Translatome
by Krishnakumar, Vivek; Zhang, Xingtan; Tang, Haibao
in Abundance, Alternative splicing, Alternative Splicing - genetics
2015
Recent RNA-seq studies reveal that the transcriptomes in animals and plants are more complex than previously thought, leading to the inclusion of many more splice isoforms in annotated genomes. However, it is possible that a significant proportion of the transcripts are spurious isoforms that do not contribute to functional proteins. One of the current hypotheses is that commonly used mRNA extraction methods isolate both pre-mature (nuclear) mRNA and mature (cytoplasmic) mRNA, and these incompletely spliced pre-mature mRNAs may contribute to a large proportion of these spurious transcripts. To investigate this, we compared a traditional RNA-seq dataset (total RNA-seq) and a ribosome-bound RNA-seq dataset (polyribosomal RNA-seq) from Arabidopsis thaliana. An integrative framework that combined de novo assembly and genome-guided assembly was applied to reconstruct transcriptomes for the two datasets. Up to 44.8% of the de novo assembled transcripts in the total RNA-seq sample were of low abundance, whereas only 0.09% in the polyribosomal RNA-seq de novo assembly were of low abundance. The final round of assembly using PASA (Program to Assemble Spliced Alignments) resulted in more transcript assemblies in the total RNA-seq sample than in the polyribosomal sample. Comparison of alternative splicing (AS) patterns between total and polyribosomal RNA-seq showed a significant difference (G-test, p-value < 0.01) in intron retention events: 46.4% of AS events in the total sample were intron retention, whereas only 23.5% showed evidence of intron retention in the polyribosomal sample. It is likely that a large proportion of retained introns in total RNA-seq result from incompletely spliced pre-mature mRNA. Overall, this study demonstrated that polyribosomal RNA-seq technology decreased the complexity and diversity of the coding transcriptome by eliminating pre-mature mRNAs, especially those of low abundance.
Journal Article
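The abstract above reports a G-test on intron retention proportions (46.4% versus 23.5%) but does not give the underlying event counts. As a rough illustration of how such a test works, the following Python sketch runs a G-test of independence on a 2x2 table. The counts are hypothetical (1,000 AS events per sample, chosen only to match the reported percentages), and the function name `g_test_2x2` is an assumption for illustration, not something taken from the paper.

```python
import math


def g_test_2x2(table):
    """G-test of independence for a 2x2 table of counts.

    Returns (G, p), where p is the upper-tail probability of a
    chi-square distribution with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if observed > 0:  # a zero cell contributes nothing to G
                g += observed * math.log(observed / expected)
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding error

    # For 1 degree of freedom, P(chi2 > g) = erfc(sqrt(g / 2)).
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


# HYPOTHETICAL counts: (intron retention events, other AS events).
total_sample = (464, 536)        # 46.4% intron retention
polysomal_sample = (235, 765)    # 23.5% intron retention

g_stat, p_value = g_test_2x2((total_sample, polysomal_sample))
print(f"G = {g_stat:.1f}, p = {p_value:.3g}")
```

With real data, the two rows would hold the observed event counts from each assembly; the size of G depends strongly on those counts, so the numbers printed here say nothing about the published result.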
An improved genome release (version Mt4.0) for the model legume Medicago truncatula
by Chan, Agnes; Yandell, Mark; Krishnakumar, Vivek
in Alfalfa, Analysis, Animal Genetics and Genomics
2014
Background: Medicago truncatula, a close relative of alfalfa, is a preeminent model for studying nitrogen fixation, symbiosis, and legume genomics. The Medicago sequencing project began in 2003 with the goal to decipher sequences originated from the euchromatic portion of the genome. The initial sequencing approach was based on a BAC tiling path, culminating in a BAC-based assembly (Mt3.5) as well as an in-depth analysis of the genome published in 2011. Results: Here we describe a further improved and refined version of the M. truncatula genome (Mt4.0) based on de novo whole genome shotgun assembly of a majority of Illumina and 454 reads using ALLPATHS-LG. The ALLPATHS-LG scaffolds were anchored onto the pseudomolecules on the basis of alignments to both the optical map and the genotyping-by-sequencing (GBS) map. The Mt4.0 pseudomolecules encompass ~360 Mb of actual sequences spanning 390 Mb of which ~330 Mb align perfectly with the optical map, presenting a drastic improvement over the BAC-based Mt3.5 which only contained 70% sequences (~250 Mb) of the current version. Most of the sequences and genes that previously resided on the unanchored portion of Mt3.5 have now been incorporated into the Mt4.0 pseudomolecules, with the exception of ~28 Mb of unplaced sequences. With regard to gene annotation, the genome has been re-annotated through our gene prediction pipeline, which integrates EST, RNA-seq, protein and gene prediction evidences. A total of 50,894 genes (31,661 high confidence and 19,233 low confidence) are included in Mt4.0 which overlapped with ~82% of the gene loci annotated in Mt3.5. Of the remaining genes, 14% of the Mt3.5 genes have been deprecated to an “unsupported” status and 4% are absent from the Mt4.0 predictions. Conclusions: Mt4.0 and its associated resources, such as genome browsers, BLAST-able datasets and gene information pages, can be found on the JCVI Medicago web site (http://www.jcvi.org/medicago). The assembly and annotation has been deposited in GenBank (BioProject: PRJNA10791). The heavily curated chromosomal sequences and associated gene models of Medicago will serve as a better reference for legume biology and comparative genomics.
Journal Article
MarkerMiner 1.0: A New Application for Phylogenetic Marker Development Using Angiosperm Transcriptomes
by Barbazuk, W. Brad; Soltis, Pamela S.; Jordon-Thaden, Ingrid E.
in Angiospermae, Angiosperms, Automation
2015
Premise of the study: Targeted sequencing using next-generation sequencing (NGS) platforms offers enormous potential for plant systematics by enabling economical acquisition of multilocus data sets that can resolve difficult phylogenetic problems. However, because discovery of single-copy nuclear (SCN) loci from NGS data requires both bioinformatics skills and access to high-performance computing resources, the application of NGS data has been limited. Methods and Results: We developed MarkerMiner 1.0, a fully automated, open-access bioinformatic workflow and application for discovery of SCN loci in angiosperms. Our new tool identified as many as 1993 SCN loci from transcriptomic data sampled as part of four independent test cases representing marker development projects at different phylogenetic scales. Conclusions: MarkerMiner is an easy-to-use and effective tool for discovery of putative SCN loci. It can be run locally or via the Web, and its tabular and alignment outputs facilitate efficient downstream assessments of phylogenetic utility, locus selection, intron-exon boundary prediction, and primer or probe development.
Journal Article
Polyribosomal RNA-Seq Reveals the Decreased Complexity and Diversity of the Arabidopsis Translatome: e0117699
2015
Journal Article
Araport11: a complete reannotation of the Arabidopsis thaliana reference genome
2016
The flowering plant Arabidopsis thaliana is a dicot model organism for research in many aspects of plant biology. A comprehensive annotation of its genome paves the way for understanding the functions and activities of all types of transcripts, including mRNA, noncoding RNA, and small RNA. The most recent annotation update (TAIR10) released more than five years ago had a profound impact on Arabidopsis research. Maintaining the accuracy of the annotation continues to be a prerequisite for future progress. Using an integrative annotation pipeline, we assembled tissue-specific RNA-seq libraries from 113 datasets and constructed 48,359 transcript models of protein-coding genes in eleven tissues. In addition, we annotated various classes of noncoding RNA including small RNA, long intergenic RNA, small nucleolar RNA, natural antisense transcript, small nuclear RNA, and microRNA using published datasets and in-house analytic results. Altogether, we identified 738 novel protein-coding genes, 508 novel transcribed regions, 5,051 non-coding genes, and 35,846 small-RNA loci that formerly eluded annotation. Analysis on the splicing events and RNA-seq based expression profile revealed the landscapes of gene structures, untranslated regions, and splicing activities to be more intricate than previously appreciated. We also present 692 uniformly expressed housekeeping genes, 43% of whose human orthologs are also housekeeping genes. This updated Arabidopsis genome annotation with a substantially increased resolution of gene models will not only further our understanding of the biological processes of this plant model but also of other species.
DRASTIC model developed with lineament density to map groundwater susceptibility: a case study in part of Coimbatore district, Tamilnadu, India
by
Sivakumar, Vivek
,
Subramanian, Krishnakumar
,
Sreevidya, V
in
Agricultural expansion
,
Case studies
,
Classification
2023
Over the previous two decades, parts of Coimbatore have come to rely on groundwater more than ever before because of fast, unplanned urbanisation and industrial and agricultural expansion. This report seeks to provide local and regional planning authorities with a brief groundwater vulnerability assessment of part of the Coimbatore region in order to support more sustainable growth in the area. The study focuses on the part of the Coimbatore region that covers Pollachi, Sulur and Coimbatore south, in the eastern part of Tamilnadu. The conventional DRASTIC model is used to produce the initial groundwater vulnerability map. It is then modified by adding a “lineament density index” to the original seven DRASTIC features, because of the previously established high correlation between lineaments and groundwater flow and yield. The final DRASTIC index map was classified into five categories (very low, low, medium, high, and very high vulnerability) covering 840.26, 732.58, 583.46, 183.17, and 167.13 km², respectively. The modified DRASTIC index map is classified into the same five categories, covering 965.35, 880.14, 399.21, 158.29, and 103.54 km², respectively. Alanthurai, Thodamuthur, Pichanur, and Pollachi are among the most susceptible places in the state. This groundwater vulnerability map will be used to aid in groundwater pollution management and planning.
Journal Article
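The DRASTIC record above describes a weighted index built from seven features, extended with a lineament density term. A minimal Python sketch of that kind of calculation follows. It assumes the generic DRASTIC weights commonly quoted in the literature (5, 4, 3, 2, 1, 5, 3); the abstract does not state which weights or ratings the authors used, so the lineament weight, the example ratings, and the function name `drastic_index` are all hypothetical.

```python
# Generic DRASTIC weights as commonly quoted (ASSUMPTION: the abstract
# does not say which weights this study applied).
DRASTIC_WEIGHTS = {
    "depth_to_water": 5,
    "net_recharge": 4,
    "aquifer_media": 3,
    "soil_media": 2,
    "topography": 1,
    "impact_of_vadose_zone": 5,
    "hydraulic_conductivity": 3,
}

# HYPOTHETICAL weight for the added lineament density term.
LINEAMENT_WEIGHT = 3


def drastic_index(ratings, lineament_rating=None,
                  lineament_weight=LINEAMENT_WEIGHT):
    """Return the weighted sum of ratings for one map cell.

    `ratings` maps each of the seven feature names to its rating
    (typically 1 to 10). If `lineament_rating` is given, the extra
    term is added to form a modified index.
    """
    missing = set(DRASTIC_WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")

    index = sum(DRASTIC_WEIGHTS[name] * ratings[name]
                for name in DRASTIC_WEIGHTS)
    if lineament_rating is not None:
        index += lineament_weight * lineament_rating
    return index


# HYPOTHETICAL ratings for a single cell.
cell = {
    "depth_to_water": 7,
    "net_recharge": 6,
    "aquifer_media": 8,
    "soil_media": 5,
    "topography": 10,
    "impact_of_vadose_zone": 6,
    "hydraulic_conductivity": 4,
}

conventional = drastic_index(cell)
modified = drastic_index(cell, lineament_rating=8)
print(conventional, modified)
```

In a mapping workflow the same sum would be evaluated per raster cell and the resulting values binned into the five vulnerability classes; the class boundaries are not given in the abstract, so they are left out here.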