56,032 result(s) for "Gene Function"
NaviGO: interactive tool for visualization and functional similarity and coherence analysis with gene ontology
Background The number of genomics and proteomics experiments is growing rapidly, producing an ever-increasing amount of data that are awaiting functional interpretation. A number of function prediction algorithms were developed and improved to enable fast and automatic function annotation. With the well-defined structure and manual curation, Gene Ontology (GO) is the most frequently used vocabulary for representing gene functions. To understand relationship and similarity between GO annotations of genes, it is important to have a convenient pipeline that quantifies and visualizes the GO function analyses in a systematic fashion. Results NaviGO is a web-based tool for interactive visualization, retrieval, and computation of functional similarity and associations of GO terms and genes. Similarity of GO terms and gene functions is quantified with six different scores including protein-protein interaction and context based association scores we have developed in our previous works. Interactive navigation of the GO function space provides intuitive and effective real-time visualization of functional groupings of GO terms and genes as well as statistical analysis of enriched functions. Conclusions We developed NaviGO, which visualizes and analyses functional similarity and associations of GO terms and genes. The NaviGO webserver is freely available at: http://kiharalab.org/web/navigo.
About the dark corners in the gene function space of Escherichia coli remaining without illumination by scientific literature
Background Although Escherichia coli (E. coli) is the most studied prokaryote organism in the history of life sciences, many molecular mechanisms and gene functions encoded in its genome remain to be discovered. This work aims at quantifying the illumination of the E. coli gene function space by the scientific literature and how close we are towards the goal of a complete list of E. coli gene functions. Results The scientific literature about E. coli protein-coding genes has been mapped onto the genome via the mentioning of names for genomic regions in scientific articles both for the case of the strain K-12 MG1655 as well as for the 95%-threshold softcore genome of 1324 E. coli strains with known complete genome. The article match was quantified with the ratio of a given gene name’s occurrence to the mentioning of any gene names in the paper. The various genome regions have an extremely uneven literature coverage. A group of elite genes with ≥ 100 full publication equivalents (FPEs, FPE = 1 is an idealized publication devoted to just a single gene) attracts the lion share of the papers. For K-12, ~ 65% of the literature covers just 342 elite genes; for the softcore genome, ~ 68% of the FPEs is about only 342 elite gene families (GFs). We also find that most genes/GFs have at least one mentioning in a dedicated scientific article (with the exception of at least 137 protein-coding transcripts for K-12 and 26 GFs from the softcore genome). Whereas the literature growth rates were highest for uncharacterized or understudied genes until 2005–2010 compared with other groups of genes, they became negative thereafter. At the same time, literature for anyhow well-studied genes started to grow explosively with threshold T10 (≥ 10 FPEs). Typically, a body of ~ 20 actual articles generated over ~ 15 years of research effort was necessary to reach T10. Lineage-specific co-occurrence analysis of genes belonging to the accessory genome of E. coli together with genomic co-localization and sequence-analytic exploration hints previously completely uncharacterized genes yahV and yddL being associated with osmotic stress response/motility mechanisms. Conclusion If the numbers of scientific articles about uncharacterized and understudied genes remain at least at present levels, full gene function lists for the strain K-12 MG1655 and the E. coli softcore genome are in reach within the next 25–30 years. Once the literature body for a gene crosses 10 FPEs, most of the critical fundamental research risk appears overcome and steady incremental research becomes possible.
Screens in fly and beetle reveal vastly divergent gene sets required for developmental processes
Background Most of the known genes required for developmental processes have been identified by genetic screens in a few well-studied model organisms, which have been considered representative of related species, and informative—to some degree—for human biology. The fruit fly Drosophila melanogaster is a prime model for insect genetics, and while conservation of many gene functions has been observed among bilaterian animals, a plethora of data show evolutionary divergence of gene function among more closely-related groups, such as within the insects. A quantification of conservation versus divergence of gene functions has been missing, without which it is unclear how representative data from model systems actually are. Results Here, we systematically compare the gene sets required for a number of homologous but divergent developmental processes between fly and beetle in order to quantify the difference of the gene sets. To that end, we expanded our RNAi screen in the red flour beetle Tribolium castaneum to cover more than half of the protein-coding genes. Then we compared the gene sets required for four different developmental processes between beetle and fly. We found that around 50% of the gene functions were identified in the screens of both species while for the rest, phenotypes were revealed only in fly (~ 10%) or beetle (~ 40%) reflecting both technical and biological differences. Accordingly, we were able to annotate novel developmental GO terms for 96 genes studied in this work. With this work, we publish the final dataset for the pupal injection screen of the iBeetle screen reaching a coverage of 87% (13,020 genes). Conclusions We conclude that the gene sets required for a homologous process diverge more than widely believed. Hence, the insights gained in flies may be less representative for insects or protostomes than previously thought, and work in complementary model systems is required to gain a comprehensive picture. The RNAi screening resources developed in this project, the expanding transgenic toolkit, and our large-scale functional data make T. castaneum an excellent model system in that endeavor.
Did the early full genome sequencing of yeast boost gene function discovery?
Background Although the genome of Saccharomyces cerevisiae (S. cerevisiae) was the first one of a eukaryote organism that was fully sequenced (in 1996), a complete understanding of the potential of encoded biomolecular mechanisms has not yet been achieved. Here, we wish to quantify how far the goal of a full list of S. cerevisiae gene functions still is. Results The scientific literature about S. cerevisiae protein-coding genes has been mapped onto the yeast genome via the mentioning of names for genomic regions in scientific publications. The match was quantified with the ratio of a given gene name’s occurrences to those of any gene names in the article. We find that ~ 230 elite genes with ≥ 75 full publication equivalents (FPEs, FPE = 1 is an idealized publication referring to just a single gene) command ~ 45% of all literature. At the same time, about two thirds of the genes (each with less than 10 FPEs) are described in just 12% of the literature (in average each such gene has just ~ 1.5% of the literature of an elite gene). About 600 genes have not been mentioned in any dedicated article. Compared with other groups of genes, the literature growth rates were highest for uncharacterized or understudied genes until late nineties of the twentieth century. Yet, these growth rates deteriorated and became negative thereafter. Thus, yeast function discovery for previously uncharacterized genes has returned to the level of ~ 1980. At the same time, literature for anyhow well-studied genes (with a threshold T10 (≥ 10 FPEs) and higher) remains steadily growing. Conclusions Did the early full genome sequencing of yeast boost gene function discovery? The data proves that the moment of publishing the full genome in reality coincides with the onset of decline of gene function discovery for previously uncharacterized genes. If the current status of literature about yeast molecular mechanisms can be extrapolated into the future, it will take about another ~ 50 years to complete the yeast gene function list. We found that a small group of scientific journals contributed extraordinarily to publishing early reports relevant to yeast gene function discoveries.
Elucidating gene function and function evolution through comparison of co-expression networks of plants
The analysis of gene expression data has shown that transcriptionally coordinated (co-expressed) genes are often functionally related, enabling scientists to use expression data in gene function prediction. This Focused Review discusses our original paper (Large-scale co-expression approach to dissect secondary cell wall formation across plant species, Frontiers in Plant Science 2:23). In this paper we applied cross-species analysis to co-expression networks of genes involved in cellulose biosynthesis. We showed that the co-expression networks from different species are highly similar, indicating that whole biological pathways are conserved across species. This finding has two important implications. First, the analysis can transfer gene function annotation from well-studied plants, such as Arabidopsis, to other, uncharacterized plant species. As the analysis finds genes that have similar sequence and similar expression pattern across different organisms, functionally equivalent genes can be identified. Second, since co-expression analyses are often noisy, a comparative analysis should have higher performance, as parts of co-expression networks that are conserved are more likely to be functionally relevant. In this Focused Review, we outline the comparative analysis done in the original paper and comment on the recent advances and approaches that allow comparative analyses of co-function networks. We hypothesize that in comparison to simple co-expression analysis, comparative analysis would yield more accurate gene function predictions. Finally, by combining comparative analysis with genomic information of green plants, we propose a possible composition of cellulose biosynthesis machinery during earlier stages of plant evolution.
PlantGPT: An Arabidopsis‐Based Intelligent Agent that Answers Questions about Plant Functional Genomics
Research into plant gene function is crucial for developing strategies to increase crop yields. The recent introduction of large language models (LLMs) offers a means to aggregate large amounts of data into a queryable format, but the output can contain inaccurate or false claims known as hallucinations. To minimize such hallucinations and produce high‐quality knowledge‐based outputs, the abstracts of over 60 000 plant research articles are compiled into a Chroma database for retrieval‐augmented generation (RAG). Then linguistic data are used from 13 993 Arabidopsis (Arabidopsis thaliana) phenotypes and 23 323 gene functions to fine‐tune the LLM Llama3‐8B, producing PlantGPT, a virtual expert in Arabidopsis phenotype–gene research. By evaluating answers to test questions, it is demonstrated that PlantGPT outperforms general LLMs in answering specialized questions. The findings provide a blueprint for functional genomics research in food crops and demonstrate the potential for developing LLMs for plant research modalities. To provide broader access and facilitate adoption, the online tool http://www.plantgpt.icu is developed, which will allow researchers to use PlantGPT in their scientific investigations. PlantGPT integrates 60 000+ plant research articles with Arabidopsis phenotype‐gene data through retrieval‐augmented generation and fine‐tuning of Llama3‐8B. This open‐source, specialized AI system outperforms general large language models in plant gene‐phenotype relationships, establishing a new paradigm for functional genomics research and molecular design breeding.
Reconstitution and Transmission of Gut Microbiomes and Their Genes between Generations
Microbiomes are transmitted between generations by a variety of different vertical and/or horizontal modes, including vegetative reproduction (vertical), via female germ cells (vertical), coprophagy and regurgitation (vertical and horizontal), physical contact starting at birth (vertical and horizontal), breast-feeding (vertical), and via the environment (horizontal). Analyses of vertical transmission can result in false negatives (failure to detect rare microbes) and false positives (strain variants). In humans, offspring receive most of their initial gut microbiota vertically from mothers during birth, via breast-feeding and close contact. Horizontal transmission is common in marine organisms and involves selectivity in determining which environmental microbes can colonize the organism’s microbiome. The following arguments are put forth concerning accurate microbial transmission: First, the transmission may be of functions, not necessarily of species; second, horizontal transmission may be as accurate as vertical transmission; third, detection techniques may fail to detect rare microbes; lastly, microbiomes develop and reach maturity with their hosts. In spite of the great variation in means of transmission discussed in this paper, microbiomes and their functions are transferred from one generation of holobionts to the next with fidelity. This provides a strong basis for each holobiont to be considered a unique biological entity and a level of selection in evolution, largely maintaining the uniqueness of the entity and conserving the species from one generation to the next.
Proxies of CRISPR/Cas9 Activity To Aid in the Identification of Mutagenized Arabidopsis Plants
CRISPR/Cas9 has become the preferred gene-editing technology to obtain loss-of-function mutants in plants, and hence a valuable tool to study gene function. This is mainly due to the easy reprogramming of Cas9 specificity using customizable small non-coding RNAs, and to the possibility of editing several independent genes simultaneously. Despite these advances, the identification of CRISPR-edited plants remains time and resource-intensive. Here, based on the premise that one editing event in one locus is a good predictor of editing event/s in other locus/loci, we developed a CRISPR co-editing selection strategy that greatly facilitates the identification of CRISPR-mutagenized Arabidopsis thaliana plants. This strategy is based on targeting the gene/s of interest simultaneously with a proxy of CRISPR-Cas9-directed mutagenesis. The proxy is an endogenous gene whose loss-of-function produces an easy-to-detect visible phenotype that is unrelated to the expected phenotype of the gene/s under study. We tested this strategy via assessing the frequency of co-editing of three functionally unrelated proxy genes. We found that each proxy predicted the occurrence of mutations in each surrogate gene with efficiencies ranging from 68 to 100%. The selection strategy laid out here provides a framework to facilitate the identification of multiplex edited plants, thus aiding in the study of gene function when functional redundancy hinders the effort to define gene-function-phenotype links.
GND-PCA Method for Identification of Gene Functions Involved in Asymmetric Division of C. elegans
Due to the rapid development of imaging technology, a large number of biological images have been obtained with three-dimensional (3D) spatial information, time information, and spectral information. Compared with the case of two-dimensional images, the framework for analyzing multidimensional bioimages has not been completely established yet. WDDD is an open biological image database. It dynamically records 3D developmental images of 186 samples of nematodes C. elegans. In this study, based on WDDD, we constructed a framework to analyze the multidimensional dataset, which includes image segmentation, image registration, size registration by the length of main axes, time registration by extracting key time points, and finally, using generalized N-dimensional principal component analysis (GND-PCA) to analyze the phenotypes of bioimages. As a data-driven technique, GND-PCA can automatically extract the important factors involved in the development of P1 and AB in C. elegans. A 3D bioimage can be regarded as a third-order tensor. Therefore, GND-PCA was applied to the set of third-order tensors, and a set of third-order tensor bases was iteratively learned to linearly approximate the set. For each tensor base, a corresponding characteristic image is built to reveal its geometric meaning. The results show that different bases can be used to express different vital factors in development, such as the asymmetric division within the two-cell stage of C. elegans. Based on selected bases, statistical models were built by 50 wild-type (WT) embryos in WDDD, and were applied to RNA interference (RNAi) embryos. The results of statistical testing demonstrated the effectiveness of this method.
Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression
Trait-associated genetic variants affect complex phenotypes primarily via regulatory mechanisms on the transcriptome. To investigate the genetics of gene expression, we performed cis- and trans-expression quantitative trait locus (eQTL) analyses using blood-derived expression from 31,684 individuals through the eQTLGen Consortium. We detected cis-eQTL for 88% of genes, and these were replicable in numerous tissues. Distal trans-eQTL (detected for 37% of 10,317 trait-associated variants tested) showed lower replication rates, partially due to low replication power and confounding by cell type composition. However, replication analyses in single-cell RNA-seq data prioritized intracellular trans-eQTL. Trans-eQTL exerted their effects via several mechanisms, primarily through regulation by transcription factors. Expression of 13% of the genes correlated with polygenic scores for 1,263 phenotypes, pinpointing potential drivers for those traits. In summary, this work represents a large eQTL resource, and its results serve as a starting point for in-depth interpretation of complex phenotypes. Analyses of expression profiles from whole blood of 31,684 individuals identify cis-expression quantitative trait loci (eQTL) effects for 88% of genes and trans-eQTL effects for 37% of trait-associated variants.