Catalogue Search | MBRL
Explore the vast range of titles available.
35 result(s) for "Mi, Huaiyu"
Protocol Update for large-scale genome and gene function analysis with the PANTHER classification system (v.14.0)
2019
The PANTHER classification system (http://www.pantherdb.org) is a comprehensive system that combines genomes, gene function classifications, pathways and statistical analysis tools to enable biologists to analyze large-scale genome-wide experimental data. The current system (PANTHER v.14.0) covers 131 complete genomes organized into gene families and subfamilies; evolutionary relationships between genes are represented in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models (HMMs)). The families and subfamilies are annotated with Gene Ontology (GO) terms, and sequences are assigned to PANTHER pathways. A suite of tools has been built to allow users to browse and query gene functions and analyze large-scale experimental data with a number of statistical tests. PANTHER is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. Since the protocol for using this tool (v.8.0) was originally published in 2013, there have been substantial improvements and updates in the areas of data quality, data coverage, statistical algorithms and user experience. This Protocol Update provides detailed instructions on how to analyze genome-wide experimental data in the PANTHER classification system.
Here the authors provide an update to their 2013 protocol for using the PANTHER classification system, detailing how to analyze genome-wide experimental data with the newest version of PANTHER (v.14.0), with improvements in the areas of data quality, data coverage, statistical algorithms and user experience.
Journal Article
PEREGRINE: A genome-wide prediction of enhancer to gene relationships supported by experimental evidence
by Mills, Caitlin; Muruganujan, Anushya; Marconett, Crystal N.
in Biology and life sciences; Cell Line; Databases, Genetic
2020
Enhancers are powerful and versatile agents of cell-type specific gene regulation, which are thought to play key roles in human disease. Enhancers are short DNA elements that function primarily as clusters of transcription factor binding sites that are spatially coordinated to regulate expression of one or more specific target genes. These regulatory connections between enhancers and target genes can therefore be characterized as enhancer-gene links that can affect development, disease, and homeostatic cellular processes. Despite their implication in disease and the establishment of cell identity during development, most enhancer-gene links remain unknown. Here we introduce a new, publicly accessible database of predicted enhancer-gene links, PEREGRINE. The PEREGRINE human enhancer-gene links interactive web interface incorporates publicly available experimental data from ChIA-PET, eQTL, and Hi-C assays across 78 cell and tissue types to link 449,627 enhancers to 17,643 protein-coding genes. These enhancer-gene links are made available through the new Enhancer module of the PANTHER database and website where the user may easily access the evidence for each enhancer-gene link, as well as query by target gene and enhancer location.
Journal Article
The BioPAX community standard for pathway data sharing
2010
Incompatible data storage formats have hindered the sharing and analyses of digital representations of biological pathways. BioPAX is a standardized language supported by >40 databases and software tools for exchanging pathway data.
Biological Pathway Exchange (BioPAX) is a standard language to represent biological pathways at the molecular and cellular level and to facilitate the exchange of pathway data. The rapid growth of the volume of pathway data has spurred the development of databases and computational tools to aid interpretation; however, use of these data is hampered by the current fragmentation of pathway information across many databases with incompatible formats. BioPAX, which was created through a community process, solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. Using BioPAX, millions of interactions, organized into thousands of pathways, from many organisms are available from a growing number of databases. This large amount of pathway data in a computable form will support visualization, analysis and biological discovery.
Journal Article
Pathway polygenic risk scores (pPRS) for the analysis of gene-environment interaction
by Keku, Temitope; Fu, Yubo; Li, Li
in Anti-Inflammatory Agents, Non-Steroidal - therapeutic use; Biology and Life Sciences; Cancer research
2025
A polygenic risk score (PRS) is used to quantify the combined disease risk of many genetic variants. For complex human traits there is interest in determining whether the PRS modifies, i.e., interacts with, important environmental (E) risk factors. Detection of a PRS by environment (PRS x E) interaction may provide clues to underlying biology and can be useful in developing targeted prevention strategies for modifiable risk factors. The standard PRS may include a subset of variants that interact with E but a much larger subset of variants that affect disease without regard to E. This latter subset will dilute the underlying signal in the former subset, leading to reduced power to detect PRS x E interaction. We explore the use of pathway-defined PRS (pPRS) scores, using state-of-the-art tools to annotate subsets of variants to genomic pathways. We demonstrate via simulation that testing targeted pPRS x E interaction can yield substantially greater power than testing overall PRS x E interaction. We also analyze a large study (N = 78,253) of colorectal cancer (CRC) where E = non-steroidal anti-inflammatory drugs (NSAIDs), a well-established protective exposure. While no evidence of overall PRS x NSAIDs interaction (p = 0.41) is observed, a significant pPRS x NSAIDs interaction (p = 0.0003) is identified based on SNPs within the TGF-β/gonadotropin releasing hormone receptor (GRHR) pathway. NSAIDs are protective (OR = 0.84) for those at the 5th percentile of the TGF-β/GRHR pPRS (low genetic risk), but significantly more protective (OR = 0.70) for those at the 95th percentile (high genetic risk). From a biological perspective, this suggests that NSAIDs may act to reduce CRC risk specifically through genes in these pathways. From a population health perspective, our result suggests that focusing on genes within these pathways may be effective at identifying those for whom NSAIDs-based CRC-prevention efforts may be most effective.
Journal Article
PEACOCK: a machine learning approach to assess the validity of cell type-specific enhancer-gene regulatory relationships
2023
The vast majority of disease-associated variants identified in genome-wide association studies map to enhancers, powerful regulatory elements which orchestrate the recruitment of transcriptional complexes to their target genes’ promoters to upregulate transcription in a cell type- and timing-dependent manner. These variants have implicated thousands of enhancers in many common genetic diseases, including nearly all cancers. However, the etiology of most of these diseases remains unknown because the regulatory target genes of the vast majority of enhancers are unknown. Thus, identifying the target genes of as many enhancers as possible is crucial for learning how enhancer regulatory activities function and contribute to disease. Based on experimental results curated from scientific publications coupled with machine learning methods, we developed a cell type-specific score predictive of an enhancer targeting a gene. We computed the score genome-wide for every possible cis enhancer-gene pair and validated its predictive ability in four widely used cell lines. Using a pooled final model trained across multiple cell types, all possible gene-enhancer regulatory links in cis (~17 M) were scored and added to the publicly available PEREGRINE database (www.peregrineproj.org). These scores provide a quantitative framework for the enhancer-gene regulatory prediction that can be incorporated into downstream statistical analyses.
Journal Article
Bayesian parameter estimation for automatic annotation of gene functions using observational data and phylogenetic trees
2021
Gene function annotation is important for a variety of downstream analyses of genetic data. But experimental characterization of function remains costly and slow, making computational prediction an important endeavor. Phylogenetic approaches to prediction have been developed, but implementation of a practical Bayesian framework for parameter estimation remains an outstanding challenge. We have developed a computationally efficient model of evolution of gene annotations using phylogenies based on a Bayesian framework using Markov Chain Monte Carlo for parameter estimation. Unlike previous approaches, our method is able to estimate parameters over many different phylogenetic trees and functions. The resulting parameters agree with biological intuition, such as the increased probability of function change following gene duplication. The method performs well on leave-one-out cross-validation, and we further validated some of the predictions in the experimental scientific literature.
Journal Article
Large-scale gene function analysis with the PANTHER classification system
by Thomas, Paul D; Muruganujan, Anushya; Casagrande, John T
in 631/114/2398; 631/114/2403; 631/114/794
2013
The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.
Journal Article
Dopamine Genes and Nicotine Dependence in Treatment-Seeking and Community Smokers
2009
We utilized a cohort of 828 treatment-seeking self-identified white cigarette smokers (50% female) to rank candidate gene single nucleotide polymorphisms (SNPs) associated with the Fagerström Test for Nicotine Dependence (FTND), a measure of nicotine dependence which assesses quantity of cigarettes smoked and time- and place-dependent characteristics of the respondent's smoking behavior. A total of 1123 SNPs at 55 autosomal candidate genes, nicotinic acetylcholine receptors and genes involved in dopaminergic function, were tested for association to baseline FTND scores adjusted for age, depression, education, sex, and study site. SNP P-values were adjusted for the number of transmission models, the number of SNPs tested per candidate gene, and their intragenic correlation. DRD2, SLC6A3, and NR4A2 SNPs with adjusted P-values <0.10 were considered sufficiently noteworthy to justify further genetic, bioinformatic, and literature analyses. Each independent signal among the top-ranked SNPs accounted for ∼1% of the FTND variance in this sample. The DRD2 SNP appears to represent a novel association with nicotine dependence. The SLC6A3 SNPs have previously been shown to be associated with SLC6A3 transcription or dopamine transporter density in vitro, in vivo, and ex vivo. Analysis of SLC6A3 and NR4A2 SNPs identified a statistically significant gene–gene interaction (P=0.001), consistent with in vitro evidence that the NR4A2 protein product (NURR1) regulates SLC6A3 transcription. A community cohort of N=175 multiplex ever-smoking pedigrees (N=423 ever smokers) provided nominal evidence for association with the FTND at these top-ranked SNPs, uncorrected for multiple comparisons.
Journal Article
Gene Ontology Causal Activity Modeling (GO-CAM) moves beyond GO annotations to structured descriptions of biological functions and systems
2019
To increase the utility of Gene Ontology (GO) annotations for interpretation of genome-wide experimental data, we have developed GO-CAM, a structured framework for linking multiple GO annotations into an integrated model of a biological system. We expect that GO-CAM will enable new applications in pathway and network analysis, as well as improve standard GO annotations for traditional GO-based applications.
Journal Article