Catalogue Search | MBRL

Systematic integration of biomedical knowledge prioritizes drugs for repurposing

by Baranzini, Sergio E , Hessler, Christine , Himmelstein, Daniel Scott in Alcoholism , Algorithms , Computational and Systems Biology

2017

The ability to computationally predict whether a compound treats a disease would improve the economy and success rate of drug approval. This study describes Project Rephetio to systematically model drug efficacy based on 755 existing treatments. First, we constructed Hetionet (neo4j.het.io), an integrative network encoding knowledge from millions of biomedical studies. Hetionet v1.0 consists of 47,031 nodes of 11 types and 2,250,197 relationships of 24 types. Data were integrated from 29 public resources to connect compounds, diseases, genes, anatomies, pathways, biological processes, molecular functions, cellular components, pharmacologic classes, side effects, and symptoms. Next, we identified network patterns that distinguish treatments from non-treatments. Then, we predicted the probability of treatment for 209,168 compound–disease pairs (het.io/repurpose). Our predictions validated on two external sets of treatment and provided pharmacological insights on epilepsy, suggesting they will help prioritize drug repurposing candidates. This study was entirely open and received realtime feedback from 40 community members. Of all the data in the world today, 90% was created in the last two years. However, taking advantage of this data in order to advance our knowledge is restricted by how quickly we can access it and analyze it in a proper context. In biomedical research, data is largely fragmented and stored in databases that typically do not “talk” to each other, thus hampering progress. One particular problem in medicine today is that the process of making a new therapeutic drug from scratch is incredibly expensive and inefficient, making it a risky business. Given the low success rate in drug discovery, there is an economic incentive in trying to repurpose an existing drug that has already been shown to be safe and effective towards a new disease or condition. Himmelstein et al. 
used a computational approach to analyze 50,000 data points – including drugs, diseases, genes and symptoms – from 19 different public databases. This approach made it possible to create more than two million relationships among the data points, which could be used to develop models that predict which drugs currently in use by doctors might be best suited to treat any of 136 common diseases. For example, Himmelstein et al. identified specific drugs currently used to treat depression and alcoholism that could be repurposed to treat smoking addiction and epilepsy. These findings provide a new and powerful way to study drug repurposing. While this work was exclusively performed with public data, an expanded and potentially stronger set of predictions could be obtained if data owned by pharmaceutical companies were incorporated. Additional studies will be needed to test the predictions made by the models.
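
The "network patterns" described in this abstract can be pictured as typed paths that link a compound to a disease through intermediate nodes. The following is a minimal sketch of counting such paths in a toy typed graph. The node names, the relationship labels, and the `build_index` and `count_paths` helpers are hypothetical placeholders for illustration; they are not Hetionet data and not the Project Rephetio implementation.

```python
from collections import defaultdict

# Toy typed edges: (source, relationship type, target).
# All names are placeholders, not real Hetionet content.
EDGES = [
    ("compound_A", "binds", "gene_1"),
    ("compound_A", "binds", "gene_2"),
    ("compound_B", "binds", "gene_2"),
    ("gene_1", "associates", "disease_X"),
    ("gene_2", "associates", "disease_X"),
    ("gene_2", "associates", "disease_Y"),
]


def build_index(edges):
    """Map (source, relationship type) to the list of target nodes."""
    index = defaultdict(list)
    for src, rel, dst in edges:
        index[(src, rel)].append(dst)
    return index


def count_paths(index, start, rels, end):
    """Count directed paths from start to end that follow rels in order."""
    frontier = {start: 1}  # node -> number of paths reaching it so far
    for rel in rels:
        next_frontier = defaultdict(int)
        for node, n_paths in frontier.items():
            for dst in index.get((node, rel), []):
                next_frontier[dst] += n_paths
        frontier = next_frontier
    return frontier.get(end, 0)


index = build_index(EDGES)
pattern = ["binds", "associates"]  # compound -> gene -> disease
for compound in ("compound_A", "compound_B"):
    for disease in ("disease_X", "disease_Y"):
        n = count_paths(index, compound, pattern, disease)
        print(compound, disease, n)
```

Path counts of this kind could serve as features for a classifier that separates known treatments from non-treatments; the actual features and model used in the study are not specified in this record.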

Journal Article

Knowledge transfer-driven estimation of knee moments and ground reaction forces from smartphone videos via temporal-spatial modeling of augmented joint kinematics

by Yoo, Sunyong , Shin, Hyunjun , Hossain, Md Sanzid Bin in Accuracy , Analysis , Artificial neural networks

2025

The knee adduction and flexion moment provides critical information about knee joint health, while 3D ground reaction forces (GRFs) help identify force and energy characteristics for maneuvering the entire human body. Existing methods of acquiring joint moments and GRFs require expensive equipment, time-consuming pre-processing, and limited accessibility. This study proposes to tackle these limitations by utilizing only smartphone videos to estimate joint moments and 3D GRFs accurately. We also propose the augmentation of joint kinematics by generating additional modalities of 2D joint center velocity and acceleration from 2D joint center position acquired from the videos. This augmented joint kinematics helps to apply a multi-modal fusion module to learn the importance of inter-modal interactions. Additionally, we utilize recurrent neural networks and graph convolutional networks to perform temporal-spatial modeling of joint center dynamics for enhanced accuracy. To overcome another challenge of video-based estimation, particularly the lack of inertial information related to body segments, we propose multi-modal knowledge transfer to train the video-only student model from a teacher model that integrates both video and inertial measurement unit (IMU) data. The student model significantly reduces the normalized root mean square error (NRMSE) from 5.71 to 4.68 and increases the Pearson correlation coefficient (PCC) from 0.929 to 0.951. These results demonstrate that knowledge transfer, augmentation of joint kinematics for multi-modal fusion, and temporal-spatial modeling significantly enhance smartphone video-based estimation, offering a potential cost-effective alternative to traditional motion capture for clinical assessments, rehabilitation, and sports applications.
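
The two metrics reported in this abstract, NRMSE and PCC, can be sketched as follows. The abstract does not say how the RMSE is normalized, so this example assumes normalization by the range of the reference signal, expressed as a percentage; the function names and sample values are illustrative only.

```python
import math


def nrmse_percent(y_true, y_pred):
    """RMSE divided by the range of y_true, in percent (assumed normalization)."""
    n = len(y_true)
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return 100.0 * rmse / (max(y_true) - min(y_true))


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Made-up reference and estimate: the estimate has a constant offset of 1.
reference = [0.0, 1.0, 2.0, 3.0, 4.0]
estimate = [1.0, 2.0, 3.0, 4.0, 5.0]
print(nrmse_percent(reference, estimate))  # RMSE 1 over a range of 4
print(pearson(reference, estimate))
```

The made-up example shows why both metrics are reported together: a constant offset leaves the correlation perfect while still producing a nonzero error.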

Journal Article

Development and Validation of an Electronic Health Record–Based Machine Learning Model to Estimate Delirium Risk in Newly Hospitalized Patients Without Known Cognitive Impairment

by Wong, Andrew , Liang, April S. , Douglas, Vanja C. in Adolescent , Adult , Aged

2018

Current methods for identifying hospitalized patients at increased risk of delirium require nurse-administered questionnaires with moderate accuracy. To develop and validate a machine learning model that predicts incident delirium risk based on electronic health data available on admission. Retrospective cohort study evaluating 5 machine learning algorithms to predict delirium using 796 clinical variables identified by an expert panel as relevant to delirium prediction and consistently available in electronic health records within 24 hours of admission. The training set comprised 14 227 adult patients with non-intensive care unit hospital stays and no delirium on admission who were discharged between January 1, 2016, and August 31, 2017, from UCSF Health, a large academic health institution. The test set comprised 3996 patients with hospital stays who were discharged between August 1, 2017, and November 30, 2017. Patient demographic characteristics, diagnoses, nursing records, laboratory results, and medications available in electronic health records during hospitalization. Delirium was defined as a positive Nursing Delirium Screening Scale or Confusion Assessment Method for the Intensive Care Unit score. Models were assessed using the area under the receiver operating characteristic curve (AUC) and compared against the 4-point scoring system AWOL (age >79 years, failure to spell world backward, disorientation to place, and higher nurse-rated illness severity), a validated delirium risk-assessment tool routinely administered in this cohort. The training set included 14 227 patients (5113 [35.9%] aged >64 years; 7335 [51.6%] female; 687 [4.8%] with delirium), and the test set included 3996 patients (1491 [37.3%] aged >64 years; 1966 [49.2%] female; 191 [4.8%] with delirium). In total, the analysis included 18 223 hospital admissions (6604 [36.2%] aged >64 years; 9301 [51.0%] female; 878 [4.8%] with delirium). The AWOL system achieved a baseline AUC of 0.678. 
The gradient boosting machine model performed best, with an AUC of 0.855. Setting specificity at 90%, the model had a 59.7% (95% CI, 52.4%-66.7%) sensitivity, 23.1% (95% CI, 20.5%-25.9%) positive predictive value, 97.8% (95% CI, 97.4%-98.1%) negative predictive value, and a number needed to screen of 4.8. Penalized logistic regression and random forest models also performed well, with AUCs of 0.854 and 0.848, respectively. Machine learning can be used to estimate hospital-acquired delirium risk using electronic health record data available within 24 hours of hospital admission. Such a model may allow more precise targeting of delirium prevention resources without increasing the burden on health care professionals.
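
The positive and negative predictive values quoted in this abstract follow from sensitivity, specificity, and prevalence through Bayes' rule. The sketch below is a worked check under the assumption that the stated 4.8% delirium rate is the relevant prevalence; the function name is illustrative and the calculation is not taken from the paper.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values from the abstract: 59.7% sensitivity at 90% specificity,
# with delirium in 4.8% of patients.
ppv, npv = predictive_values(0.597, 0.90, 0.048)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")  # PPV = 0.231, NPV = 0.978
```

With a condition this uncommon, even a specific model yields a modest positive predictive value, which is consistent with the 23.1% and 97.8% figures reported above.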

Journal Article

Are minor alleles more likely to be risk alleles?

by Patwardhan, Anil , Chen, Richard , Kodama, Keiichi in Alleles , Analysis , Bioinformatic and algorithmical studies

2018

Background Genome-wide association studies (GWASs) have revealed relationships between over 57,000 genetic variants and diseases. However, unlike Mendelian diseases, complex diseases arise from the interplay of multiple genetic and environmental factors. Natural selection has led to a high tendency of risk alleles to be enriched in minor alleles in Mendelian diseases. Therefore, an allele that was previously advantageous or neutral may later become harmful, making it a risk allele. Methods Using data in the NHGRI-EBI Catalog and the VARIMED database, we investigated whether (1) GWASs more easily detect risk alleles and (2) facilitate evolutionary insights by comparing risk allele frequencies of different diseases. We conducted computer simulations of P-values for association tests when major and minor alleles were risk alleles. We compared the expected proportion of SNVs whose risk alleles were minor alleles with the observed proportion. Results Our statistical results revealed that risk alleles were enriched in minor alleles, especially for variants with low minor allele frequencies (MAFs < 0.1). Our computer simulations revealed that > 50% risk alleles were minor alleles because of the larger difference in the power of GWASs to differentiate between minor and major alleles, especially with low MAFs or when the number of controls exceeds the number of cases. However, the observed ratios between minor and major alleles in low MAFs (< 0.1) were much larger than the expected ratios of GWAS’s power imbalance, especially for diseases whose average risk allele frequencies were low, such as myopia, sudden cardiac arrest, and systemic lupus erythematosus. Conclusions Minor alleles are more likely to be risk alleles in the published GWASs on complex diseases. One reason is that minor alleles are more easily detected as risk alleles in GWASs. 
Even when correcting for the GWAS’s power imbalance, minor alleles are more likely to be risk alleles, especially in some diseases whose average risk allele frequencies are low. These analyses serve as a starting point for future studies on quantifying the degree of negative natural selection in various complex diseases.

Journal Article

Relating hepatocellular carcinoma tumor samples and cell lines using gene expression data in translational research

by Fan-Minogue, Hua , Chen, Bin , Butte, Atul J in Biomedical and Life Sciences , Biomedicine , Carcinoma, Hepatocellular - genetics

2015

Cancer cell lines are used extensively to study cancer biology and to test hypotheses in translational research. The relevance of cell lines is dependent on how closely they resemble the tumors being studied. Relating tumors and cell lines, and recognizing their similarities and differences are thus very important for translational research. Rapid advances in genomics have led to the generation of large volumes of genomic and transcriptomic data for a diverse set of primary cancer samples, normal tissue samples and cancer cell lines. Hepatocellular Carcinoma (HCC) is one of the most common tumors worldwide, with high occurrence in Asia and sub-Saharan regions. The current effective treatments of HCC remain limited. In this work, we compared the gene expression measurements of 200 HCC tumor samples from The Cancer Genome Atlas and over 1000 cancer cell lines including 25 HCC cancer cell lines from Cancer Cell Line Encyclopedia. We showed that the HCC tumor samples correlate closely with HCC cell lines in comparison to cell lines derived from other tumor types. We further demonstrated that the most commonly used HCC cell lines resemble HCC tumors, while we identified nearly half of the cell lines that do not resemble primary tumors. Interestingly, a substantial number of genes that are critical for disease development or drug response are either expressed at low levels or absent among highly correlated cell lines; additional attention should be paid to these genes in translational research. Our study will be used to guide the selection of HCC cell lines and pinpoint the specific genes that are differentially expressed in either tumors or cell lines.

Journal Article

Prediction of future healthcare expenses of patients from chest radiographs using deep learning: a pilot study

by Vu, Thienkhai H. , Seo, Youngho , Yang, Jaewon in 639/166/985 , 639/705/117 , 692/499

2022

Our objective was to develop deep learning models with chest radiograph data to predict healthcare costs and classify top-50% spenders. 21,872 frontal chest radiographs were retrospectively collected from 19,524 patients with at least 1-year spending data. Among the patients, 11,003 patients had 3 years of cost data, and 1678 patients had 5 years of cost data. Model performances were measured with area under the receiver operating characteristic curve (ROC-AUC) for classification of top-50% spenders and Spearman ρ for prediction of healthcare cost. The best model predicting 1-year (N = 21,872) expenditure achieved ROC-AUC of 0.806 [95% CI 0.793–0.819] for top-50% spender classification and ρ of 0.561 [0.536–0.586] for regression. Similarly, for predicting 3-year (N = 12,395) expenditure, the ROC-AUC was 0.771 [0.750–0.794] and ρ was 0.524 [0.489–0.559]; for predicting 5-year (N = 1779) expenditure, the ROC-AUC was 0.729 [0.667–0.729] and ρ was 0.424 [0.324–0.529]. Our deep learning model demonstrated the feasibility of predicting health care expenditure as well as classifying top 50% healthcare spenders at 1, 3, and 5 year(s), implying the feasibility of combining deep learning with information-rich imaging data to uncover hidden associations that may elude physicians. Such a model can be a starting point for making an accurate budget in reimbursement models in healthcare industries.

Journal Article

Cis-eQTL-based trans-ethnic meta-analysis reveals novel genes associated with breast cancer risk

by Leong, Lancelote , Ziv, Elad , Zaitlen, Noah in Biology and Life Sciences , BRCA mutations , Breast - metabolism

2017

Breast cancer is the most common solid organ malignancy and the most frequent cause of cancer death among women worldwide. Previous research has yielded insights into its genetic etiology, but there remains a gap in the understanding of genetic factors that contribute to risk, and particularly in the biological mechanisms by which genetic variation modulates risk. The National Cancer Institute's "Up for a Challenge" (U4C) competition provided an opportunity to further elucidate the genetic basis of the disease. Our group leveraged the seven datasets made available by the U4C organizers and data from the publicly available UK Biobank cohort to examine associations between imputed gene expression and breast cancer risk. In particular, we used reference datasets describing the breast tissue and whole blood transcriptomes to impute expression levels in breast cancer cases and controls. In trans-ethnic meta-analyses of U4C and UK Biobank data, we found significant associations between breast cancer risk and the expression of RCCD1 (joint p-value: 3.6 × 10^-6) and DHODH (p-value: 7.1 × 10^-6) in breast tissue, as well as a suggestive association for ANKLE1 (p-value: 9.3 × 10^-5). Expression of RCCD1 in whole blood was also suggestively associated with disease risk (p-value: 1.2 × 10^-5), as were expression of ACAP1 (p-value: 1.9 × 10^-5) and LRRC25 (p-value: 5.2 × 10^-5). While genome-wide association studies (GWAS) have implicated RCCD1 and ANKLE1 in breast cancer risk, they have not identified the remaining three genes. Among the genetic variants that contributed to the predicted expression of the five genes, we found 23 nominally (p-value < 0.05) associated with breast cancer risk, among which 15 are not in high linkage disequilibrium with risk variants previously identified by GWAS. In summary, we used a transcriptome-based approach to investigate the genetic underpinnings of breast carcinogenesis. 
This approach provided an avenue for deciphering the functional relevance of genes and genetic variants involved in breast cancer.

Journal Article

Meta-Analysis illustrates possible role of lipopolysaccharide (LPS)-induced tissue injury in nasopharyngeal carcinoma (NPC) pathogenesis

by Wanner, Ross A. , Hadley, Dexter , Aljabban, Jihad in Biology and Life Sciences , Biotechnology , Cancer

2021

Nasopharyngeal carcinoma (NPC) is a cancer of epithelial origin with a high incidence in certain populations. While NPC has a high remission rate with concomitant chemoradiation, recurrences are frequent, and the downstream morbidity of treatment is significant. Thus, it is imperative to find alternative therapies. We employed a Search Tag Analyze Resource (STARGEO) platform to conduct a meta-analysis using the National Center for Biotechnology Information's (NCBI) Gene Expression Omnibus (GEO) to define NPC pathogenesis. We identified 111 tumor samples and 43 healthy nasopharyngeal epithelium samples from NPC public patient data. We analyzed associated signatures in Ingenuity Pathway Analysis (IPA), restricting to genes that showed statistical significance (p<0.05) and an absolute experimental log ratio greater than 0.15 between disease and control samples. Our meta-analysis identified activation of lipopolysaccharide (LPS)-induced tissue injury in NPC tissue. Additionally, interleukin-1 (IL-1) and SB203580 were the top upstream regulators. Tumorigenesis-related genes such as homeobox A10 (HOXA10) and prostaglandin-endoperoxide synthase 2 (PTGS2 or COX-2) as well as those associated with extracellular matrix degradation, such as matrix metalloproteinases 1 and 3 (MMP-1, MMP-3) were also upregulated. Decreased expression of genes that encode proteins associated with maintaining healthy nasal respiratory epithelium structural integrity, including sentan-cilia apical structure protein (SNTN) and lactotransferrin (LTF) was documented. Importantly, we found that etanercept inhibits targets upregulated in NPC and LPS induction, such as MMP-1, PTGS2, and possibly MMP-3. Our analysis illustrates that nasal epithelial barrier dysregulation and maladaptive immune responses are key components of NPC pathogenesis along with LPS-induced tissue damage.

Journal Article

A merged microarray meta-dataset for transcriptionally profiling colorectal neoplasm formation and progression

by Nakkina, Sai Preethi , Zhu, Xiang , Altomare, Deborah in 631/114/2407 , 631/67/69 , Adenoma

2021

Transcriptional profiling of pre- and post-malignant colorectal cancer (CRC) lesions enables temporal monitoring of molecular events underlying neoplastic progression. However, the most widely used transcriptomic dataset for CRC, TCGA-COAD, is devoid of adenoma samples, which increases reliance on an assortment of disparate microarray studies and hinders consensus building. To address this, we developed a microarray meta-dataset comprising 231 healthy, 132 adenoma, and 342 CRC tissue samples from twelve independent studies. Utilizing a stringent analytic framework, select datasets were downloaded from the Gene Expression Omnibus, normalized by frozen robust multiarray averaging and subsequently merged. Batch effects were then identified and removed by empirical Bayes estimation (ComBat). Finally, the meta-dataset was filtered for low variant probes, enabling downstream differential expression as well as quantitative and functional validation through cross-platform correlation and enrichment analyses, respectively. Overall, our meta-dataset provides a robust tool for investigating colorectal adenoma formation and malignant transformation at the transcriptional level with a pipeline that is modular and readily adaptable for similar analyses in other cancer types. Measurement(s): transcriptome, colorectal cancer. Technology Type(s): microarray. Factor Type(s): gene expression. Sample Characteristic (Organism): Homo sapiens. Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.14589006

Journal Article

Copy number variation meta-analysis reveals a novel duplication at 9p24 associated with multiple neurodevelopmental disorders

by Desai, Akshatha , Li, Qingqin , Sleiman, Patrick M. A. in Attention deficit hyperactivity disorder , Autism , Bioinformatics

2017

Background Neurodevelopmental and neuropsychiatric disorders represent a wide spectrum of heterogeneous yet inter-related disease conditions. The overlapping clinical presentations of these diseases suggest a shared genetic etiology. We aim to identify shared structural variants spanning the spectrum of five neuropsychiatric disorders. Methods We investigated copy number variations (CNVs) in five cohorts, including schizophrenia (SCZ), bipolar disease (BD), autism spectrum disorders (ASD), attention deficit hyperactivity disorder (ADHD), and depression, from 7849 cases and 10,799 controls. CNVs were called based on intensity data from genome-wide SNP arrays and CNV frequency was compared between cases and controls in each disease cohort separately. Meta-analysis was performed via a gene-based approach. Quantitative PCR (qPCR) was employed to validate novel significant loci. Results In our meta-analysis, two genes containing CNVs with exonic overlap reached genome-wide significance threshold of meta P value < 9.4 × 10^-6 for deletions and 7.5 × 10^-6 for duplications. We observed significant overlap between risk CNV loci across cohorts. In addition, we identified novel significant associations of DOCK8/KANK1 duplications (meta P value = 7.5 × 10^-7) across all cohorts, and further validated the CNV region with qPCR. Conclusions In the first large scale meta-analysis of CNVs across multiple neurodevelopmental/psychiatric diseases, we uncovered novel significant associations of structural variants in the locus of DOCK8/KANK1 shared by five diseases, suggesting common etiology of these clinically distinct neurodevelopmental conditions.

Journal Article
