Catalogue Search | MBRL

DTiGEMS+: drug–target interaction prediction using graph embedding, graph mining, and similarity-based techniques

by Gao, Xin , Thafar, Maha A. , Ashoor, Haitham in Algorithms , Analogies , Associations

2020

In silico prediction of drug–target interactions is a critical phase in the sustainable drug development process, especially when the research focus is to capitalize on the repositioning of existing drugs. However, developing such computational methods is not an easy task, but is much needed, as current methods that predict potential drug–target interactions suffer from high false-positive rates. Here we introduce DTiGEMS+, a computational method that predicts Drug–Target interactions using Graph Embedding, graph Mining, and Similarity-based techniques. DTiGEMS+ combines similarity-based as well as feature-based approaches, and models the identification of novel drug–target interactions as a link prediction problem in a heterogeneous network. DTiGEMS+ constructs the heterogeneous network by augmenting the known drug–target interactions graph with two complementary graphs, namely drug–drug similarity and target–target similarity. DTiGEMS+ combines different computational techniques to provide the final drug–target prediction; these techniques include graph embeddings, graph mining, and machine learning. DTiGEMS+ integrates multiple drug–drug similarities and target–target similarities into the final heterogeneous graph construction after applying a similarity selection procedure as well as a similarity fusion algorithm. Using four benchmark datasets, we show that DTiGEMS+ substantially improves prediction performance compared to other state-of-the-art in silico methods developed to predict drug–target interactions by achieving the highest average AUPR across all datasets (0.92), which reduces the error rate by 33.3% relative to the second-best performing model in the state-of-the-art methods comparison.
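The 33.3% figure can be checked with a little arithmetic, assuming (the abstract does not say so explicitly) that "error rate" means 1 − AUPR. Under that assumption, the sketch below computes a relative error reduction and back-solves for the AUPR that the second-best method would need for the stated reduction to hold. The function names are illustrative and do not come from the paper.

```python
def relative_error_reduction(aupr_best, aupr_other):
    """Relative reduction in error, taking error = 1 - AUPR (an assumption)."""
    err_best = 1.0 - aupr_best
    err_other = 1.0 - aupr_other
    return (err_other - err_best) / err_other


def implied_other_aupr(aupr_best, reduction):
    """AUPR the comparison method would need for the stated reduction to hold."""
    err_best = 1.0 - aupr_best
    return 1.0 - err_best / (1.0 - reduction)


# 0.92 and one third are the figures quoted in the abstract.
implied = implied_other_aupr(0.92, 1.0 / 3.0)
check = relative_error_reduction(0.92, implied)
```

With these inputs the implied second-best AUPR comes out at roughly 0.88; that value is a consequence of the assumed definition, not a number reported in the abstract.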

Journal Article

Inverse similarity and reliable negative samples for drug side-effect prediction

by Lan, Chaowang , Li, Jinyan , Ghosh, Shameek in Adverse drug reactions , Algorithms , Bioinformatics

2019

Background: In silico prediction of potential drug side-effects is of crucial importance for drug development, since wet experimental identification of drug side-effects is expensive and time-consuming. Existing computational methods mainly focus on leveraging validated drug side-effect relations for the prediction. The performance is severely impeded by the lack of reliable negative training data. Thus, a method to select reliable negative samples becomes vital for improving performance. Methods: Most of the existing computational prediction methods are essentially based on the assumption that similar drugs are inclined to share the same side-effects, which has given rise to remarkable performance. It is also rational to assume the inverse proposition that dissimilar drugs are less likely to share the same side-effects. Based on this inverse similarity hypothesis, we proposed a novel method to select highly-reliable negative samples for side-effect prediction. The first step of our method is to build a drug similarity integration framework to measure the similarity between drugs from different perspectives. This step integrates drug chemical structures, drug target proteins, drug substituents, and drug therapeutic information as features into a unified framework. Then, a similarity score between each candidate negative drug and validated positive drugs is calculated using the similarity integration framework. Those candidate negative drugs with lower similarity scores are preferentially selected as negative samples. Finally, both the validated positive drugs and the selected highly-reliable negative samples are used for predictions. Results: The performance of the proposed method was evaluated on simulative side-effect prediction of 917 DrugBank drugs, in comparison with four machine-learning algorithms. Extensive experiments show that the drug similarity integration framework has superior capability in capturing drug features, achieving much better performance than those based on a single type of drug property. In addition, the four machine-learning algorithms achieved significant improvement in macro-averaging F1-score (e.g., SVM from 0.655 to 0.898), macro-averaging precision (e.g., RBF from 0.592 to 0.828) and macro-averaging recall (e.g., KNN from 0.651 to 0.772), attributed to the highly-reliable negative samples selected by the proposed method. Conclusions: The results suggest that the inverse similarity hypothesis and the integration of different drug properties are valuable for side-effect prediction. The selection of highly-reliable negative samples can also make significant contributions to the performance improvement.
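The negative-sample selection step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes an integrated similarity score is already available for every candidate/positive drug pair, and it aggregates those scores with a plain mean, which the abstract does not specify. The drug identifiers and scores are made up.

```python
def select_reliable_negatives(similarity, candidates, positives, k):
    """Return the k candidates least similar to the validated positive drugs.

    similarity maps frozenset({drug_a, drug_b}) to a score in [0, 1].
    Averaging over the positives is an assumption made for this sketch.
    """
    def mean_similarity(candidate):
        scores = [similarity[frozenset((candidate, p))] for p in positives]
        return sum(scores) / len(scores)

    # Lowest mean similarity first: the most dissimilar drugs are preferred.
    ranked = sorted(candidates, key=mean_similarity)
    return ranked[:k]


# Toy data with hypothetical drug names and scores.
sim = {
    frozenset(("d1", "p1")): 0.9, frozenset(("d1", "p2")): 0.8,
    frozenset(("d2", "p1")): 0.1, frozenset(("d2", "p2")): 0.2,
    frozenset(("d3", "p1")): 0.5, frozenset(("d3", "p2")): 0.4,
}
negatives = select_reliable_negatives(sim, ["d1", "d2", "d3"], ["p1", "p2"], k=2)
```

Here `d2` and `d3` are picked because their average similarity to the positives is lowest, while `d1`, which closely resembles the positives, is left out of the negative set.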

Journal Article

ISCMF: Integrated similarity-constrained matrix factorization for drug–drug interaction prediction

by Rohani, Narjes , Katanforoush, Ali , Eslahchi, Changiz in Applications of Graph Theory and Complex Networks , Bioinformatics , Computational Biology/Bioinformatics

2020

Drug–drug interaction (DDI) prediction provides substantial information for drug discovery. As the exact prediction of DDIs can reduce human health risk, the development of an accurate method to solve this problem is quite significant. Despite numerous studies in the field, a considerable number of DDIs are not yet identified. In the current study, we used Integrated Similarity-constrained matrix factorization (ISCMF) to predict DDIs. Eight similarities were calculated based on the drug substructure, targets, side effects, off-label side effects, pathways, transporters, enzymes, and indication data as well as Gaussian interaction profile for the drug pairs. Subsequently, a non-linear similarity fusion method was used to integrate multiple similarities and make them more informative. Finally, we employed ISCMF, which projects drugs in the interaction space into a low-rank space to obtain new insights into DDIs. Although all parts of ISCMF have been proposed in previous studies, our novelty lies in combining them and applying them in the DDI prediction context. We compared ISCMF with several state-of-the-art methods. The results show that it achieved more appropriate results in five-fold cross-validation. It improves AUPR and F-measure to 10% and 18%, respectively. For further validation, we performed case studies on numerous interactions predicted by ISCMF with high probability, most of which were validated by reliable databases. Our results provide support for the notion that ISCMF might be used unequivocally as a powerful method for predicting the unknown DDIs. The data and implementation of ISCMF are available at https://github.com/nrohani/ISCMF .
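One of the similarities listed above, the Gaussian interaction profile, is easy to illustrate. The sketch below uses the commonly cited formulation, in which each drug is represented by its row of the binary interaction matrix and the kernel bandwidth is normalised by the mean squared norm of those rows. Whether ISCMF uses exactly this normalisation is an assumption; `gamma_prime` and the toy matrix are illustrative.

```python
import math


def gip_kernel(interactions, gamma_prime=1.0):
    """Gaussian interaction profile similarity between rows of a 0/1 matrix.

    Assumes at least one non-zero entry, otherwise the bandwidth is undefined.
    """
    n = len(interactions)
    # Normalise the bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in interactions) / n
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(interactions[i], interactions[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy symmetric DDI matrix for three hypothetical drugs.
profiles = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
K = gip_kernel(profiles)
```

Drugs with identical interaction profiles get similarity 1, and the score decays as the profiles diverge, so the kernel captures topological information from the known interaction network alone.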

Journal Article

Drug–target interaction prediction through fine-grained selection and bidirectional random walk methodology

by Wang, YaPing , Yin, ZhiXiang in 631/114 , 631/154 , Algorithms

2024

The study of drug–target interaction (DTI) plays an important role in the process of drug development. DTI prediction has advanced significantly in the last several years, yielding numerous significant research findings and methodologies. Heterogeneous data sources provide richer information and comprehensive perspectives for drug–target interaction prediction, so many existing methods rely on heterogeneous networks, and graph embedding has become an important technique for extracting information from heterogeneous networks. These approaches, however, are less concerned with potentially noisy information in heterogeneous networks and more focused on the extent of information extraction in those networks. Based on this, a DTI prediction network model called FBRWPC is proposed in this paper. It first uses a fine-grained similarity selection procedure to integrate the similarity networks and then applies a bidirectional random walk with restart graph embedding learning method to obtain an updated drug–target interaction matrix. Through fine-grained similarity selection and similarity integration, the framework can effectively filter out the noise present in heterogeneous networks and enhance the model's prediction performance. The experimental findings demonstrate that FBRWPC retains strong prediction performance across four distinct types of data sets, a sign of the model's resilience and good generalization.
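For readers unfamiliar with the random walk with restart mentioned above, here is a minimal sketch of the basic, one-directional version on a single network. It is not the bidirectional variant used by FBRWPC; the restart probability, tolerance, and toy graph are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seed, restart=0.3,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence."""
    n = len(adjacency)
    # Column-normalise so each column is a distribution over neighbours.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    W = [[adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0
          for j in range(n)] for i in range(n)]
    p0 = [0.0] * n
    p0[seed] = 1.0
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(W[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 - 3 (hypothetical nodes).
graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(graph, seed=0)
```

The resulting scores decay with distance from the seed node, which is what makes them usable as a proximity measure between nodes of a network.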

Journal Article

Hierarchical feature similarity integration method for data sets based on deep learning

by Xiaoli, Yu in Data Hierarchy , Datasets , Deep learning

2021

Under the background of the rapid rise of open-source software and the gradual popularization of various software development tools, a large amount of development activity data has been accumulated on the Internet. In the process of using these data to construct data sets, due to their poor traceability and narrow application scope, the quality of data in development activities is not high and the accuracy of analysis results is not high. The application of the hierarchical feature similarity integration method of data sets can make the multi-version and multi-level development smoother and more orderly. In this paper, a hierarchical feature similarity integration method based on hierarchical deep learning is proposed for data sets. Firstly, the dynamic mesh partitioning method is used to divide the sparse and dense regions in the space, which reduces the scale of data detection and shortens the execution time of detection. Then, through the hierarchical deep learning process, the professional knowledge and the distribution information of data attribute value are fused to realize the detection of discrete data in the database. Experimental results show that this method can accurately complete the detection of discrete data in the database in a relatively short time, and has more application advantages than traditional methods.

Journal Article

DTiGNN: Learning drug-target embedding from a heterogeneous biological network based on a two-level attention-based graph neural network

by Rayan, Arockia Xavier Annie , Muniyappan, Saranya , Varrieth, Geetha Thekkumpurath in Algorithms , Area Under Curve , Artificial neural networks

2023

Motivation: In vitro experiment-based drug-target interaction (DTI) exploration demands more human, financial and data resources. In silico approaches have been recommended for predicting DTIs to reduce time and cost. During the drug development process, one can analyze the therapeutic effect of the drug for a particular disease by identifying how the drug binds to the target for treating that disease. Hence, DTI plays a major role in drug discovery. Many computational methods have been developed for DTI prediction. However, the existing methods have limitations in terms of capturing the interactions via multiple semantics between drug and target nodes in a heterogeneous biological network (HBN). Methods: In this paper, we propose a DTiGNN framework for identifying unknown drug-target pairs. The DTiGNN first calculates the similarity between the drug and target from multiple perspectives. Then, the features of drugs and targets from each perspective are learned separately by using a novel method termed an information entropy-based random walk. Next, all of the learned features from different perspectives are integrated into a single drug and target similarity network by using a multi-view convolutional neural network. Using the integrated similarity networks, drug interactions, drug-disease associations, protein interactions and protein-disease association, the HBN is constructed. Next, a novel embedding algorithm called a meta-graph guided graph neural network is used to learn the embedding of drugs and targets. Then, a convolutional neural network is employed to infer new DTIs after balancing the sample using oversampling techniques. Results: The DTiGNN is applied to various datasets, and the result shows better performance in terms of the area under receiver operating characteristic curve (AUC) and area under precision-recall curve (AUPR), with scores of 0.98 and 0.99, respectively. There are 23,739 newly predicted DTI pairs in total.

Journal Article

Inferring non-synonymous single-nucleotide polymorphisms-disease associations via integration of multiple similarity networks

by Yang, Silu , Jiang, Rui , Wu, Jiaxin in Algorithms , Area Under Curve , binary classification problem

2014

Detecting associations between human genetic variants and their phenotypic effects is a significant problem in understanding genetic bases of human-inherited diseases. The focus is on a typical type of genetic variants called non-synonymous single nucleotide polymorphisms (nsSNPs), whose occurrence may potentially alter the structures of proteins, affecting functions of proteins, and thereby causing diseases. Most of the existing methods predict associations between nsSNPs and diseases based on features derived from only protein sequence and/or structure information, and give no information about which specific disease an nsSNP is associated with. To cope with these problems, the identification of nsSNPs that are associated with a specific disease from a set of candidate nsSNPs as a binary classification problem has been formulated. A new approach has been adopted for predicting associations between nsSNPs and diseases based on multiple nsSNP similarity networks and disease phenotype similarity networks. With a series of comprehensive validation experiments, it has been demonstrated that the proposed method is effective in both recovering the nsSNP-disease associations and inferring suspect disease-associated nsSNPs for both diseases with known genetic bases and diseases of unknown genetic bases.

Journal Article

Data integration by fuzzy similarity-based hierarchical clustering

by Nardone, Davide , Staiano, Antonino , Ciaramella, Angelo in Agglomeration , Algorithms , Bioinformatics

2020

Background: High-throughput methods in biological and biomedical fields acquire a large number of molecular parameters or omics data in a single experiment. Combining these omics data can significantly increase the capability for recovering fine-tuned structures or reducing the effects of experimental and biological noise in data. Results: In this work we propose a multi-view integration methodology (named FH-Clust) for identifying patient subgroups from different omics information (e.g., Gene Expression, miRNA Expression, Methylation). In particular, hierarchical structures of patient data are obtained in each omic (or view) and finally their topologies are merged by a consensus matrix. One of the main aspects of this methodology is the use of a measure of dissimilarity between sets of observations, by using an appropriate metric. For each view, a dendrogram is obtained by using a hierarchical clustering based on a fuzzy equivalence relation with Łukasiewicz-valued fuzzy similarity. Finally, a consensus matrix, which is representative of all dendrograms, is formed by combining multiple hierarchical agglomerations by an approach based on transitive consensus matrix construction. Several experiments and comparisons are made on real data (e.g., Glioblastoma, Prostate Cancer) to assess the proposed approach. Conclusions: Fuzzy logic allows us to introduce more flexible data agglomeration techniques. From the analysis of scientific literature, it appears to be the first time that a model based on fuzzy logic is used for the agglomeration of multi-omic data. The results suggest that FH-Clust provides better prognostic value and clinical significance compared to the analysis of single-omic data alone and it is very competitive with respect to other techniques from literature.
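As a rough illustration of Łukasiewicz-valued fuzzy similarity, the sketch below uses the standard Łukasiewicz biresiduum, 1 − |a − b|, on features scaled to [0, 1] and averages it across features. The averaging step and the toy patient vectors are assumptions; the exact construction in FH-Clust may differ.

```python
def lukasiewicz_tnorm(a, b):
    """Lukasiewicz t-norm on [0, 1]."""
    return max(0.0, a + b - 1.0)


def lukasiewicz_similarity(x, y):
    """Mean Lukasiewicz biresiduum, 1 - |a - b|, over features in [0, 1]."""
    return sum(1.0 - abs(a - b) for a, b in zip(x, y)) / len(x)


# Hypothetical patients described by three features scaled to [0, 1].
p1 = [0.2, 0.8, 0.5]
p2 = [0.3, 0.6, 0.5]
p3 = [0.1, 0.9, 0.9]

s12 = lukasiewicz_similarity(p1, p2)
s23 = lukasiewicz_similarity(p2, p3)
s13 = lukasiewicz_similarity(p1, p3)

# A fuzzy equivalence relation is transitive with respect to the t-norm.
transitive = lukasiewicz_tnorm(s12, s23) <= s13 + 1e-12
```

The transitivity check at the end reflects the property that lets such a similarity relation be cut at different levels to yield a nested hierarchy of clusters.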

Journal Article

netDx: interpretable patient classification using integrated patient similarity networks

by Isserlin, Ruth , Kaka, Hussam , Bader, Gary D in Algorithms , Asthma , Asthma - classification

2019

Patient classification has widespread biomedical and clinical applications, including diagnosis, prognosis, and treatment response prediction. A clinically useful prediction algorithm should be accurate, generalizable, be able to integrate diverse data types, and handle sparse data. A clinical predictor based on genomic data needs to be interpretable to drive hypothesis‐driven research into new treatments. We describe netDx, a novel supervised patient classification framework based on patient similarity networks, which meets these criteria. In a cancer survival benchmark dataset integrating up to six data types in four cancer types, netDx significantly outperforms most other machine‐learning approaches across most cancer types. Compared to traditional machine‐learning‐based patient classifiers, netDx results are more interpretable, visualizing the decision boundary in the context of patient similarity space. When patient similarity is defined by pathway‐level gene expression, netDx identifies biological pathways important for outcome prediction, as demonstrated in breast cancer and asthma. netDx can serve as a patient classifier and as a tool for discovery of biological features characteristic of disease. We provide a free software implementation of netDx with automation workflows. Synopsis netDx is a supervised patient classification algorithm based on the paradigm of patient similarity networks. It integrates multi‐omic data and uses biological pathway information to help with model interpretability. In a cancer survival prediction benchmark, netDx performs competitively or better than a diverse panel of machine‐learning algorithms. When patient similarity is defined by pathway‐level gene expression, netDx identifies biological pathways predictive of outcome, as demonstrated in diverse data sets (breast cancer and asthma). netDx is freely available as an R package and as a Docker image. Code, tutorials and worked examples are available at: http://netdx.org . 

Journal Article

Similarity-based prediction method for machinery remaining useful life: A review

by Xu, Zhongbin , Xu, Huangyang , Zhu, Ke in Advanced manufacturing technologies , Computer-Aided Engineering (CAD, CAE) and Design

2022

Determining the remaining useful life (RUL) of increasingly complex machines provides the decision basis for the predictive maintenance process, which effectively ensures equipment safety, improves the utilization rate, and reduces the maintenance cost. Similarity-based prediction (SBP) methods are one type of RUL prediction technique, generally divided into four steps: condition monitoring data collection, degradation information fusion, similarity evaluation, and model prediction aggregation. SBP methods have advantages which include strong interpretability and a simple implementation process. Intensive studies and wide applications based on the SBP methods exist in both academia and industry. SBP methods have been included in numerous reviews, but they mainly focus on the first two steps or just one of the steps. Existing reviews lack recent advances of SBP methods and discussions of the four steps in detail. To fill the above gaps, this paper reviewed the whole procedure of SBP methods. Firstly, the prognostics industrial scenarios with limited failure data and sufficient failure data are introduced. Then, the degradation indicators (DIs) of the machines are constructed through a fusion of degradation information. Later, similarity calculation and similarity matching rule are utilized to evaluate the similarity of the DIs segments. After that, point estimation and uncertainty management are acquired by integrating the referential DIs segments. Finally, the effectiveness of the SBP methods in different industrial scenarios, the limitations, and future challenges are discussed.
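The last two steps above, similarity evaluation and prediction aggregation, can be sketched as a similarity-weighted average. This is one simple instance under stated assumptions, not a method taken from the review: similarity is computed as exp(−squared Euclidean distance) between degradation-indicator segments of equal length, and the reference library is made up.

```python
import math


def estimate_rul(test_segment, references):
    """Similarity-weighted RUL estimate.

    references is a list of (di_segment, rul_at_segment_end) pairs taken
    from run-to-failure histories; each segment matches test_segment in length.
    """
    weights = []
    for segment, _ in references:
        sq_dist = sum((a - b) ** 2 for a, b in zip(test_segment, segment))
        # Closer segments get exponentially larger weight (an assumed kernel).
        weights.append(math.exp(-sq_dist))
    total = sum(weights)
    return sum(w * rul for w, (_, rul) in zip(weights, references)) / total


# Hypothetical reference library: (DI segment, remaining cycles).
library = [
    ([0.1, 0.2, 0.3], 100.0),
    ([5.0, 6.0, 7.0], 10.0),
]
estimate = estimate_rul([0.1, 0.2, 0.3], library)
```

Because the test segment matches the first reference exactly and is far from the second, the estimate is dominated by the first reference's RUL. In practice the kernel width and the matching rule would need tuning for the machine in question.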

Journal Article
