Catalogue Search | MBRL

A Simplified Quantum Walk Model for Predicting Missing Links of Complex Networks

by Kaoru Hirota , Wen Liang , Abdullah M. Iliyasu in Accuracy , Algorithms , Analysis

2022

Prediction of missing links is an important part of many applications, such as friend recommendations on social media, reducing the economic cost of protein functional module mining, and accurate recommendations on shopping platforms. However, existing algorithms for predicting missing links fall short in both accuracy and efficiency. To address these shortcomings, we propose a simplified quantum walk model whose Hilbert space dimension is only twice the number of nodes in a complex network. This property facilitates simultaneous consideration of the self-loop of each node and the common neighbour information between arbitrary pairs of nodes. Together, these reduce the negative effect of interference in quantum walks while also recording the similarity between each node and its neighbours. Consequently, the observed probability after a two-step walk is used as the score of each candidate missing link, which avoids extensive computation. Using the AUC index as a performance metric, the proposed model records the highest average accuracy in the prediction of missing links compared to 14 competing algorithms in nine real complex networks. Furthermore, experiments using the precision index show that our proposed model ranks in the first echelon in predicting missing links. These results indicate the potential of our simplified quantum walk model for applications in network alignment and functional module mining of protein–protein networks.
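The AUC index mentioned in this abstract is commonly computed in the link prediction literature by comparing the score of each held-out (missing) link with the score of each nonexistent link. The sketch below illustrates that calculation only. It does not implement the quantum walk model: a plain common-neighbour count is swapped in as a stand-in scoring function, and the toy graph, the held-out link, and all function names are hypothetical.

```python
from itertools import combinations

def common_neighbour_score(adj, u, v):
    """Stand-in similarity score: number of shared neighbours (NOT the quantum walk score)."""
    return len(adj[u] & adj[v])

def auc(missing_scores, nonexistent_scores):
    """AUC = (wins + 0.5 * ties) / comparisons, over all missing/nonexistent pairs."""
    wins = ties = 0
    for m in missing_scores:
        for n in nonexistent_scores:
            if m > n:
                wins += 1
            elif m == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(missing_scores) * len(nonexistent_scores))

# Hypothetical toy network: observed links, plus one link held out as "missing".
observed = {(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)}
held_out = {(0, 3)}
nodes = range(5)

# Adjacency sets built from the observed links only.
adj = {n: set() for n in nodes}
for u, v in observed:
    adj[u].add(v)
    adj[v].add(u)

# Every node pair that is neither observed nor held out is a nonexistent link.
nonexistent = [p for p in combinations(nodes, 2)
               if p not in observed and p not in held_out]

missing_scores = [common_neighbour_score(adj, u, v) for u, v in held_out]
nonexistent_scores = [common_neighbour_score(adj, u, v) for u, v in nonexistent]
toy_auc = auc(missing_scores, nonexistent_scores)
```

An AUC of 0.5 corresponds to random scoring, and values closer to 1 mean held-out links tend to receive higher scores than nonexistent ones.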

Journal Article


Predicting potential drug-drug interactions by integrating chemical, biological, phenotypic and network data

by Chen, Yanlin , Li, Xiaohong , Liu, Feng in Algorithms , Bioinformatics , Biomedical and Life Sciences

2017

Background: Drug-drug interactions (DDIs) are one of the major concerns in drug discovery. Accurate prediction of potential DDIs can help to reduce unexpected interactions over the entire lifecycle of drugs, and is important for drug safety surveillance. Results: Since many DDIs are not detected or observed in clinical trials, this work aims to predict unobserved or undetected DDIs. In this paper, we collect a variety of drug data that may influence drug-drug interactions, i.e., drug substructure data, drug target data, drug enzyme data, drug transporter data, drug pathway data, drug indication data, drug side effect data, drug off side effect data and known drug-drug interactions. We adopt three representative methods (the neighbor recommender method, the random walk method and the matrix perturbation method) to build prediction models based on different data, and thereby evaluate the usefulness of different information sources for DDI prediction. Further, we present flexible frameworks for integrating different models with suitable ensemble rules, including a weighted average ensemble rule and a classifier ensemble rule, and develop ensemble models to achieve better performance. Conclusions: The experiments demonstrate that different data sources provide diverse information, and that the DDI network based on known DDIs is one of the most important sources of information for DDI prediction. The ensemble methods perform better than individual methods, and outperform existing state-of-the-art methods. The datasets and source code are available at https://github.com/zw9977129/drug-drug-interaction/.
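As a rough illustration of a weighted average ensemble rule like the one named in this abstract, the sketch below combines per-pair scores from several models into one score. It is a minimal sketch under stated assumptions, not the authors' implementation: the drug identifiers, scores, weights, and the function name `weighted_average_ensemble` are all hypothetical, and the abstract does not say how the weights are chosen.

```python
def weighted_average_ensemble(score_maps, weights):
    """Combine several {pair: score} maps into one map of weighted-average scores.

    Assumes every map scores the same set of drug pairs and that the
    weights are non-negative with a positive sum.
    """
    total = sum(weights)
    pairs = score_maps[0].keys()
    return {
        pair: sum(w * scores[pair] for w, scores in zip(weights, score_maps)) / total
        for pair in pairs
    }

# Hypothetical scores from two models built on different data sources.
substructure_model = {("d1", "d2"): 0.8, ("d1", "d3"): 0.2}
known_ddi_model = {("d1", "d2"): 0.4, ("d1", "d3"): 0.6}

# Hypothetical weights favouring the first model three to one.
combined = weighted_average_ensemble([substructure_model, known_ddi_model], [3, 1])
```

Dividing by the sum of the weights keeps the combined score on the same scale as the inputs, so the weights only need to express relative trust in each model.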

Journal Article


Missing link prediction and spurious link detection based on attractive force and community

by Chen, Wei , Chi, Kuo , Qu, Hui in Algorithms , Diffusion , Information dissemination

2021

With the rapid development of the Internet and information technology, networks have become an important medium of information diffusion worldwide. Given the increasing scale of network data, ensuring the completeness and accuracy of the links obtained from networks has become an urgent problem. Unlike most traditional link prediction methods, which focus only on missing links, the novel link prediction approach proposed in this paper handles both missing links and spurious links in networks. First, we define the attractive force for any pair of nodes to denote the strength of the relation between them. Then, all the nodes are divided into communities according to their degrees and the attractive force on them. Next, we define the connection probability for each pair of unconnected nodes to measure how likely they are to be connected; missing links can then be predicted by calculating and comparing the connection probabilities of all pairs of unconnected nodes. Moreover, we define the break probability for each pair of connected nodes to measure how likely their link is to be broken; spurious links can then be detected by calculating and comparing the break probabilities of all pairs of connected nodes. To verify the validity of the proposed approach, we conduct experiments on several real-world networks. The results show that the proposed approach achieves higher prediction accuracy and more stable performance than some existing methods.

Journal Article


MSGWO-MKL-SVM: A Missing Link Prediction Method for UAV Swarm Network Based on Time Series

by Zhou, Xin , Zhang, Jie , Wang, Tao in Algorithms , Assaults , Connectivity

2022

Missing link prediction (MLP) has long been an active research area in the field of complex networks, and it has recently been used extensively in UAV swarm network reconstruction. A UAV swarm is an artificial network with strong randomness, on which prediction methods based on network similarity often perform poorly. To address this problem, this paper proposes a Multiple Kernel Learning algorithm with a multi-strategy grey wolf optimizer based on time series (MSGWO-MKL-SVM). The Multiple Kernel Learning (MKL) method is adopted in this algorithm to extract the advanced features of time series, and the Support Vector Machine (SVM) algorithm is used to determine the hyperplane of threshold value in nonlinear high dimensional space. In addition, we propose a new cluster-based measurable indicator for Multiple Kernel Learning, transforming a Multiple Kernel Learning problem into a multi-objective optimization problem. Some adaptive neighborhood strategies are used to enhance the global searching ability of the grey wolf optimizer (GWO) algorithm. Comparison experiments were conducted on the standard UCI datasets and the professional UAV swarm datasets. The classification accuracy of MSGWO-MKL-SVM on UCI datasets is improved by 6.2% on average, and the link prediction accuracy of MSGWO-MKL-SVM on professional UAV swarm datasets is improved by 25.9% on average.

Journal Article


Finding missing links in interaction networks

by Terry, J. Christopher D. , Lewis, Owen T. in Biota , bipartite network , Computer simulation

2020

Documenting which species interact within ecological communities is challenging and labor intensive. As a result, many interactions remain unrecorded, potentially distorting our understanding of network structure and dynamics. We test the utility of four structural models and a new coverage-deficit model for predicting missing links in both simulated and empirical bipartite networks. We find they can perform well, although the predictive power of structural models varies with the underlying network structure. The accuracy of predictions can be improved by ensembling multiple models. Augmenting observed networks with most likely missing links improves estimates of qualitative network metrics. Tools to identify likely missing links can be simple to implement, allowing the prioritization of research effort and more robust assessment of network properties.

Journal Article


A new perspective for revealing ‘hidden’ interactions in ecological networks

by Lemke, Peter , Habedank, Joel , Sidorenko, Vera in Algorithms , biomonitoring , Connectivity

2025

Ecological network models are essential for developing and quantifying ecosystem‐based management strategies. Unobserved species interactions alter the interpretation of structural and functional characteristics of the ecosystem being studied. Link prediction algorithms can help to identify such unobserved, ‘hidden’ interactions. However, due to general unfamiliarity and insufficient ecological interpretations, the use of link prediction algorithms in ecology remains limited. In this study, we enhance the link prediction applicability in ecological networks by considering and quantifying the algorithm results from the link as well as the node perspective using a coastal food web model from the northern Wadden Sea as a case study. For this purpose, we have defined the Weighted Unobserved Node Connectivity (WUNC) representing a new node property. The WUNC facilitates the estimation of the missing connectivity of a species in relation to a considered original source network. Such a new combination of both link and node perspectives helps to uncover unobserved interactions as well as the resulting lack of species connectivity in poorly understood environments without active sampling. The bi‐dimensional perspective presented in this study provides a more effective use of link prediction algorithms to identify and prioritize under‐connected species and their unobserved interactions. This enables the design of more targeted, species‐specific measurement campaigns to validate predicted interactions, thereby supporting refinements of existing ecological network models. A more comprehensive representation of interactions in ecological network models contributes to more accurate modelling results and improves their interpretation to support better management strategies in times of environmental changes.

Journal Article


Median-KNN Regressor-SMOTE-Tomek Links for Handling Missing and Imbalanced Data in Air Quality Prediction

by Resti, Yulia , Chandra, Winoto , Suprihatin, Bambang in Air pollution , Air quality , Air quality indexes

2023

The Air Quality Index (AQI) dataset contains measurements of pollutants and ambient air quality conditions at a given location that can be used to predict air quality. Unfortunately, this dataset often has many missing observations and imbalanced classes. Both of these problems can affect the performance of the prediction model. In particular, predictions for the minority class are very important because inaccurate predictions can be fatal or cause large losses. Moreover, the missing data may lead to biased results. This paper proposes single imputation with the median for variables with at most 10% missing values, and multiple imputation with the k-Nearest Neighbor (KNN) regressor for variables with more than 10% missing values. At the same time, SMOTE-Tomek Links addresses the class imbalance. These proposed approaches to handling both issues are then used to assess air quality prediction on the India AQI dataset using Naive Bayes (NB), KNN, and C4.5. The five treatments show that the proposed Median-KNN regressor-SMOTE-Tomek Links method is able to improve the performance of the India air quality prediction model. In other words, the proposed method succeeds in overcoming the problems of missing values and class imbalance.
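The 10% rule described in this abstract can be sketched as a per-column decision: fill with the median when the missing fraction is at most 10%, and otherwise fill from the nearest neighbours. This is a simplified sketch, not the paper's procedure. It assumes a single fully observed predictor column, performs one KNN pass rather than multiple imputations, and the function name, the default `k`, and the sample values are hypothetical.

```python
from statistics import median

def impute_column(target, predictor, k=2, threshold=0.10):
    """Fill None entries in `target`.

    Missing fraction <= threshold: use the median of the observed values.
    Otherwise: use the mean target value of the k rows whose `predictor`
    value is closest (a basic KNN regressor on one feature).
    Assumes at least one value in `target` is observed.
    """
    missing = [i for i, v in enumerate(target) if v is None]
    observed = [i for i, v in enumerate(target) if v is not None]
    filled = list(target)
    if not missing:
        return filled

    if len(missing) / len(target) <= threshold:
        fill_value = median(target[i] for i in observed)
        for i in missing:
            filled[i] = fill_value
    else:
        for i in missing:
            # Rank observed rows by distance in the predictor feature.
            nearest = sorted(observed, key=lambda j: abs(predictor[j] - predictor[i]))[:k]
            filled[i] = sum(target[j] for j in nearest) / len(nearest)
    return filled

# 1 of 10 values missing (10%): median imputation is used.
low_missing = impute_column([1, 2, 3, 4, None, 6, 7, 8, 9, 10], list(range(10)))

# 2 of 4 values missing (50%): KNN imputation is used.
high_missing = impute_column([1.0, None, 3.0, None], [1, 2, 3, 4])
```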

Journal Article


Normalized L3-based link prediction in protein–protein interaction networks

by Jansson, J , Yuen, HY in Algorithms , Bioinformatics , Biology (General)

2023

Background: Protein–protein interaction (PPI) data is an important type of data used in functional genomics. However, high-throughput experiments are often insufficient to complete the PPI interactome of different organisms. Computational techniques are thus used to infer missing data, with link prediction being one such approach that uses the structure of the network of PPIs known so far to identify non-edges whose addition to the network would make it more sound, according to some underlying assumptions. Recently, a new idea called the L3 principle introduced biological motivation into PPI link predictions, yielding predictors that are superior to general-purpose link predictors for complex networks. Interestingly, the L3 principle can be interpreted in another way, so that other signatures of PPI networks can also be characterized for PPI predictions. This alternative interpretation uncovers candidate PPIs that the current L3-based link predictors may not be able to fully capture, underutilizing the L3 principle. Results: In this article, we propose a formulation of link predictors that we call NormalizedL3 (L3N) which addresses certain missing elements within L3 predictors in the perspective of network modeling. Our computational validations show that the L3N predictors are able to find missing PPIs more accurately (in terms of true positives among the predicted PPIs) than the previously proposed methods on several datasets from the literature, including BioGRID, STRING, MINT, and HuRI, at the cost of using more computation time in some of the cases. In addition, we found that L3-based link predictors (including L3N) ranked a different pool of PPIs higher than the general-purpose link predictors did. This suggests that different types of PPIs can be predicted based on different topological assumptions, and that even better PPI link predictors may be obtained in the future by improved network modeling.
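For readers unfamiliar with the L3 principle, the sketch below shows the degree-normalized count of length-3 paths that is commonly used as the baseline L3 score for a candidate pair. It is not the L3N predictor proposed in this article, whose formula the abstract does not give. The toy network and the function name `l3_score` are hypothetical, and the code assumes an undirected graph without self-loops and a pair of nodes that are not already linked.

```python
from math import sqrt

def l3_score(adj, x, y):
    """Sum over all paths x-u-v-y of 1 / sqrt(deg(u) * deg(v)).

    Assumes x and y are not adjacent and the graph has no self-loops,
    so every walk counted here is a simple path of length 3.
    """
    score = 0.0
    for u in adj[x]:
        for v in adj[u]:
            if v in adj[y]:
                score += 1.0 / sqrt(len(adj[u]) * len(adj[v]))
    return score

# Hypothetical toy network: the path 0 - 1 - 2 - 3.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}

score_03 = l3_score(adj, 0, 3)  # one length-3 path through nodes of degree 2
score_02 = l3_score(adj, 0, 2)  # no length-3 path between these nodes
```

The degree normalization reduces the contribution of paths that run through hubs, which would otherwise dominate a raw path count.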

Journal Article


Conformal link prediction for false discovery rate control

by Marandon, Ariane in Classification , Food chains , Graph theory

2024

Most link prediction methods return estimates of the connection probability of missing edges in a graph. Such output can be used to rank the missing edges from most to least likely to be a true edge, but does not directly provide a classification into true and nonexistent. In this work, we consider the problem of identifying a set of true edges with a control of the false discovery rate (FDR). We propose a novel method based on high-level ideas from the literature on conformal inference. The graph structure induces intricate dependence in the data, which we carefully take into account, as this makes the setup different from the usual setup in conformal inference, where data exchangeability is assumed. The FDR control is empirically demonstrated for both simulated and real data.
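To make the idea of selecting a set of edges under false discovery rate control concrete, the sketch below applies the standard Benjamini-Hochberg step-up rule to a list of per-edge p-values. The abstract does not state that the proposed method uses this rule, so it appears here only as a generic illustration of FDR-controlled selection. The p-values are hypothetical, and how valid p-values would be obtained from a graph with dependent data is exactly the difficulty the article addresses and is not shown.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Return the indices selected by the Benjamini-Hochberg step-up rule.

    Finds the largest rank k such that the k-th smallest p-value is at most
    alpha * k / m, then selects the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])

# Hypothetical p-values for four candidate edges.
selected = benjamini_hochberg([0.01, 0.04, 0.03, 0.5], alpha=0.1)
```

The rule compares each sorted p-value with a threshold that grows with its rank, so it returns a set of edges rather than a ranking alone.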

Journal Article


Link prediction in heterogeneous data via generalized coupled tensor factorization

by Ermiş, Beyza , Acar, Evrim , Cemgil, A. Taylan in Arrays , Artificial Intelligence , Chemistry and Earth Sciences

2015

This study deals with missing link prediction, the problem of predicting the existence of missing connections between entities of interest. We approach the problem as filling in missing entries in a relational dataset represented by several matrices and multiway arrays, which will simply be called tensors. Consequently, we address the link prediction problem by data fusion formulated as simultaneous factorization of several observation tensors where latent factors are shared among the observations. Previous studies on joint factorization of such heterogeneous datasets have focused on a single loss function (mainly squared Euclidean distance or Kullback–Leibler divergence) and specific tensor factorization models (CANDECOMP/PARAFAC and/or Tucker). In this paper, however, we use the generalized coupled tensor factorization framework to study various alternative tensor models and loss functions, including those already studied in the literature. Through extensive experiments on two real-world datasets, we demonstrate that (i) joint analysis of data from multiple sources via coupled factorization significantly improves link prediction performance, (ii) selection of a suitable loss function and tensor factorization model is crucial for accurate missing link prediction, and loss functions that have not been studied for link prediction before may outperform the commonly used ones, (iii) joint factorization of datasets can handle difficult cases, such as the cold start problem that arises when a new entity enters the dataset, and (iv) our approach is scalable to large-scale data.

Journal Article

