Catalogue Search | MBRL

A Survey on Knowledge Graph Embeddings for Link Prediction

by Wang, Meihong; Qiu, Linling; Wang, Xiaoli

2021

Knowledge graphs (KGs) have been widely used in artificial intelligence, in areas such as information retrieval, natural language processing, and recommendation systems. However, the open nature of KGs often means that they are incomplete and contain defects. This creates the need to build more complete knowledge graphs to enhance their practical use. Link prediction is a fundamental task in knowledge graph completion that uses existing relations to infer new ones. Numerous methods have been proposed to perform the link-prediction task based on various representation techniques. Among them, KG-embedding models have significantly advanced the state of the art in the past few years. In this paper, we provide a comprehensive survey of KG-embedding models for link prediction in knowledge graphs. We first provide a theoretical analysis and comparison of the methods proposed to date for generating KG embeddings. Then, we investigate several representative models, classified into five categories. Finally, we conduct experiments on two benchmark datasets to report comprehensive findings and provide new insights into the strengths and weaknesses of existing models.

Journal Article

Learning Translation-Based Knowledge Graph Embeddings by N-Pair Translation Loss

by Park, Seong-Bae; Kim, A-Yeong; Song, Hyun-Je in Artificial intelligence, Design and construction, Knowledge

2020

Translation-based knowledge graph embeddings learn vector representations of entities and relations by treating relations as translation operators over the entities in an embedding space. Since the translation is represented through a score function, translation-based embeddings are trained in general by minimizing a margin-based ranking loss, which assigns a low score to positive triples and a high score to negative triples. However, this type of embedding suffers from slow convergence and poor local optima because the loss adopts only one pair of a positive and a negative triple at a single update of learning parameters. Therefore, this paper proposes the N-pair translation loss that considers multiple negative triples at one update. The N-pair translation loss employs multiple negative triples as well as one positive triple and allows the positive triple to be compared against the multiple negative triples at each parameter update. As a result, it becomes possible to obtain better vector representations rapidly. The experimental results on link prediction prove that the proposed loss helps to quickly converge toward good optima at the early stage of training.

Journal Article
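
The contrast drawn in the abstract above, between a margin-based ranking loss on a single positive/negative pair and a loss that compares one positive triple against several negatives, can be illustrated with a minimal sketch. It assumes a TransE-style distance score (lower means more plausible) and a softmax-style multi-negative loss modeled on the general N-pair form; the function names and the exact formula are illustrative assumptions, not the paper's definition of the N-pair translation loss.

```python
import math


def transe_score(head, rel, tail):
    """Translation distance ||head + rel - tail||_2 (lower = more plausible)."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, rel, tail)))


def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss on ONE positive/negative pair, as in margin-based training."""
    return max(0.0, margin + pos_score - neg_score)


def multi_negative_loss(pos_score, neg_scores):
    """Compare one positive against several negatives in a single term.

    Computes log(1 + sum_j exp(pos - neg_j)). This is an assumed,
    illustrative form, not the loss defined in the paper.
    """
    return math.log1p(sum(math.exp(pos_score - s) for s in neg_scores))


# Toy 2-dimensional embeddings (hypothetical values).
head, rel, tail = [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]
corrupted_tails = [[0.0, 1.0], [-1.0, 0.0]]

pos = transe_score(head, rel, tail)
negs = [transe_score(head, rel, t) for t in corrupted_tails]

pair_loss = margin_ranking_loss(pos, negs[0])   # uses one negative only
multi_loss = multi_negative_loss(pos, negs)     # uses all negatives at once
```

In this toy case the pairwise hinge loss is already zero for the chosen negative, so it contributes no gradient, while the multi-negative term still yields a positive loss from every negative.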

Relational Learning Analysis of Social Politics using Knowledge Graph Embedding

by Aljarah Ibrahim; Chan Kit Yan; Al-Tawil, Marwan in Clustering, Credibility, Digital media

2021

Knowledge graphs (KGs) have recently gained considerable attention from both academia and industry. The adoption of graph technology and the abundance of graph datasets have led the research community to build sophisticated graph analytics tools, which has extended the application of KGs to a wide range of real-life problems in diverse domains. Despite the abundance of generic KGs, there is a vital need to construct domain-specific KGs. Further, quality and credibility should be incorporated into the process of constructing and augmenting KGs, particularly those built from mixed-quality resources such as social media data. For example, the volume of political discourse on social media is overwhelming, yet it can be hijacked and misused by spammers to spread misinformation and false news. This paper presents a novel credibility domain-based KG embedding framework. The framework captures a fusion of data from the politics domain, obtained from heterogeneous resources, in a formal KG representation described by a politics domain ontology. The proposed approach makes use of various knowledge-based repositories to enrich the semantics of the textual contents, thereby facilitating the interoperability of information. The proposed framework also embodies a domain-based social credibility module to ensure data quality and trustworthiness. The utility of the proposed framework is verified by means of experiments conducted on two constructed KGs. The KGs are then embedded in a low-dimensional, semantically continuous space using several embedding techniques. The effectiveness of the embedding techniques and the social credibility module is further demonstrated on link prediction, clustering, and visualisation tasks.

Journal Article

A Survey on Knowledge Graph Embedding: Approaches, Applications and Benchmarks

by Xiong, Neal N.; Guo, Wenzhong; Dai, Yuanfei

2020

A knowledge graph (KG), also known as a knowledge base, is a particular kind of network structure in which nodes represent entities and edges represent relations. However, with the explosion of network volume, data sparsity, which makes large-scale KG systems difficult to compute over and manage, has become a more significant problem. To alleviate this issue, knowledge graph embedding has been proposed to embed the entities and relations of a KG into a low-dimensional, dense, and continuous feature space, and to endow the resulting model with abilities of knowledge inference and fusion. In recent years, many researchers have devoted much attention to this approach, and in this paper we systematically introduce the existing state-of-the-art approaches and a variety of applications that benefit from these methods. In addition, we discuss future prospects for the development of techniques and application trends. Specifically, we first introduce the embedding models that leverage only the information of observed triplets in the KG. We illustrate the overall framework and specific ideas, and compare the advantages and disadvantages of such approaches. Next, we introduce the advanced models that use additional semantic information to improve the performance of the original methods. We divide the additional information into two categories: textual descriptions and relation paths. The extension approaches in each category are described following the same classification criteria as those defined for the triplet fact-based models. We then describe two experiments that compare the performance of the listed methods, and mention some broader domain tasks such as question answering and recommender systems. Finally, we collect several hurdles that need to be overcome and provide a few future research directions for knowledge graph embedding.

Journal Article

Improving embedded knowledge graph multi-hop question answering by introducing relational chain reasoning

by Zhao, Biao; Jin, Weiqiang; Liu, Guizhong in Ablation, Actors, Algorithms

2023

Knowledge Graph Question Answering (KGQA) aims to answer user questions from a knowledge graph (KG) by identifying the reasoning relations between the topic entity and the answer. As a complex branch of KGQA, multi-hop KGQA requires reasoning over the multi-hop relational chain preserved in the KG to arrive at the right answer. Despite recent successes, existing work on answering multi-hop complex questions still faces the following challenges: (i) the absence of an explicit relational chain order reflected in the user question stems from a misunderstanding of the user’s intentions; (ii) relation types are captured incorrectly under weak supervision, because datasets lack intermediate reasoning chain annotations due to the expensive labeling cost; (iii) implicit relations between the topic entity and the answer implied in the structured KG are not considered, because of the limited neighborhood size constraint in subgraph retrieval-based algorithms. To address these issues in multi-hop KGQA, we propose a novel model, namely Relational Chain based Embedded KGQA (Rce-KGQA), which simultaneously uses the explicit relational chain revealed in the natural language question and the implicit relational chain stored in the structured KG. Our extensive empirical study on three open-domain benchmarks shows that our method significantly outperforms state-of-the-art counterparts such as GraftNet, PullNet and EmbedKGQA. Comprehensive ablation experiments also verify the effectiveness of our method on the multi-hop KGQA task. We have made our model’s source code available on GitHub: https://github.com/albert-jin/Rce-KGQA.

Journal Article

Advancing drug–target interaction prediction: a comprehensive graph-based approach integrating knowledge graph embedding and ProtBert pretraining

by Ben Yahia, Sadok; Diallo, Gayo; Hermi, Khalil in Algorithms, Amino acids, Analysis

2023

Background: The pharmaceutical field faces a significant challenge in validating drug target interactions (DTIs) due to the time and cost involved, leading to only a fraction being experimentally verified. To expedite drug discovery, accurate computational methods are essential for predicting potential interactions. Recently, machine learning techniques, particularly graph-based methods, have gained prominence. These methods utilize networks of drugs and targets, employing knowledge graph embedding (KGE) to represent structured information from knowledge graphs in a continuous vector space. This phenomenon highlights the growing inclination to utilize graph topologies as a means to improve the precision of predicting DTIs, hence addressing the pressing requirement for effective computational methodologies in the field of drug discovery. Results: The present study presents a novel approach called DTIOG for the prediction of DTIs. The methodology employed in this study involves the utilization of a KGE strategy, together with the incorporation of contextual information obtained from protein sequences. More specifically, the study makes use of Protein Bidirectional Encoder Representations from Transformers (ProtBERT) for this purpose. DTIOG utilizes a two-step process to compute embedding vectors using KGE techniques. Additionally, it employs ProtBERT to determine target–target similarity. Different similarity measures, such as Cosine similarity or Euclidean distance, are utilized in the prediction procedure. In addition to the contextual embedding, the proposed unique approach incorporates local representations obtained from the Simplified Molecular Input Line Entry Specification (SMILES) of drugs and the amino acid sequences of protein targets. Conclusions: The effectiveness of the proposed approach was assessed through extensive experimentation on datasets pertaining to Enzymes, Ion Channels, and G-protein-coupled Receptors. The remarkable efficacy of DTIOG was showcased through the utilization of diverse similarity measures in order to calculate the similarities between drugs and targets. The combination of these factors, along with the incorporation of various classifiers, enabled the model to outperform existing algorithms in its ability to predict DTIs. The consistent observation of this advantage across all datasets underlines the robustness and accuracy of DTIOG in the domain of DTIs. Additionally, our case study suggests that the DTIOG can serve as a valuable tool for discovering new DTIs.

Journal Article
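
The two similarity measures named in the abstract above, cosine similarity and Euclidean distance, can be computed between embedding vectors as in the following minimal sketch. The vectors and function names are hypothetical; how DTIOG itself derives or combines the embeddings is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors; 0.0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 2-dimensional target embeddings.
target_a = [1.0, 0.0]
target_b = [2.0, 0.0]   # same direction as target_a, different length
target_c = [0.0, 3.0]   # orthogonal to target_a

sim_ab = cosine_similarity(target_a, target_b)
sim_ac = cosine_similarity(target_a, target_c)
dist_ab = euclidean_distance(target_a, target_b)
```

Cosine similarity ignores vector length, so `target_a` and `target_b` count as maximally similar even though their Euclidean distance is nonzero; which measure suits a task depends on whether embedding magnitude is meaningful.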

Knowledge graph embedding methods for entity alignment: experimental review

by Christophides, Vassilis; Fanourakis, Nikolaos; Efthymiou, Vasilis in Alignment, Effectiveness, Efficiency

2023

In recent years, we have witnessed the proliferation of knowledge graphs (KGs) in various domains, aiming to support applications such as question answering and recommendations. A frequent task when integrating knowledge from different KGs is to find which subgraphs refer to the same real-world entity, a task widely known as entity alignment. Recently, embedding methods have been used for entity alignment; they learn a vector-space representation of entities that preserves their similarity in the original KGs. A wide variety of supervised, unsupervised, and semi-supervised methods have been proposed that exploit both factual (attribute-based) and structural (relation-based) information of entities in the KGs. Still, a quantitative assessment of their strengths and weaknesses in real-world KGs according to different performance metrics and KG characteristics is missing from the literature. In this work, we conduct the first meta-level analysis of popular embedding methods for entity alignment, based on a statistically sound methodology. Our analysis reveals statistically significant correlations of different embedding methods with various meta-features extracted from KGs, and ranks the methods in a statistically significant way according to their effectiveness across all real-world KGs of our testbed. Finally, we study interesting trade-offs between the methods’ effectiveness and efficiency.

Journal Article

Relational data embeddings for feature enrichment with background information

by Allauzen, Alexandre; Varoquaux, Gaël; Cvetkov-Iliev, Alexis in Artificial Intelligence, Cities, Cognitive tasks

2023

For many machine-learning tasks, augmenting the data table at hand with features built from external sources is key to improving performance. For instance, estimating housing prices benefits from background information on the location, such as the population density or the average income. However, this information must often be assembled across many tables, requiring time and expertise from the data scientist. Instead, we propose to replace human-crafted features by vectorial representations of entities (e.g. cities) that capture the corresponding information. We represent the relational data on the entities as a graph and adapt graph-embedding methods to create feature vectors for each entity. We show that two technical ingredients are crucial: modeling well the different relationships between entities, and capturing numerical attributes. We adapt knowledge graph embedding methods that were primarily designed for graph completion. Yet, they model only discrete entities, while creating good feature vectors from relational data also requires capturing numerical attributes. For this, we introduce KEN: Knowledge Embedding with Numbers. We thoroughly evaluate approaches to enrich features with background information on 7 prediction tasks. We show that a good embedding model coupled with KEN can perform better than manually handcrafted features, while requiring much less human effort. It is also competitive with combinatorial feature engineering methods, but much more scalable. Our approach can be applied to huge databases, creating general-purpose feature vectors reusable in various downstream tasks.

Journal Article

Improving temporal knowledge graph embedding using tensor factorization

by Wei, Jianghong; Zhang, Mengli; He, Peng in Antisymmetry, Complexity, Decomposition

2023

Knowledge graph embedding (KGE) makes it possible to represent the facts of a knowledge graph (KG) in low-dimensional continuous vector spaces. Consequently, it can significantly reduce the complexity of operations performed on the underlying KG, and it has attracted a lot of attention in recent years. However, most KGE approaches have been developed only over static facts and ignore the time attribute. In fact, in some real-world KGs, a fact might be valid only for a specific time interval or point in time. For instance, the fact (Barack Obama, is president of, US, [2009-2017]) is valid only between 2009 and 2017. To address this issue, based on a well-known tensor factorization approach, canonical polyadic decomposition, we propose two new temporal KGE models, called TSimplE and TNTSimplE, that integrate time information in addition to static facts. A non-temporal component is also added to deal with heterogeneous temporal KGs that include both temporal and non-temporal relations. We prove that the proposed models are fully expressive, with a bound on the dimensionality of their embeddings, and that they can incorporate several important types of background knowledge, including symmetry, antisymmetry and inversion. In addition, our models are capable of dealing with two common challenges in real-world temporal KGs, i.e., modeling time intervals and predicting time for facts with missing time information. We conduct extensive experiments on three real-world temporal KGs: ICEWS, YAGO3 and Wikidata. The results indicate that our models achieve state-of-the-art performance with lower time or space complexity.

Journal Article
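
As a rough illustration of the idea in the abstract above, the sketch below shows a canonical polyadic (CP) style score for a static triple and one generic way to make it time-dependent: scaling the relation embedding elementwise by a timestamp embedding. The temporal variant is an assumption chosen for illustration and is not the TSimplE or TNTSimplE formulation; all names and values are hypothetical.

```python
def cp_score(head, rel, tail):
    """CP-style trilinear score: sum_k head[k] * rel[k] * tail[k]."""
    return sum(h * r * t for h, r, t in zip(head, rel, tail))


def temporal_cp_score(head, rel, tail, time):
    """Assumed temporal variant: modulate the relation by a time embedding.

    The relation vector is scaled elementwise by the timestamp vector
    before the usual trilinear product is taken.
    """
    rel_at_time = [r * tau for r, tau in zip(rel, time)]
    return cp_score(head, rel_at_time, tail)


# Hypothetical 2-dimensional embeddings.
head = [1.0, 2.0]
rel = [0.5, 1.0]
tail = [2.0, 1.0]
time_2010 = [1.0, 0.0]   # activates only the first latent dimension
time_2020 = [0.0, 1.0]   # activates only the second latent dimension

static_score = cp_score(head, rel, tail)
score_2010 = temporal_cp_score(head, rel, tail, time_2010)
score_2020 = temporal_cp_score(head, rel, tail, time_2020)
```

With these toy values the same (head, relation, tail) triple receives different scores at different timestamps, which is the property a temporal KGE model needs in order to treat a fact as valid only in a given interval.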

Developing a BERT based triple classification model using knowledge graph embedding for question answering system

by Do Phuc; Phan, Truong H in Algorithms, Classification, Embedding

2022

Current BERT-based question answering systems use a question and a contextual text to find the answer. As a result, the systems return wrong answers or nothing if the text contains content irrelevant to the input question. In addition, the systems cannot yet answer yes-no and aggregate questions, and they concentrate only on the contents of the text, regardless of the relationships between entities in the corpus. These systems also cannot validate the answer. In this paper, we present a solution to these issues that uses the BERT model and a knowledge graph to enhance a question answering system. We combined content-based and link-based information for knowledge graph representation learning and classified triples into one of three classes: base class, derived class, or non-existent class. We then used the BERT model to build two classifiers: BERT-based text classification for content information and BERT-based triple classification for link information. The former was able to make a contextual embedding vector for representing triples that were used to classify into the three above classes. The latter generated all path instances from all meta paths of a large heterogeneous information network by running the Motif Search method of Apache Spark in a distributed environment. After creating the path instances, we produced triples from these path instances. We made content-based information by converting triples into natural language text with labels and treated the task as a text classification problem. Our proposed solution outperformed other embedding methods with an average accuracy of 92.34% on benchmark datasets, and outperformed the Motif Finding algorithm with an average execution time improvement of 37% in the distributed environment.

Journal Article
