Catalogue Search | MBRL

A Survey on Multimodal Knowledge Graphs: Construction, Completion and Applications

by Hu, Linmei; Chen, Yong; Zhang, Jinwen in Artificial intelligence, Concrete construction, Datasets

2023

As an essential part of artificial intelligence, a knowledge graph describes real-world entities, concepts and their various semantic relationships in a structured way, and has gradually been adopted in a variety of practical scenarios. Most existing knowledge graphs concentrate on organizing and managing textual knowledge in a structured representation, while paying little attention to multimodal resources (e.g., pictures and videos), which can serve as the foundation for machine perception of real-world data scenarios. To this end, this survey comprehensively reviews advances in multimodal knowledge graphs, covering construction, completion and typical applications. For construction, we outline methods for named entity recognition, relation extraction and event extraction. For completion, we discuss multimodal knowledge graph representation learning and entity linking. Finally, the mainstream applications of multimodal knowledge graphs in various domains are summarized.

Journal Article

RoEMF: rotational embedding multimodal fusion for link prediction

by Zhang, Wenqi; Lu, Shiteng; Li, Xinqiang in Colleges & universities, Computer Science, Data integration

2025

Multimodal link prediction aims to identify missing head and tail entities in the relational triples of multimodal knowledge graphs. However, each modality contains distinct information, and how to effectively fuse multimodal data has become a complex challenge. To address this issue, the rotational embedding multimodal fusion (RoEMF) model was proposed based on rotary position encoding (RoPE). The model employs a multi-head cross-attention mechanism, combined with RoPE, to enhance the representation of positional and contextual information, thereby improving the multimodal data fusion. It focuses on integrating information from different subspaces while capturing cross-modal correlations to mitigate potential data loss, enhances feature fusion, and optimizes the heterogeneity of the representation. Additionally, the cross-modal joint decision loss was proposed to reduce the model’s reliance on single-modal data, aiding in the identification of missing head and tail entities, while enhancing the accuracy and generalization ability of multimodal link prediction. Experimental results on three public MMKG benchmarks demonstrate the outstanding performance of RoEMF compared with other methods in link prediction.
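
The abstract above builds on rotary position encoding (RoPE). As a minimal sketch of generic RoPE only, and not of the RoEMF architecture (which this record does not describe), the following rotates consecutive pairs of vector dimensions by position-dependent angles. The function names `rope` and `dot` and the default `base` of 10000 are assumptions made for illustration.

```python
import math


def rope(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles that grow with `pos`."""
    if len(vec) % 2:
        raise ValueError("RoPE needs an even number of dimensions")
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        # Pair i/2 uses frequency base^(-i/d); early pairs rotate fastest.
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def dot(a, b):
    """Plain inner product, as used for attention scores."""
    return sum(x * y for x, y in zip(a, b))
```

Because each pair is rotated rather than shifted, vector norms are preserved, and the inner product of a rotated query and a rotated key depends only on the offset between their positions. That relative-position property is what makes RoPE convenient inside attention.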

Journal Article

BioBLP: a modular framework for learning on multimodal biomedical knowledge graphs

by Groth, Paul; Mitra, Payal; Pijnenburg, Thom in Algorithms, Amino acids, Analysis

2023

Background: Knowledge graphs (KGs) are an important tool for representing complex relationships between entities in the biomedical domain. Several methods have been proposed for learning embeddings that can be used to predict new links in such graphs. Some methods ignore valuable attribute data associated with entities in biomedical KGs, such as protein sequences or molecular graphs. Other works incorporate such data but assume that all entities can be represented with the same data modality. This is not always the case for biomedical KGs, where entities exhibit heterogeneous modalities that are central to their representation in the subject domain. Objective: We aim to understand how to incorporate multimodal data into biomedical KG embeddings, and to analyze the resulting performance in comparison with traditional methods. We propose a modular framework for learning embeddings in KGs with entity attributes that allows encoding attribute data of different modalities while also supporting entities with missing attributes. We additionally propose an efficient pretraining strategy for reducing the required training runtime. We train models using a biomedical KG containing approximately 2 million triples, and evaluate the performance of the resulting entity embeddings on the tasks of link prediction and drug-protein interaction prediction, comparing against methods that do not take attribute data into account. Results: In the standard link prediction evaluation, the proposed method yields competitive, yet lower, performance than baselines that do not use attribute data. When evaluated on the task of drug-protein interaction prediction, the method compares favorably with the baselines. Further analyses show that incorporating attribute data does outperform baselines over entities below a certain node degree, comprising approximately 75% of the diseases in the graph. We also observe that optimizing attribute encoders is a challenging task that increases optimization costs. Our proposed pretraining strategy yields significantly higher performance while reducing the required training runtime. Conclusion: BioBLP makes it possible to investigate different ways of incorporating multimodal biomedical data for learning representations in KGs. With a particular implementation, we find that incorporating attribute data does not consistently outperform baselines, but improvements are obtained on a comparatively large subset of entities below a specific node degree. Our results indicate a potential for improved performance in scientific discovery tasks where understudied areas of the KG would benefit from link prediction methods.
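
The modular idea in the abstract above, one encoder per attribute modality plus support for entities with missing attributes, can be sketched as follows. This is an illustrative assumption and not BioBLP's actual API: the class `ModularEntityEncoder`, the method `embed`, and the stand-in `toy_sequence_encoder` are all hypothetical names, and the fallback for missing attributes is assumed here to be a per-entity lookup vector.

```python
import random


def toy_sequence_encoder(seq, dim=4):
    """Hypothetical stand-in for a real sequence encoder: bucketed character counts."""
    vec = [0.0] * dim
    for ch in seq:
        vec[ord(ch) % dim] += 1.0
    return vec


class ModularEntityEncoder:
    """Route each entity to the encoder for its modality, else to a lookup table."""

    def __init__(self, dim, encoders, seed=0):
        self.dim = dim
        self.encoders = encoders  # modality name -> callable(attribute) -> list of floats
        self.lookup = {}          # fallback vectors for entities without attributes
        self._rng = random.Random(seed)

    def embed(self, entity_id, modality=None, attribute=None):
        encoder = self.encoders.get(modality)
        if encoder is not None and attribute is not None:
            vec = encoder(attribute)
            if len(vec) != self.dim:
                raise ValueError("encoder output has the wrong dimension")
            return vec
        # Missing attribute or unknown modality: reuse one vector per entity.
        if entity_id not in self.lookup:
            self.lookup[entity_id] = [
                self._rng.uniform(-0.1, 0.1) for _ in range(self.dim)
            ]
        return self.lookup[entity_id]
```

In a trained model the lookup vectors and the encoders would be learned jointly; here they are fixed so that the routing logic stays visible.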

Journal Article

Construction of a Semantic Network for International Chinese Language Education Based on Knowledge Graph Technology and Optimization of Its Teaching Resources

by Liang, Yingping; Han, Xiaoyun in 97B20, Accuracy, Chinese language education

2025

In the era of big data, technology is changing rapidly and Internet-supported Chinese education resources are growing exponentially, yet problems such as confusing explanations of concepts and unclear relationships to everyday language can arise when promoting international Chinese education and using its textbooks. Because a knowledge graph can describe the relationships between words or statements concisely and efficiently, this paper proposes a knowledge enhancement and cue tuning embedding model based on multimodal knowledge graph embedding technology, using text and image extraction to supplement the knowledge graph according to the text-and-image needs of Chinese language teaching. On this basis, a knowledge semantic network for Chinese education is constructed. Experimental analysis shows that knowledge enhancement helps most on the text entity extraction task. The proposed model achieves the best results on every metric compared with the benchmark models and performs better in large-sample scenarios, with precision, recall, and F1 values of 87.42%, 88.03%, and 87.23%, respectively, on the CTec2020 dataset. In the cross-task scenario test, the model's F1 values are 78.8% and 76.61% on CTec2018 and CTec2020, respectively, again the best results on each evaluation index, which further demonstrates the model's performance. The semantic network yields more than 3,000 extracted entities across the various word categories in Chinese education, the average word length across the five corpora is 2.994, and entity extraction accuracy ranges from 91.05% to 97.66%, with highest points of 98, 221, 132, 159, and 147, respectively. These results indicate that the semantic network constructed for Chinese language education improves in accuracy, comprehensiveness and domain, and that the quality of its entities is better. The resulting knowledge semantic network can be applied in practice in the field of Chinese education, providing scientific support for optimizing Chinese educational resources.

Journal Article

UMEAD: Unsupervised Multimodal Entity Alignment for Equipment Knowledge Graphs via Dual-Space Embedding

by Li, Ning; Wang, Jingbo; Liu, Xiulei in Alignment, Constraints, Embedding

2025

The symmetry between different representation spaces plays a crucial role in effectively modeling complex multimodal data. To address the challenge of equipment knowledge graphs containing hierarchical relationships that cannot be fully represented in a single space, this study proposes UMEAD, an unsupervised multimodal entity alignment method based on dual-space embeddings. The method simultaneously learns graph embeddings in both Euclidean and hyperbolic spaces, forming a structural symmetry where the Euclidean space captures local regularities and the hyperbolic space models global hierarchies. Their complementarity achieves a balanced and symmetric representation of multimodal knowledge. An adaptive feature fusion strategy is further employed to dynamically weight semantic and visual modalities, enhancing the symmetry and complementarity between different modalities. To reduce reliance on scarce pre-aligned data, pseudo seed instances are generated from multimodal features, and an iterative constraint mechanism progressively enlarges the training set, enabling unsupervised alignment. Experiments on public datasets, including EMMEAD, FB15K-DB15K, and FB15K-YAGO15K, demonstrate that the combination of dual-space embeddings, adaptive fusion, and iterative constraints significantly improves alignment accuracy. In summary, the proposed method reduces dependence on pre-aligned data, strengthens multimodal and structural alignment, and its symmetric embedding and fusion design offers a promising approach for the construction and application of multimodal knowledge graphs in the equipment domain.
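
To illustrate why the abstract above pairs Euclidean and hyperbolic spaces, the sketch below compares the ordinary Euclidean distance with the geodesic distance in the Poincaré ball, one common model of hyperbolic space. Whether UMEAD uses this particular model is not stated in the record, so the choice is an assumption, as are the function names.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def poincare_distance(u, v):
    """Geodesic distance in the Poincare ball; both points need norm < 1."""
    nu = sum(a * a for a in u)
    nv = sum(b * b for b in v)
    if nu >= 1.0 or nv >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * diff / ((1.0 - nu) * (1.0 - nv)))
```

Two points separated by the same Euclidean gap are much farther apart hyperbolically when they sit near the boundary of the ball than when they sit near the origin. This expanding volume is what lets hyperbolic embeddings place tree-like hierarchies with low distortion, while Euclidean space remains suitable for local, flat structure.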

Journal Article

Multimodal Temporal Knowledge Graph Embedding Method Based on Mixture of Experts for Recommendation

by Dong, Guangyuan; Li, Changzhi; Fang, Yuanyuan in Accuracy, Customization, Embedding

2025

Knowledge-graph-based recommendation aims to provide personalized recommendation services to users based on their historical interaction information, which matters for outcomes such as shopping transaction rates. With the rapid growth of online shopping, the knowledge graph constructed from users' historical interaction data now incorporates multiattribute information, including timestamps, images, and textual content. Information from multiple modalities is difficult to utilize effectively because the modalities have different representation structures and spaces. Existing methods attempt to use this information through simple embedding representation and aggregation, but they neither learn targeted representations for information with different attributes nor learn effective weights for aggregation. In addition, existing methods do not model temporal information effectively. In this article, we propose MTR, a knowledge graph recommendation framework based on a mixture-of-experts network, which learns targeted representations and weights for different product attributes so that they can be modeled and utilized effectively. We also model the temporal information in the user shopping process. A thorough experimental study on popular benchmarks validates that MTR can achieve competitive results.
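
The mixture-of-experts idea named in the abstract above can be sketched generically: a gate scores each expert for a given input, the scores are normalized with a softmax, and the expert outputs are averaged with those weights. This is a minimal illustration under assumed names (`softmax`, `mixture_of_experts`, `gate_weights`), not the MTR architecture, whose details this record does not give.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(x, experts, gate_weights):
    """Combine expert outputs using softmax gate weights computed from x."""
    # One gate score per expert: dot product of x with that expert's gate vector.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    weights = softmax(scores)
    outputs = [expert(x) for expert in experts]
    dim = len(outputs[0])
    mixed = [
        sum(w * out[j] for w, out in zip(weights, outputs)) for j in range(dim)
    ]
    return mixed, weights
```

Here each expert is any callable that maps an input vector to an output vector of a shared dimension. In a recommendation setting, one expert per attribute type (for example image, text, or timestamp features) would be a natural reading, with the gate deciding how much each attribute contributes for a given input.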

Journal Article

CICHMKG: a large-scale and comprehensive Chinese intangible cultural heritage multimodal knowledge graph

by Fan, Tao; Hodel, Tobias; Wang, Hao in Algorithms, Cultural heritage, Cultural resources

2023

Intangible Cultural Heritage (ICH), composed of a variety of immaterial manifestations, bears witness to human creativity and wisdom over long histories. The rapid development of digital technologies has accelerated the recording of ICH, generating a vast amount of heterogeneous but fragmented data. To address this, existing studies mainly adopt knowledge graphs (KGs), which provide rich knowledge representation. However, most KGs are text-based and text-derived; they cannot supply related images or support downstream multimodal tasks, which also makes it harder for the public to form a visual perception of ICH and understand it fully, especially without prior ICH knowledge. Taking the Chinese national-level ICH list as an example, we therefore propose constructing a large-scale and comprehensive multimodal knowledge graph (CICHMKG) that combines text and image entities from multiple data sources, and we give a practical construction framework. Additionally, to select representative images for ICH entities, we propose a method composed of a denoising algorithm (CNIFA) and a series of criteria, utilizing global and local visual features of images and textual features of captions. Extensive empirical experiments demonstrate its effectiveness. Lastly, we construct the CICHMKG, consisting of 1,774,005 triples, and visualize it to facilitate interaction and help the public explore ICH in depth.

Journal Article

Intelligent analysis and recommendation of educational content in network engineering: a study on teaching effectiveness evaluation model based on knowledge reasoning and multimodal knowledge graph

by Zhao, Lihong in Accuracy, Adaptability, Artificial Intelligence

2026

With the rapid development of information technology, network engineering education is gradually becoming more intelligent, but existing teaching content analysis and recommendation systems still have limitations in personalized evaluation. Traditional methods mostly rely on rule-driven approaches or single-modal data, resulting in incomplete construction of the Knowledge Graph (KG), low accuracy of content recommendation, and insufficient coverage of knowledge points. In response to these issues, this article combines Knowledge Reasoning (KR) and a multimodal KG with a Graph Neural Network (GNN) to construct KRMKG-GNN (Knowledge Reasoning and Multimodal Knowledge Graph-based Graph Neural Network), an intelligent model for teaching content analysis and recommendation. The model uses the GNN to construct a multimodal KG that integrates text, image, and user behavior data, and then uses the KR module to dynamically evaluate students' knowledge status and generate personalized recommendations. The model was validated on the learning data of 500 students, and the results showed that its recommendation accuracy reached 92.7%. Compared with the Graph Attention Network (GAT), knowledge point coverage increased by 18.5% and the error rate decreased by 1.6. Through deep integration of and reasoning over multidimensional data, the model effectively raises the intelligence level of teaching effectiveness evaluation and has broad application potential.

Journal Article

WuMKG: a Chinese painting and calligraphy multimodal knowledge graph

by Zou, Ao; Chen, Yubin; Wang, Qiya in Calligraphy, Data collection, Graphical representations

2024

Chinese Painting and Calligraphy (ChP&C) holds significant cultural value, representing integral aspects of both Chinese culture and global art. A considerable number of ChP&C works are dispersed worldwide. With the emergence of digital humanities, a vast collection of cultural artifact data is now available online. However, the online databases of these artifacts remain decentralized and diverse, posing significant challenges to their effective organization and utilization. To address this, our paper focuses on the Wu Men School of Painting and proposes a framework for constructing a multimodal knowledge graph for the ChP&C domain. We construct the domain ontology by analyzing the ChP&C knowledge schema. Then, we acquire knowledge from diverse data sources, including textual and visual information. To improve the collection of data on historical context and subject matter, we propose methods for seal extraction and subject extraction specific to ChP&C. We validate the effectiveness of these methods on the constructed dataset. Finally, we construct the Wu Men Multimodal Knowledge Graph (WuMKG) and implement applications such as cross-modal retrieval, knowledge-based question answering and visualization.

Journal Article

A Knowledge Graph-Based Approach for Assembly Sequence Recommendations for Wind Turbines

by Zhou, Bin; Bao, Jinsong; Li, Xinyu in Air-turbines, Algorithms, Artificial intelligence

2023

Assembly data sources for wind turbines take various forms, which contributes to the lack of a unified and standardized expression. Moreover, the reusability of historical assembly data is low, which leads to a poor ability to reason about assembly sequences for new products. In this paper, we propose a knowledge graph-based approach for assembly sequence recommendations for wind turbines. First, for the multimodal assembly data (text in process manuals, images of tooling, and three-dimensional (3D) models), a multi-process assembly information representation model is established to express assembly elements in a unified way. In addition, knowledge extraction methods for the different data modalities are designed to construct a multimodal knowledge graph for wind turbine assembly. Further, retrieval of similar assembly process items based on a bidirectional encoder representations from transformers joint graph-matching network (BERT-GMN) is proposed to predict the assembly sequence subgraphs. Also, a Semantic Web Rule Language (SWRL)-based inference method for assembly process items is proposed to automatically generate subassembly sequences by combining component assembly relationships. Then, a multi-objective sequence optimization algorithm for the final assembly is designed to output the optimal assembly sequences. Finally, taking the VEU-15 wind turbine as the object, the effectiveness of the assembly process information modeling and the multi-source part information representation is verified. The sequence recommendation results are of better quality than those of traditional assembly sequence planning algorithms. The approach provides a feasible solution for optimizing wind turbine assembly against multiple objectives simultaneously.

Journal Article
