Catalogue Search | MBRL
Search Results Heading
Explore the vast range of titles available.
MBRLSearchResults
-
DisciplineDiscipline
-
Is Peer ReviewedIs Peer Reviewed
-
Item TypeItem Type
-
SubjectSubject
-
YearFrom:-To:
-
More FiltersMore FiltersSourceLanguage
Done
Filters
Reset
4,247
result(s) for
"semantic information mining"
Sort by:
Ontology reasoning scheme for constructing meaningful sports video summarisation
2013
As digital sports video becomes increasingly pervasive, semantic video summary becomes one of the important components for the next generation of multimedia applications. Ontology is a feasible way to mine the semantic information from the video stream. However, current ontology-based methods did not concentrate on the effectiveness and soundness of semantic reasoning. Here, the authors propose a content-directed ontology reasoning approach to produce meaningful sports video summarisation. The proposed ontology can facilitate the metadata acquisition of video and the improvement of query performance. It also provides a flexible way to query the sports video database, which cannot be achieved by simple keyword search. For annotating, describing and managing the sports video content, we propose a sports video descriptive language (SVDL) based on the proposed ontology. Moreover, the semantically meaningful sports video abstraction is produced by reasoning engine which is based on the extension of the Tableau algorithm. Meanwhile, the soundness and completeness of the reasoning algorithm can be solidly proved. Subjective assessment experimental results reveal the reliability and efficiency of the propose scheme.
Journal Article
Semantic association computation: a comprehensive survey
by
Gao Xiaoying
,
Jabeen Shahida
,
Andreae, Peter
in
Artificial intelligence
,
Associations
,
Categories
2020
Semantic association computation is the process of quantifying the strength of a semantic connection between two textual units, based on different types of semantic relations. Semantic association computation is a key component of various applications belonging to a multitude of fields, such as computational linguistics, cognitive psychology, information retrieval and artificial intelligence. The field of semantic association computation has been studied for decades. The aim of this paper is to present a comprehensive survey of various approaches for computing semantic associations, categorized according to their underlying sources of background knowledge. Existing surveys on semantic computation have focused on a specific aspect of semantic associations, such as utilizing distributional semantics in association computation or types of spatial models of semantic associations. However, this paper has put a multitude of computational aspects and factors in one picture. This makes the article worth reading for those researchers who want to start off in the field of semantic associations computation. This paper introduces the fundamental elements of the association computation process, evaluation methodologies and pervasiveness of semantic measures in a variety of fields, relying on natural language semantics. Along the way, there is a detailed discussion on the main categories of background knowledge sources, classified as formal and informal knowledge sources, and the underlying design models, such as spatial, combinatorial and network models, that are used in the association computation process. The paper classifies existing approaches of semantic association computation into two broad categories, based on their utilization of background knowledge sources: knowledge-rich approaches; and knowledge-lean approaches. Each category is divided further into sub-categories, according to the type of underlying knowledge sources and design models of semantic association. A comparative analysis of strengths and limitations of various approaches belonging to each research stream is also presented. The paper concludes the survey by analyzing the pivotal factors that affect the performance of semantic association measures.
Journal Article
Improving embedded knowledge graph multi-hop question answering by introducing relational chain reasoning
2023
Knowledge Graph Question Answering (KGQA) aims to answer user-questions from a knowledge graph (KG) by identifying the reasoning relations between topic entity and answer. As a complex branch task of KGQA, multi-hop KGQA requires reasoning over the multi-hop relational chain preserved in KG to arrive at the right answer. Despite recent successes, the existing works on answering multi-hop complex questions still face the following challenges: (i) The absence of an explicit relational chain order reflected in user-question stems from a misunderstanding of a user’s intentions. (ii) Incorrectly capturing relational types on weak supervision of which dataset lacks intermediate reasoning chain annotations due to expensive labeling cost. (iii) Failing to consider implicit relations between the topic entity and the answer implied in structured KG because of limited neighborhoods size constraint in subgraph retrieval-based algorithms. To address these issues in multi-hop KGQA, we propose a novel model herein, namely Relational Chain based Embedded KGQA (Rce-KGQA), which simultaneously utilizes the explicit relational chain revealed in natural language question and the implicit relational chain stored in structured KG. Our extensive empirical study on three open-domain benchmarks proves that our method significantly outperforms the state-of-the-art counterparts like GraftNet, PullNet and EmbedKGQA. Comprehensive ablation experiments also verify the effectiveness of our method on the multi-hop KGQA task. We have made our model’s source code available at github: https://github.com/albert-jin/Rce-KGQA.
Journal Article
Topic Modeling: A Comprehensive Review
2020
Topic modelling is the new revolution in text mining. It is a statistical technique for revealing the underlying semantic structure in large collection of documents. After analysing approximately 300 research articles on topic modeling, a comprehensive survey on topic modelling has been presented in this paper. It includes classification hierarchy, Topic modelling methods, Posterior Inference techniques, different evolution models of latent Dirichlet allocation (LDA) and its applications in different areas of technology including Scientific Literature, Bioinformatics, Software Engineering and analysing social network is presented. Quantitative evaluation of topic modeling techniques is also presented in detail for better understanding the concept of topic modeling. At the end paper is concluded with detailed discussion on challenges of topic modelling, which will definitely give researchers an insight for good research.
Journal Article
Twitter mining for ontology-based domain discovery incorporating machine learning
by
Yan Kit, Chan
,
Abu-Salih, Bilal
,
Wongthongtham, Pornpit
in
Algorithms
,
Artificial intelligence
,
Attitudes
2018
Purpose
This paper aims to obtain the domain of the textual content generated by users of online social network (OSN) platforms. Understanding a users’ domain (s) of interest is a significant step towards addressing their domain-based trustworthiness through an accurate understanding of their content in their OSNs.
Design/methodology/approach
This study uses a Twitter mining approach for domain-based classification of users and their textual content. The proposed approach incorporates machine learning modules. The approach comprises two analysis phases: the time-aware semantic analysis of users’ historical content incorporating five commonly used machine learning classifiers. This framework classifies users into two main categories: politics-related and non-politics-related categories. In the second stage, the likelihood predictions obtained in the first phase will be used to predict the domain of future users’ tweets.
Findings
Experiments have been conducted to validate the mechanism proposed in the study framework, further supported by the excellent performance of the harnessed evaluation metrics. The experiments conducted verify the applicability of the framework to an effective domain-based classification for Twitter users and their content, as evident in the outstanding results of several performance evaluation metrics.
Research limitations/implications
This study is limited to an on/off domain classification for content of OSNs. Hence, we have selected a politics domain because of Twitter’s popularity as an opulent source of political deliberations. Such data abundance facilitates data aggregation and improves the results of the data analysis. Furthermore, the currently implemented machine learning approaches assume that uncertainty and incompleteness do not affect the accuracy of the Twitter classification. In fact, data uncertainty and incompleteness may exist. In the future, the authors will formulate the data uncertainty and incompleteness into fuzzy numbers which can be used to address imprecise, uncertain and vague data.
Practical implications
This study proposes a practical framework comprising significant implications for a variety of business-related applications, such as the voice of customer/voice of market, recommendation systems, the discovery of domain-based influencers and opinion mining through tracking and simulation. In particular, the factual grasp of the domains of interest extracted at the user level or post level enhances the customer-to-business engagement. This contributes to an accurate analysis of customer reviews and opinions to improve brand loyalty, customer service, etc.
Originality/value
This paper fills a gap in the existing literature by presenting a consolidated framework for Twitter mining that aims to uncover the deficiency of the current state-of-the-art approaches to topic distillation and domain discovery. The overall approach is promising in the fortification of Twitter mining towards a better understanding of users’ domains of interest.
Journal Article
A survey on instance segmentation: state of the art
2020
Object detection or localization is an incremental step in progression from coarse to fine digital image inference. It not only provides the classes of the image objects, but also provides the location of the image objects which have been classified. The location is given in the form of bounding boxes or centroids. Semantic segmentation gives fine inference by predicting labels for every pixel in the input image. Each pixel is labelled according to the object class within which it is enclosed. Furthering this evolution, instance segmentation gives different labels for separate instances of objects belonging to the same class. Hence, instance segmentation may be defined as the technique of simultaneously solving the problem of object detection as well as that of semantic segmentation. In this survey paper on instance segmentation, its background, issues, techniques, evolution, popular datasets, related work up to the state of the art and future scope have been discussed. The paper provides valuable information for those who want to do research in the field of instance segmentation.
Journal Article
Core techniques of question answering systems over knowledge bases: a survey
by
Maret, Pierre
,
Diefenbach, Dennis
,
Singh, Kamal
in
Benchmarks
,
Information
,
Information retrieval
2018
The Semantic Web contains an enormous amount of information in the form of knowledge bases (KB). To make this information available, many question answering (QA) systems over KBs were created in the last years. Building a QA system over KBs is difficult because there are many different challenges to be solved. In order to address these challenges, QA systems generally combine techniques from natural language processing, information retrieval, machine learning and Semantic Web. The aim of this survey is to give an overview of the techniques used in current QA systems over KBs. We present the techniques used by the QA systems which were evaluated on a popular series of benchmarks: Question Answering over Linked Data. Techniques that solve the same task are first grouped together and then described. The advantages and disadvantages are discussed for each technique. This allows a direct comparison of similar techniques. Additionally, we point to techniques that are used over WebQuestions and SimpleQuestions, which are two other popular benchmarks for QA systems.
Journal Article
Information extraction pipelines for knowledge graphs
2023
In the last decade, a large number of knowledge graph (KG) completion approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG completion have not been studied in the literature. We extend Plumber, a framework that brings together the research community’s disjoint efforts on KG completion. We include more components into the architecture of Plumber to comprise 40 reusable components for various KG completion subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable knowledge extraction pipelines and offers overall 432 distinct pipelines. We study the optimization problem of choosing optimal pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over three KGs: DBpedia, Wikidata, and Open Research Knowledge Graph. Our results demonstrate the effectiveness of Plumber in dynamically generating KG completion pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components and discuss their limitations.
Journal Article
Automatic ontology construction from text: a review from shallow to deep learning trend
2020
The explosive growth of textual data on the web coupled with the increase on demand for ontologies to promote the semantic web, have made the automatic ontology construction from the text a very promising research area. Ontology learning (OL) from text is a process that aims to (semi-) automatically extract and represent the knowledge from text in machine-readable form. Ontology is considered one of the main cornerstones of representing the knowledge in a more meaningful way on the semantic web. Usage of ontologies has proven to be beneficial and efficient in different applications (e.g. information retrieval, information extraction, and question answering). Nevertheless, manually construction of ontologies is time-consuming as well extremely laborious and costly process. In recent years, many approaches and systems that try to automate the construction of ontologies have been developed. This paper reviews various approaches, systems, and challenges of automatic ontology construction from the text. In addition, it also discusses ways the ontology construction process could be enhanced in the future by presenting techniques from shallow learning to deep learning (DL).
Journal Article
Text classification algorithm of tourist attractions subcategories with modified TF-IDF and Word2Vec
2024
Text classification, as an important research area of text mining, can quickly and effectively extract valuable information to address the challenges of organizing and managing large-scale text data in the era of big data. Currently, the related research on text classification tends to focus on the application in fields such as information filtering, information retrieval, public opinion monitoring, and library and information, with few studies applying text classification methods to the field of tourist attractions. In light of this, a corpus of tourist attraction description texts is constructed using web crawler technology in this paper. We propose a novel text representation method that combines Word2Vec word embeddings with TF-IDF-CRF-POS weighting, optimizing traditional TF-IDF by incorporating total relative term frequency, category discriminability, and part-of-speech information. Subsequently, the proposed algorithm respectively combines seven commonly used classifiers (DT, SVM, LR, NB, MLP, RF, and KNN), known for their good performance, to achieve multi-class text classification for six subcategories of national A-level tourist attractions. The effectiveness and superiority of this algorithm are validated by comparing the overall performance, specific category performance, and model stability against several commonly used text representation methods. The results demonstrate that the newly proposed algorithm achieves higher accuracy and F1-measure on this type of professional dataset, and even outperforms the high-performance BERT classification model currently favored by the industry. Acc, marco-F1, and mirco-F1 values are respectively 2.29%, 5.55%, and 2.90% higher. Moreover, the algorithm can identify rare categories in the imbalanced dataset and exhibit better stability across datasets of different sizes. Overall, the algorithm presented in this paper exhibits superior classification performance and robustness. In addition, the conclusions obtained by the predicted value and the true value are consistent, indicating that this algorithm is practical. The professional domain text dataset used in this paper poses higher challenges due to its complexity (uneven text length, relatively imbalanced categories), and a high degree of similarity between categories. However, this proposed algorithm can efficiently implement the classification of multiple subcategories of this type of text set, which is a beneficial exploration of the application research of complex Chinese text datasets in specific fields, and provides a useful reference for the vector expression and classification of text datasets with similar content.
Journal Article