Catalogue Search | MBRL
4,237 result(s) for "semantic search"
Information extraction pipelines for knowledge graphs
2023
In the last decade, a large number of knowledge graph (KG) completion approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG completion have not been studied in the literature. We extend Plumber, a framework that brings together the research community’s disjoint efforts on KG completion. We include more components into the architecture of Plumber to comprise 40 reusable components for various KG completion subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable knowledge extraction pipelines and offers overall 432 distinct pipelines. We study the optimization problem of choosing optimal pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over three KGs: DBpedia, Wikidata, and Open Research Knowledge Graph. Our results demonstrate the effectiveness of Plumber in dynamically generating KG completion pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components and discuss their limitations.
Journal Article
Bridging Digital Literacy Gaps With AI-Driven Semantic Search Technologies
by Xie, Hong; Hu, Bin; Malik, Ifrah
in Artificial intelligence, Completion time, Digital literacy
2025
This study examined how digital literacy influences user interactions with artificial intelligence–driven semantic search engines compared with traditional keyword-based search systems. The authors assessed whether an artificial intelligence–driven search enhances efficiency, query quality, and user satisfaction across varying digital literacy levels, particularly in complex information retrieval tasks. Sixty participants, categorized into three digital literacy groups (beginner, intermediate, and advanced) on the basis of the European Commission's Digital Competence Framework, completed six search tasks (three simple, three complex) using both traditional and artificial intelligence–driven search engines. Performance was measured by task completion time, query quality, and user satisfaction. Statistical analyses (analysis of variance, paired t tests) were conducted to compare outcomes across literacy levels and search engine types. Post-task interviews provided qualitative insights into user experiences.
Journal Article
Ontology-based semantic search on the Web and its combination with the power of inductive reasoning
by Fazzinga, Bettina; Gottlob, Georg; Lukasiewicz, Thomas
in Artificial Intelligence, Complex Systems, Computer Science
2012
Semantic Web search is currently one of the hottest research topics in both Web search and the Semantic Web. In previous work, we have presented a novel approach to Semantic Web search, which allows for evaluating ontology-based complex queries that involve reasoning over the Web relative to an underlying background ontology. We have developed the formal model behind this approach, and provided a technique for processing Semantic Web search queries, which consists of an offline ontological inference step and an online reduction to standard Web search. In this paper, we continue this line of research. We further enhance the above approach by the use of inductive rather than deductive reasoning in the offline inference step. This increases the robustness of Semantic Web search, as it adds the important ability to handle inconsistencies, noise, and incompleteness, which are all very likely to occur in distributed and heterogeneous environments such as the Web. The inductive variant also makes it possible to infer new (not logically deducible) knowledge (from training individuals). We report on a prototype implementation of (both the deductive and) the inductive variant of our approach in desktop search, and we provide extensive new experimental results, especially on the running time, precision, and recall of our new approach.
Journal Article
Enhancing User Query Comprehension and Contextual Relevance with a Semantic Search Engine using BERT and ElasticSearch
by Goudar, R.H; Rathod, Vijayalaxmi; Kulkarni, Anjanabhargavi
in Natural language processing, Queries, Search engines
2024
This research paper explores the development of a semantic search engine designed to enhance user query comprehension and deliver contextually relevant search results. Classic search engines struggle to capture the nuanced meaning of user queries, leading to suboptimal results. To address this challenge, we combine advanced natural language processing (NLP) techniques with Elasticsearch, with a specific focus on Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art pre-trained language model. Our approach leverages BERT's ability to analyze the contextual meaning of words within documents through sentence transformers (SBERT), enabling the search engine to interpret user queries and better understand the semantics of the content, which is converted into vector embeddings so that it can be processed by the Elasticsearch server. By utilizing BERT's bidirectional attention mechanism, the search engine can discern the relationships between words, thereby capturing the contextual nuances that are crucial for accurate query interpretation. Through experimental validation and performance assessments, we demonstrate the efficacy of our semantic search engine in providing contextually relevant search results. This research contributes to the advancement of search technology by enhancing the intelligence of search engines, ultimately improving the user experience by delivering context-based results.
Journal Article
Secondary Operation Risk Assessment Method Integrating Graph Convolutional Networks and Semantic Embeddings
2025
In the power industry, secondary operation risk assessment is a critical step in ensuring operational safety. However, traditional assessment methods often rely on expert judgment, making it difficult to efficiently address the challenges posed by unstructured textual data and complex equipment relationships. To address this issue, this paper proposes a hybrid model that integrates graph convolutional networks (GCNs) with semantic embedding techniques. The model consists of two main components: the first constructs a domain-specific knowledge graph for the power industry and uses a GCN to extract structural information, while the second fine-tunes the RoBERTa pre-trained model to generate semantic embeddings for textual data. Finally, the model employs a hybrid similarity measurement mechanism that comprehensively considers both semantic and structural features, combining K-means clustering similarity search with a multi-node weighted evaluation method to achieve efficient and accurate risk assessment. The experimental results demonstrate that the proposed model significantly outperforms the traditional methods in key metrics, such as accuracy, recall, and F1 score, fully validating its practical application value in secondary operation scenarios within the power industry.
Journal Article
LOD search engine: A semantic search over linked data
by Deepak, Akshay; Azad, Amisha; Azad, Hiteshwar kumar
in Data search, Datasets, Electronic documents
2022
In the last few years, there has been significant growth in the amount of data published in RDF and in the adoption of Linked Data principles. Every day, a large number of people and communities contribute to the publication of datasets as Linked Data on the Linked Open Data (LOD) cloud. Due to the large size of the LOD cloud on the Web and the RDF representation of linked datasets, searching and retrieving relevant data on the Web is a major challenge. Because the data is published in RDF triple format, i.e. an interlinked structure, traditional search engines are unable to perform searches on Linked Data. This article introduces LOD search engine, a novel semantic search engine that searches Semantic Web documents (such as Linked Data or triples) to retrieve a set of relevant information based on user queries. For searching over triples, we propose two semantic search methods: Forward Search and Backward Search. To improve search results, two new ranking methods are also introduced: Domain Ranking and Triple Ranking. The proposed LOD search engine produced remarkable results and outperformed other semantic search engines. In the best-case scenario, the proposed LOD search engine outperforms Swoogle and Falcons by 22.35%, 43.38% and 33.18% in terms of precision, recall, and F-Measure respectively.
Journal Article
Explainable self-supervised learning for medical image diagnosis based on DINO V2 model and semantic search
2025
Medical images have become indispensable for decision-making and significantly affect treatment planning. However, increasing medical imaging has widened the gap between medical images and available radiologists, leading to delays and diagnosis errors. Recent studies highlight the potential of deep learning (DL) in medical image diagnosis. However, their reliance on labelled data limits their applicability in various clinical settings. As a result, recent studies explore the role of self-supervised learning to overcome these challenges. Our study aims to address these challenges by examining the performance of self-supervised learning (SSL) in diverse medical image datasets and comparing it with traditional pre-trained supervised learning models. Unlike prior SSL methods that focus solely on classification, our framework leverages DINOv2’s embeddings to enable semantic search in medical databases (via Qdrant), allowing clinicians to retrieve similar cases efficiently. This addresses a critical gap in clinical workflows where rapid case [...] The results affirmed SSL’s ability, especially DINOv2, to overcome the challenge associated with labelling data and provide an accurate diagnosis superior to traditional SL. DINOv2 provides 100%, 99%, 99%, 100 and 95% for classification accuracy of Lung cancer, brain tumour, leukaemia and Eye Retina Disease datasets, respectively. While existing SSL models (e.g., BYOL, SimCLR) lack interpretability, we uniquely combine DINOv2 with ViT-CX, a causal explanation method tailored for transformers. This provides clinically actionable heatmaps, revealing how the model localizes tumors/cellular patterns, a feature absent in prior SSL medical imaging studies. Furthermore, our research explores the impact of semantic search in the medical images domain and how it can revolutionize the querying process and provide semantic results alongside SSL and the Qudra Net dataset utilized to save the embedding of the developed model after the training process. Cosine similarity measures the distance between the image query and the information stored in the embedding. Our study aims to enhance the efficiency and accuracy of medical image analysis, ultimately improving the decision-making process.
Journal Article
A multi module a.i. system for intelligent health insurance support using retrieval augmented generation
by Hanmante, Sejal; Patil, Shivani; Shahade, Aniket K.
in 639/705, 692/700, Artificial Intelligence
2025
This research proposes a Retrieval-Augmented Generation (RAG)-based multi-module AI system designed to streamline interaction with health insurance information. Unlike prior approaches that treat conversational assistance, policy recommendation, and document retrieval as isolated tasks, our system unifies these modules into a single architecture. The framework integrates a chatbot for general insurance queries, a policy recommendation engine leveraging RAG with both structured and unstructured policy data, and a document retrieval module for clause-level search from uploaded policies. A distinct contribution is the inclusion of an evaluator agent that simulates human judgment to assess response quality across relevance, accuracy, clarity, and helpfulness—providing an automated feedback loop to improve performance over time. Experimental results demonstrate strong semantic retrieval (BERTScore F1 up to 0.84), robust recommendation capability (Hit@5 = 1.0, Recall@5 = 0.833), and effective clause retrieval from policy documents (BERTScore F1 = 0.8443). The novelty of this work lies in the domain-specific application of RAG with a modular architecture and quality-assessment agent, offering reduced hallucination risk, improved policy transparency, and user-focused insurance support.
Journal Article
An Art of Review on Conceptual based Information Retrieval
2021
In conventional information retrieval systems, keywords are used to index and retrieve documents for a user query. When more than one keyword is used to define a single concept in the documents and in the queries, keyword-based retrieval systems produce inaccurate and incomplete results. Additionally, manual intervention is required to determine the semantic relationships between related keywords in order to produce accurate results, which has paved the way for semantic search. Various research has been carried out on concept-based information retrieval to tackle the difficulties caused by conventional keyword search and semantic search systems. This paper elucidates the various text representations responsible for retrieving relevant search results, the approaches and evaluations carried out in conceptual information retrieval, and the challenges faced by existing research, in order to set out the requirements of future research. In addition, the conceptual information extracted from different sources and used for semantic representation by existing systems is discussed.
Journal Article
Pretrained language models for semantics-aware data harmonisation of observational clinical studies in the era of big data
by Zlatev, Zlatko; Dylag, Jakub J.; Boniface, Michael
in Analysis, Artificial intelligence, Automation
2025
Background
In clinical research, there is a strong drive to leverage big data from population cohort studies and routine electronic healthcare records to design new interventions, improve health outcomes and increase the efficiency of healthcare delivery. However, realising this potential requires substantial efforts in harmonising source datasets and curating study data, which currently relies on costly, time-consuming and labour-intensive methods. We explore and assess the use of natural language processing (NLP) and unsupervised machine learning (ML) to address the challenges of big data semantic harmonisation and curation.
Methods
Our aim is to establish an efficient and robust technological foundation for the development of automated tools supporting data curation of large clinical datasets. We propose two AI based pipelines for automated semantic harmonisation: a pipeline for semantics-aware search for domain relevant variables and a pipeline for clustering of semantically similar variables. We evaluate pipeline performance using 94,037 textual variable descriptions from the English Longitudinal Study of Ageing (ELSA) database.
Results
We observe high accuracy of our Semantic Search pipeline, with an AUC of 0.899 (SD = 0.056). Our semantic clustering pipeline achieves a V-measure of 0.237 (SD = 0.157), which is on par with that of leading implementations in other relevant domains. Automation can significantly accelerate the process of dataset harmonisation. Manual labelling was performed at a speed of 2.1 descriptions per minute, with our automated labelling increasing speed to 245 descriptions per minute.
Conclusions
Our study findings underscore the potential of AI technologies, such as NLP and unsupervised ML, in automating the harmonisation and curation of big data for clinical research. By establishing a robust technological foundation, we pave the way for the development of automated tools that streamline the process, enabling health data scientists to leverage big data more efficiently and effectively in their studies and accelerating insights from data for clinical benefit.
Journal Article