Catalogue Search | MBRL

Ranks Aggregation and Semantic Genetic Approach based Hybrid Model for Query Expansion

by Singh, Jagendra in Feedback , Filtration , genetic algorithm

2017

Effective methods for selecting query expansion terms are important for improving the accuracy and efficiency of Pseudo-Relevance Feedback (PRF) based automatic query expansion techniques in information retrieval systems. These methods remove irrelevant and redundant terms from the top retrieved feedback documents with respect to a user query. Individual term selection methods have been widely investigated for improving performance. However, it is challenging to find an individual expansion term selection method that outperforms the other individual methods in most cases. In this paper, first we explore the possibility of improving the overall performance using individual term selection methods. Second, we propose a model for combining multiple expansion term selection methods by using a variety of rank combining approaches. Third, semantic filtering is used to filter out semantically irrelevant terms obtained after combining multiple term selection methods. Fourth, a Genetic Algorithm is used to make an optimal combination of query terms and candidate expansion terms obtained by applying the rank combination and semantic filtering approaches. Our experimental results demonstrate that the proposed approaches achieve a significant improvement over each individual term selection method and over related state-of-the-art approaches.

Journal Article
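
The abstract above describes combining the rankings produced by several term selection methods. As a rough illustration only, the sketch below uses Borda count, one common rank aggregation scheme; the abstract does not say which rank combining approaches the paper uses, so the choice of Borda count, the function name `borda_aggregate`, and the sample term lists are all assumptions.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge several ranked term lists into one list using Borda count.

    Illustrative assumption: each list ranks candidate expansion terms
    from best to worst, as one term selection method might.
    """
    # The pool is every distinct candidate term seen in any list.
    pool = {term for ranking in rankings for term in ranking}
    n = len(pool)
    scores = defaultdict(float)
    for ranking in rankings:
        for position, term in enumerate(ranking):
            # Top position earns n points, the next n - 1, and so on.
            scores[term] += n - position
        # Terms missing from this list earn no points from it.
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(pool, key=lambda t: (-scores[t], t))


# Hypothetical outputs of three term selection methods.
rankings = [
    ["retrieval", "index", "ranking", "corpus"],
    ["ranking", "retrieval", "corpus", "index"],
    ["retrieval", "ranking", "index", "token"],
]
combined = borda_aggregate(rankings)
print(combined)
```

A term that ranks consistently well across methods rises to the top, while a term proposed by only one method sinks, which is the intuition behind combining rankings instead of trusting a single method.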

A contemporary combined approach for query expansion

by Pamula, Rajendra , Chauhan, D. S. , Sharma, Dilip Kumar in Algorithms , Computer Communication Networks , Computer Science

2022

Automatic query expansion techniques are used to enhance the performance of information retrieval systems. Selecting the candidate terms for query expansion is an essential task for making the query more precise, so that it retrieves the most suitable documents. This paper provides a method to select the best terms for query enhancement. First, the effect of abbreviation resolution, lexical variation, synonyms, n-gram pseudo-relevance feedback, and the co-occurrence method on baseline approaches to query expansion is analyzed. In this work, we used the Okapi BM25 algorithm for ranking and concept-based normalization to deal with concept terms. Our results show an improvement over the baseline approach. A new combined technique that integrates lexical variation, synonyms, and n-gram pseudo-relevance feedback for query enhancement is proposed. For the experiments, three English-language datasets, CACM, CISI, and TREC-3, are used. The obtained results show improved query expansion performance in terms of mean average precision, F-measure, and the precision-recall curve.

Journal Article
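
The abstract above names Okapi BM25 as its ranking function. The following is a minimal sketch of BM25 scoring under stated assumptions: whitespace tokenization, the commonly used defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant `ln((N - n + 0.5) / (n + 0.5) + 1)`. The paper's own parameters and preprocessing are not given in the abstract, and the sample documents are invented.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in a non-empty list against the query with BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: longer documents are penalized via b.
            denom = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


# Hypothetical mini-collection.
docs = [
    "query expansion improves retrieval",
    "genetic algorithm for optimization",
    "pseudo relevance feedback for query expansion",
]
scores = bm25_scores("query expansion", docs)
print(scores)
```

In this toy collection the first and third documents both contain each query term once, and the shorter first document scores higher because of length normalization.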

Fine-Tuned BERT Algorithm-Based Automatic Query Expansion for Enhancing Document Retrieval System

by Vishwakarma, Deepak , Kumar, Suresh in Accuracy , Algorithms , Artificial Intelligence

2025

Online retrieval systems are mostly web-based, which makes the document collection more dynamic or fluid than in traditional information retrieval systems. With the web growing in size every day, finding meaningful information on it using a search query consisting of only a few keywords has become increasingly difficult. One important factor in making Internet searches better is query expansion (QE). The manual query expansion method involves the user adding terms to the query, which takes a long time but produces good results. However, the automatic query expansion (AQE) method determines the best statements with minimal time consumption. Therefore, to improve the document retrieval system, a fine-tuned BERT algorithm is developed for automatic query expansion. Initially, the input text was augmented using an embedding augmentation (EA) approach. The augmented text was pre-processed using tokenization, normalization, splitting, stemming, stop word removal, and lemmatization. Then, technical keywords were extracted from the pre-processed text using co-occurrence statistical information. After extracting the keywords, a fine-tuned BERT model is utilized for expanding the query to improve the document retrieval system. The hyperparameters of the BERT model were tuned using frilled lizard optimization to enhance its performance. The proposed model provides 92% accuracy, 95% precision, and 95.6% recall. Thus, the fine-tuned BERT model minimizes query-document mismatch, thereby improving retrieval performance.

Journal Article

A supervised term ranking model for diversity enhanced biomedical information retrieval

by Xu, Kan , Xu, Bo , Wang, Jian in Algorithms , Analysis , Apoptosis

2019

Background: The number of biomedical research articles has increased exponentially with the advancement of biomedicine in recent years. This growth has made it difficult for researchers to obtain the information they need. Information retrieval technologies seek to tackle the problem. However, information needs cannot be completely satisfied by directly applying existing information retrieval techniques. Therefore, biomedical information retrieval not only focuses on the relevance of search results, but also aims to promote the completeness of the results, which is referred to as diversity-oriented retrieval. Results: We address the diversity-oriented biomedical retrieval task using a supervised term ranking model. The model is learned through a supervised query expansion process for term refinement. Based on the model, the most relevant and diversified terms are selected to enrich the original query. The expanded query is then fed into a second retrieval to improve the relevance and diversity of search results. To this end, we propose three diversity-oriented optimization strategies in our model: the diversified term labeling strategy, the biomedical resource-based term features, and a diversity-oriented group sampling learning method. Experimental results on TREC Genomics collections demonstrate the effectiveness of the proposed model in improving the relevance and the diversity of search results. Conclusions: The three proposed strategies jointly contribute to the improvement of biomedical retrieval performance. Our model yields more relevant and diversified results than the state-of-the-art baseline models. Moreover, our method provides a general framework for improving biomedical retrieval performance, and can be used as the basis for future work.

Journal Article

Integrating, Indexing and Querying the Tangible and Intangible Cultural Heritage Available Online: The QueryLab Portal

by Gagliardi, Isabella , Artese, Maria Teresa in Art galleries & museums , common metadata model , Community participation

2022

Cultural heritage inventories have been created to collect and preserve culture and to allow the participation of stakeholders and communities, promoting and disseminating their knowledge. There are two types of inventories: those that give data access via web services or open data, and those that are closed to external access and can be visited only through dedicated web sites, generating data silo problems. The integration of data harvested from different archives makes it possible to compare the cultures and traditions of places on opposite sides of the world, showing how people have more in common than expected. The purpose of the developed portal is to provide query tools that manage the web services provided by cultural heritage databases in a transparent way, allowing the user to make a single query and obtain results from all the inventories considered at the same time. Moreover, with the introduction of the ICH-Light model, specifically designed for the mapping of intangible heritage, data from inventories of this domain can also be harvested, indexed, and integrated into the portal, allowing the creation of an environment dedicated to intangible data where traditions, knowledge, rituals, and festive events can be found and searched all together.

Journal Article

LTR-expand: query expansion model based on learning to rank association rules

by Gaussier, Eric , Bouziri, Ahlem , Latiri, Chiraz in Information retrieval , Learning , Queries

2020

Query Expansion (QE) is widely applied to improve the retrieval performance of ad-hoc search, using different techniques and several data sources to find expansion terms. In the Information Retrieval literature, selecting expansion terms remains a challenging task that relies on the extraction of term relationships. In this paper, we propose a new learning to rank-based query expansion model. The main idea is that, given a query and the set of its related association rules (ARs), our model ranks these ARs according to their relevance score with respect to the query and then selects the most suitable ones to be used in the QE process. Experiments are conducted on three test collections, namely CLEF2003, TREC-Robust and TREC-Microblog, including long, hard and short queries. Results show that retrieval performance can be significantly improved when the AR ranking method is used, compared to other state-of-the-art expansion models, especially for hard and long queries.

Journal Article

An implicit aspect modelling framework for diversity focused query expansion

by Balasubramanian, Vidhya , Dev, Rahul E in Algorithms , Empirical analysis , Markov chains

2020

Diversified Query Expansion aims to present the user with a diverse list of query expansions so as to better communicate their intent to the retrieval system. Current diversified expansion techniques either make use of external knowledge sources to explicitly model the various aspects underlying the user query and their relationships, or model query aspects implicitly. However, these techniques assume query aspects to be independent of each other. We propose a unified framework that produces diversified query expansions in a completely implicit manner while also considering the relationships between query aspects. In particular, the framework identifies query aspects and their relationships by making use of the semantic properties of context phrases that occur within the top-ranked retrieved documents for the supplied user query, and maps them onto a Mutating Markov Chain model to generate a diverse ordering of query aspects. We test our framework against a set of ambiguous and faceted queries used in the NTCIR-12 IMine-2 Task and, through an extensive empirical analysis, show that our framework consistently outperforms existing implicit diversified query expansion algorithms. The utility of our algorithm becomes especially apparent in the second set of experiments, where we generate diversified query expansions for a retrieval engine indexing documents from specific scientific domains. Even in such a niche scenario, our algorithm consistently provides robust results and performs better than other implicit approaches.

Journal Article

Optimal Query Expansion Based on Hybrid Group Mean Enhanced Chimp Optimization Using Iterative Deep Learning

by Tripathi, Kuldeep Narayan , Sharma, Subhash Chander , Kumar, Ram in Accuracy , Algorithms , Business metrics

2022

The internet is filled with uncertain information, which necessitates the use of natural language processing and soft computing techniques to extract the relevant documents. Relevant results are retrieved using query expansion techniques, which in the existing literature are mainly formulated using machine learning or deep learning concepts. This paper presents a hybrid group mean-based optimizer-enhanced chimp optimization (GMBO-ECO) algorithm for pseudo-relevance-based query expansion, whereby the actual queries are expanded with their related keywords. The hybrid GMBO-ECO algorithm mainly expands the query based on the terms that have a strong interrelationship with the actual query. To generate the word embeddings, a Word2Vec paradigm is used, which learns word associations from large text corpora. The useful context in the text is identified using the improved iterative deep learning framework, which determines the user’s intent for the current web search. This step reduces word mismatch and improves the performance of query retrieval. The weak terms are eliminated and the candidate query terms for optimal query expansion are improved via an Okapi measure and cosine similarity techniques. The proposed methodology has been compared to state-of-the-art methods with and without a query expansion approach. Moreover, the proposed optimal query expansion technique has shown a substantial improvement, with a normalized discounted cumulative gain of 0.87, a mean average precision of 0.35, and a mean reciprocal rank of 0.95. The experimental results show the efficiency of the proposed methodology in retrieving the appropriate response for information retrieval. The most common applications for the proposed method are search engines.

Journal Article
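
The abstract above reports normalized discounted cumulative gain, mean average precision, and mean reciprocal rank. For readers unfamiliar with these metrics, here is a small sketch of their per-query forms using the standard textbook definitions. It is not the paper's evaluation code; the function names are illustrative, and `average_precision` assumes every relevant document appears somewhere in the ranked list.

```python
import math


def reciprocal_rank(relevances):
    """1 / rank of the first relevant result, or 0.0 if there is none."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def average_precision(relevances):
    """Mean of precision@k taken at each relevant position k.

    Assumption: all relevant documents are present in the list.
    """
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def ndcg(relevances):
    """DCG of the list divided by the DCG of its ideal reordering."""
    def dcg(rels):
        return sum(r / math.log2(i + 1) for i, r in enumerate(rels, start=1))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Graded relevance of one hypothetical ranked result list.
print(reciprocal_rank([0, 1, 0]), average_precision([1, 0, 1]), ndcg([0, 1]))
```

Averaging `reciprocal_rank` and `average_precision` over all queries in a test collection gives MRR and MAP respectively.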

A New Hybrid Document Clustering for PRF-Based Automatic Query Expansion Approach for Effective IR

by Saini, Ashish , Gupta, Yogesh in Algorithms , Analysis , Clustering

2020

Automatic query expansion (AQE) is an effective way to improve information retrieval performance by including additional terms in a user query. The pseudo relevance feedback (PRF) method employed for AQE so far has suffered from the major problem of query drift. With this in view, a new hybrid document clustering for PRF-based AQE approach is proposed in the present article. In this approach, Fuzzy logic and Particle Swarm Optimization (PSO) are used to construct document clusters. Further, a new and effective hybrid PSO and Fuzzy logic-based term weighting approach is followed to find more suitable additional query terms using a weighted score of four IR evidences, which is maximized. Moreover, a combined semantic filtering method, along with query term re-weighting algorithms, is used to remove noisy or semantically irrelevant terms. The performance of the approaches presented in this article is tested and compared with other approaches on three benchmark data sets. The comparative analysis of all the tested approaches illustrates the superior performance of the proposed approach.

Journal Article

Using Query Expansion Techniques and Content-Based Filtering for Personalizing Analysis in Big Data

by Derdour, Makhlouf , Menaceur, Sadek , Bouramoul, Abdelkrim in Big Data , Context , Cubes

2020

Personalizing analyses in a Big Data context is one of the most pressing challenges for business intelligence (BI) administrators. The high volume, high variety, and high velocity of Big Data have made it difficult to store, process, and analyze data in traditional systems. These 3Vs (volume, velocity, and variety) have created many new challenges and make it difficult to extract the specific needs of the users. In addition, the user may be faced with the problem of disorientation: he does not know what information really corresponds to his needs. Information personalization systems aim to overcome these problems of disorientation by using a user profile. The effectiveness of a personalization system in a Big Data context is demonstrated by the relevance and accuracy of the content of the results obtained, according to the needs of the user and the context of the search. Nevertheless, most recent research has focused on personalizing the relational data warehouse and has ignored the integration of the user context into the analysis of OLAP cubes, which are the first to be involved in executing the user's multidimensional queries. To deal with this, the authors propose in this article a dynamic personalization approach in a Big Data context using OLAP cubes, based on Content-Based Filtering and Query Expansion techniques. The first step in the proposal consists of processing the user queries with an enrichment technique in order to integrate the user profile and his search context to reduce the search space in the OLAP cube, and using the expansion technique to extend the scope of the analysis in the OLAP cube. The retrieved results are "as relevant as possible" with respect to the user's initial request. Afterward, they use information filtering techniques such as content-based filtering to personalize the analysis in the reduced data cube according to term frequency and cosine similarity. Finally, they present a case study and experimental results to evaluate and validate their approach.

Journal Article
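
The abstract above mentions content-based filtering driven by term frequency and cosine similarity. The sketch below shows that general idea only: a user profile and candidate items are turned into raw term-frequency vectors and ranked by cosine similarity. The profile text, item names, and function names are hypothetical, and the article's actual treatment of OLAP cubes is not reproduced here.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector using whitespace tokenization."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical user profile and candidate analysis items.
profile = tf_vector("sales revenue by region")
items = {
    "report_a": "quarterly sales revenue by region",
    "report_b": "employee holiday calendar",
}
ranked = sorted(
    items,
    key=lambda name: cosine_similarity(profile, tf_vector(items[name])),
    reverse=True,
)
print(ranked)
```

Items sharing more vocabulary with the profile rank first; an item with no terms in common gets a similarity of zero.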
