Catalogue Search | MBRL

Natural language processing: state of the art, current trends and challenges

by Koli, Aditya , Khurana, Diksha , Khatter, Kiran in Alliances , Automatic summarization , Computer science

2023

Natural language processing (NLP) has recently gained much attention for representing and analyzing human language computationally. Its applications have spread to various fields such as machine translation, email spam detection, information extraction, summarization, medicine, and question answering. In this paper, we first distinguish four phases by discussing different levels of NLP and components of Natural Language Generation, followed by presenting the history and evolution of NLP. We then discuss in detail the state of the art, presenting the various applications of NLP, current trends, and challenges. Finally, we present a discussion on some available datasets, models, and evaluation metrics in NLP.

Journal Article

An Efficient Spam SMS Analysis Model based on Multinomial Naïve Bayes model Using Passive Aggressive Algorithm

by Shobana, J. , Kanchana, D. in Machine Learning

2021

Social media has become a major platform for information consumption. On the one hand, its low cost, easy access, and varied data dissemination lead people to seek out and consume news on social media. On the other hand, it enables the broad spread of “spam,” i.e., low-quality news with deliberately false information. The wide spread of spam has the potential for very negative impacts on people and society. Consequently, the detection of spam on social media has recently become an important research topic that draws tremendous attention. NLP, a branch of artificial intelligence (AI), uses computers to process human natural language and produce useful data. NLP is widely used in text classification tasks such as spam detection, sentiment analysis, and document classification, as well as in text generation and language translation.

Journal Article

The class imbalance problem in deep learning

by Japkowicz, Nathalie , Corizzo, Roberto , Krawczyk, Bartosz in Artificial Intelligence , class imbalance , Computer Science

2024

Deep learning has recently unleashed the ability for machine learning (ML) to make unparalleled strides. It did so by confronting and successfully addressing, at least to a certain extent, the knowledge bottleneck that paralyzed ML and artificial intelligence for decades. The community is currently basking in deep learning’s success, but a question that comes to mind is: have all of the issues previously affecting machine learning systems been solved by deep learning, or do some issues remain for which deep learning is not a bulletproof solution? This question, in the context of class imbalance, is the motivation for this paper. The imbalance problem was first recognized almost three decades ago and has remained a critical challenge, at least for traditional learning approaches. Our goal is to investigate whether the tight dependency between class imbalances, concept complexities, dataset size and classifier performance, known to exist in traditional learning systems, is alleviated in any way in deep learning approaches and to what extent, if any, network depth and regularization can help. To answer these questions we conduct a survey of the recent literature focused on deep learning and the class imbalance problem as well as a series of controlled experiments on both artificial and real-world domains. This allows us to formulate lessons learned about the impact of class imbalance on deep learning models, as well as pose open challenges that should be tackled by researchers in this field.

Journal Article

Classifier calibration: a survey on how to assess and improve predicted class probabilities

by Perello-Nieto, Miquel , Song, Hao , Santos-Rodriguez, Raul in Artificial Intelligence , Calibration , Classification

2023

This paper provides both an introduction to and a detailed overview of the principles and practice of classifier calibration. A well-calibrated classifier correctly quantifies the level of uncertainty or confidence associated with its instance-wise predictions. This is essential for critical applications, optimal decision making, cost-sensitive classification, and for some types of context change. Calibration research has a rich history which predates the birth of machine learning as an academic field by decades. However, a recent increase in interest in calibration has led to new methods and the extension from the binary to the multiclass setting. The space of options and issues to consider is large, and navigating it requires the right set of concepts and tools. We provide both introductory material and up-to-date technical details of the main concepts and methods, including proper scoring rules and other evaluation metrics, visualisation approaches, a comprehensive account of post-hoc calibration methods for binary and multiclass classification, and several advanced topics.
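As a rough illustration of the evaluation metrics the abstract mentions, the sketch below computes the Brier score (a proper scoring rule) and a simple binned calibration error for a binary classifier. It is not taken from the paper: the function names, the equal-width binning, and the choice of ten bins are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted P(y=1) and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted average gap between mean predicted probability and the
    observed positive rate, using equal-width bins (an assumed scheme)."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_p = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(mean_p - pos_rate)
    return ece
```

A classifier that predicts 0.8 for five instances of which four are positive is perfectly calibrated on that bin, so the binned error is zero. One that predicts 1.0 while only half the instances are positive is overconfident, and both scores rise.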

Journal Article

Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods

by Hüllermeier, Eyke , Waegeman, Willem in Machine learning , Statistical analysis , Uncertainty

2021

The notion of uncertainty is of major importance in machine learning and constitutes a key element of machine learning methodology. In line with the statistical tradition, uncertainty has long been perceived as almost synonymous with standard probability and probabilistic predictions. Yet, due to the steadily increasing relevance of machine learning for practical applications and related issues such as safety requirements, new problems and challenges have recently been identified by machine learning scholars, and these problems may call for new methodological developments. In particular, this includes the importance of distinguishing between (at least) two different types of uncertainty, often referred to as aleatoric and epistemic. In this paper, we provide an introduction to the topic of uncertainty in machine learning as well as an overview of attempts so far at handling uncertainty in general and formalizing this distinction in particular.
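The abstract does not say how the two kinds of uncertainty are separated. One common approach, sketched here under that caveat, uses an ensemble of probabilistic classifiers: the entropy of the averaged prediction is treated as total uncertainty, the average entropy of the individual members as the aleatoric part, and their difference (the mutual information) as the epistemic part. The function names and the ensemble setting are assumptions made for this example.

```python
import math
from typing import List, Tuple


def entropy(dist: List[float]) -> float:
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def decompose_uncertainty(
    member_probs: List[List[float]],
) -> Tuple[float, float, float]:
    """Return (total, aleatoric, epistemic) uncertainty for one input.

    member_probs[i] is the class distribution predicted by ensemble member i.
    """
    n = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(m[c] for m in member_probs) / n for c in range(k)]
    total = entropy(mean)  # entropy of the averaged prediction
    aleatoric = sum(entropy(m) for m in member_probs) / n  # mean member entropy
    epistemic = total - aleatoric  # disagreement between the members
    return total, aleatoric, epistemic
```

When every member predicts [0.5, 0.5], the uncertainty is entirely aleatoric. When two members confidently predict opposite classes, the uncertainty is entirely epistemic, because it comes from disagreement between the models rather than from noise that each model reports.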

Journal Article

An Enhanced Aspect-Based Sentiment Analysis Model Based on RoBERTa For Text Sentiment Analysis

by Mohana, Rajni , Chauhan, Amit , Sharma, Aman

2025

Aspect-based sentiment analysis aims to identify the sentiment polarity towards specific aspect phrases within the same sentence or document. Sentiment analysis, the process of automatically determining the underlying attitude or opinion expressed in a text, is one of the most important tasks in natural language processing. The RoBERTa transformer model was pretrained in a self-supervised manner on a substantial corpus of English data. This means it was pretrained solely on raw texts, with an automatic process generating inputs and labels from those texts. No human labelling was involved, allowing it to utilise a vast amount of publicly available data. This work provides a thorough investigation of aspect-based sentiment analysis with RoBERTa. The RoBERTa model and its salient characteristics are outlined, followed by an analysis of how the authors optimise the model for aspect-based sentiment analysis. The authors compare the RoBERTa model with other state-of-the-art models and evaluate its performance on multiple benchmark datasets. The experimental results show that the RoBERTa model is effective for this important natural language processing task, outperforming competing models on sentiment analysis tasks. On the SemEval-2014 variant benchmark datasets, accuracy is highest in the restaurant and laptop domains, at 92.35 % and 82.33 %, respectively.

Journal Article

A survey on semi-supervised learning

by van Engelen, Jesper E , Hoos, Holger H in Algorithms , Classification , Clustering

2020

Semi-supervised learning is the branch of machine learning concerned with using labelled as well as unlabelled data to perform certain learning tasks. Conceptually situated between supervised and unsupervised learning, it permits harnessing the large amounts of unlabelled data available in many use cases in combination with typically smaller sets of labelled data. In recent years, research in this area has followed the general trends observed in machine learning, with much attention directed at neural network-based models and generative learning. The literature on the topic has also expanded in volume and scope, now encompassing a broad spectrum of theory, algorithms and applications. However, no recent surveys exist to collect and organize this knowledge, impeding the ability of researchers and engineers alike to utilize it. Filling this void, we present an up-to-date overview of semi-supervised learning methods, covering earlier work as well as more recent advances. We focus primarily on semi-supervised classification, where the large majority of semi-supervised learning research takes place. Our survey aims to provide researchers and practitioners new to the field as well as more advanced readers with a solid understanding of the main approaches and algorithms developed over the past two decades, with an emphasis on the most prominent and currently relevant work. Furthermore, we propose a new taxonomy of semi-supervised classification algorithms, which sheds light on the different conceptual and methodological approaches for incorporating unlabelled data into the training process. Lastly, we show how the fundamental assumptions underlying most semi-supervised learning algorithms are closely connected to each other, and how they relate to the well-known semi-supervised clustering assumption.

Journal Article

Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

by Li, Jerry , Paduraru, Cosmin , Hester, Todd in Algorithms , Benchmarks , Decision analysis

2021

Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, many of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems. For each challenge, we define it formally in the context of a Markov Decision Process, analyze the effects of the challenge on state-of-the-art learning algorithms, and present some existing attempts at tackling it. We believe that an approach that addresses our set of proposed challenges would be readily deployable in a large number of real-world problems. Our proposed challenges are implemented in a suite of continuous control environments called realworldrl-suite, which we propose as an open-source benchmark.

Journal Article

Hybrid approaches to optimization and machine learning methods: a systematic literature review

by Pereira, Ana I. , Azevedo, Beatriz Flamia , Rocha, Ana Maria A. C. in Algorithms , Artificial Intelligence , Bibliometrics

2024

Real-world problems are increasingly complex and require sophisticated models and algorithms capable of quickly dealing with large data sets and finding optimal solutions. However, there is no perfect method or algorithm; all of them have limitations that can be mitigated or eliminated by combining the strengths of different methodologies. Hybrid algorithms are therefore expected to take advantage of the potential and particularities of each method (optimization and machine learning), integrating the methodologies and making them more efficient. This paper presents an extensive systematic and bibliometric literature review on hybrid methods involving optimization and machine learning techniques for clustering and classification. It aims to identify the potential of methods and algorithms to overcome the difficulties of one or both methodologies when combined. After the description of optimization and machine learning methods, a numerical overview of the works published since 1970 is presented. Moreover, an in-depth state-of-the-art review covering the last three years is presented. Furthermore, a SWOT analysis of the ten most cited algorithms in the collected database is performed, investigating the strengths and weaknesses of the pure algorithms and highlighting the opportunities and threats that have been explored with hybrid methods. Thus, this investigation highlights the most notable works and discoveries involving hybrid methods for clustering and classification, and points out the difficulties of the pure methods and algorithms that can be addressed by drawing on other methodologies, that is, through hybrid methods.

Journal Article

F*: an interpretable transformation of the F-measure

by Kirielle, Nishadi , Hand, David J , Christen, Peter in Algorithms , F stars , Questions

2021

The F-measure, also known as the F1-score, is widely used to assess the performance of classification algorithms. However, some researchers find it lacking in intuitive interpretation, questioning the appropriateness of combining two aspects of performance as conceptually distinct as precision and recall, and also questioning whether the harmonic mean is the best way to combine them. To ease this concern, we describe a simple transformation of the F-measure, which we call F∗ (F-star), which has an immediate practical interpretation.
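The abstract names the transformation but does not give its formula. The sketch below assumes F* = F / (2 - F), which simplifies algebraically to TP / (TP + FP + FN): the share of instances that are either predicted positive or actually positive and that are classified correctly. The function names are illustrative, and the code assumes at least one true positive so that no denominator is zero.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-measure (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f_star(f: float) -> float:
    """Assumed form of the F* transformation: F / (2 - F)."""
    return f / (2 - f)


def f_star_from_counts(tp: int, fp: int, fn: int) -> float:
    """The same quantity written directly in terms of the confusion counts."""
    return tp / (tp + fp + fn)
```

With 6 true positives, 2 false positives, and 4 false negatives, precision is 0.75 and recall is 0.6, so F is about 0.667 and F* is 0.5. That matches the count-based form: 6 of the 12 instances flagged or truly positive are correct.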

Journal Article
