Catalogue Search | MBRL
Explore the vast range of titles available.
113 result(s) for "multi-label text classification"
Knowledge and separating soft verbalizer based prompt-tuning for multi-label short text classification
2024
Multi-label Short Text Classification (MSTC) is a challenging subtask of Multi-Label Text Classification (MLTC) that tags a short text with the most relevant subset of labels from a given label set. Recent studies have attempted to address the MSTC task with MLTC methods and fine-tuning approaches based on Pre-trained Language Models (PLMs), but they suffer from low performance for three reasons: 1) they fail to address the data sparsity of short texts, 2) they do not adapt to the long-tail distribution of labels in multi-label scenarios, and 3) the limited encoding length of PLMs constrains the prompt learning paradigm. Therefore, in this paper, we propose KSSVPT, a Knowledge and Separating Soft Verbalizer based Prompt Tuning method for MSTC, to address these challenges. First, to mitigate the sparsity of short texts, we enhance their semantic information by integrating external knowledge into the soft prompt template. Second, we construct a new soft prompt verbalizer for MSTC, called the separating soft prompt verbalizer, to adapt to the long-tail distribution issue aggravated by multiple labels. Third, we propose a label cluster grouping mechanism for building the prompt template, which directly alleviates the limited encoding length and captures label correlations. Extensive experiments on six benchmark datasets demonstrate the superiority of our model over all competing MLTC and MSTC models in tackling the MSTC task.
Journal Article
Residual diverse ensemble for long-tailed multi-label text classification
2024
Long-tailed multi-label text classification aims to identify a subset of relevant labels from a large candidate label set, where the training datasets usually follow long-tailed label distributions. Many of the previous studies have treated head and tail labels equally, resulting in unsatisfactory performance for identifying tail labels. To address this issue, this paper proposes a novel learning method that combines arbitrary models with two steps. The first step is the “diverse ensemble” that encourages diverse predictions among multiple shallow classifiers, particularly on tail labels, and can improve the generalization of tail labels. The second is the “error correction” that takes advantage of accurate predictions on head labels by the base model and approximates its residual errors for tail labels. Thus, it enables the “diverse ensemble” to focus on optimizing the tail label performance. This overall procedure is called residual diverse ensemble (RDE). RDE is implemented via a single-hidden-layer perceptron and can be used for scaling up to hundreds of thousands of labels. We empirically show that RDE consistently improves many existing models with considerable performance gains on benchmark datasets, especially with respect to the propensity-scored evaluation metrics. Moreover, RDE converges in less than 30 training epochs without increasing the computational overhead.
Journal Article
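The residual-correction idea in the abstract above can be sketched as follows. This is an illustrative toy, not the authors' implementation: the base model is simulated as accurate on head labels and noisy on tail labels, and the "diverse ensemble" members are stand-ins that approximate the residual with independent noise. All shapes and noise levels are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_labels = 8, 5
y_true = rng.integers(0, 2, size=(n_samples, n_labels)).astype(float)

# Simulated base model: accurate on head labels (0, 1),
# noisy on tail labels (2, 3, 4).
base_pred = y_true.copy()
base_pred[:, 2:] += rng.normal(0.0, 0.6, size=(n_samples, 3))

# The quantity the ensemble should approximate.
residual = y_true - base_pred

# "Diverse ensemble" stand-in: each member approximates the residual
# with its own independent noise, so averaging cancels member errors.
n_members = 4
members = [residual + rng.normal(0.0, 0.1, size=residual.shape)
           for _ in range(n_members)]

# Final score = base prediction + averaged residual correction.
correction = np.mean(members, axis=0)
final_pred = base_pred + correction

base_err = np.abs(y_true - base_pred).mean()
final_err = np.abs(y_true - final_pred).mean()
print(f"mean abs error: base={base_err:.3f} corrected={final_err:.3f}")
```

The point of the construction is that the correction step only has to learn the base model's errors, which are concentrated on tail labels, rather than relearning the head labels the base model already handles.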
Label prompt for multi-label text classification
2023
Multi-label text classification has attracted wide attention from scholars due to its value in practical applications. One of its key challenges is how to extract and leverage the correlations among labels, yet directly modeling these correlations in a complex and unknown label space is difficult. In this paper, we propose a Label Prompt Multi-label Text Classification model (LP-MTC), inspired by prompt learning for pre-trained language models. Specifically, we design a set of templates for multi-label text classification, integrate the labels into the input of the pre-trained language model, and jointly optimize with the Masked Language Model (MLM) objective. In this way, self-attention can capture the correlations among labels as well as the semantic information between labels and text, effectively improving model performance. Extensive experiments on multiple datasets demonstrate the effectiveness of our method. Compared with BERT, LP-MTC improves micro-F1 by 3.4% on average across the four public datasets.
Journal Article
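The template idea described above can be sketched as a string-building step: each candidate label is folded into the model input with one masked slot for the MLM objective to fill. The clause format, the `[SEP]`/`[MASK]` tokens, and the function name are assumptions for illustration, not the paper's exact templates.

```python
def build_label_prompt(text: str, labels: list[str],
                       mask_token: str = "[MASK]") -> str:
    """Append one '<label>? [MASK]' clause per candidate label,
    so the MLM can predict a yes/no-style token for each slot."""
    clauses = [f"{label}? {mask_token}." for label in labels]
    return f"{text} [SEP] " + " ".join(clauses)

prompt = build_label_prompt(
    "The striker scored twice in the final.",
    ["sports", "politics", "finance"],
)
print(prompt)
```

Because every label token sits in the same input sequence as the text, self-attention can relate each label both to the document and to the other labels, which is the mechanism the abstract credits for capturing label correlations.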
Multi-label text classification of Indonesian customer reviews using bidirectional encoder representations from transformers language model
2023
Customer reviews are a critical resource for supporting decision-making in various industries. To understand how customers perceive each aspect of a product, we can first identify all aspects discussed in the customer reviews by performing multi-label text classification. In this work, we evaluate the effectiveness of two proposed strategies using a bidirectional encoder representations from transformers (BERT) language model pre-trained on the Indonesian language, referred to as IndoBERT, for multi-label text classification. First, IndoBERT is used as a feature representation combined with a convolutional neural network and extreme gradient boosting (CNN-XGBoost). Second, IndoBERT is used as both the feature representation and the classifier to solve the classification task directly. Additional analysis compares our results with those of a multilingual BERT model. According to our experimental results, our first model, using IndoBERT as the feature representation, significantly outperforms several baselines. Our second model, using IndoBERT as both feature representation and classifier, further improves significantly on the first. In summary, our proposed models improve on the Word2Vec-CNN-XGBoost baseline by 19.19% in accuracy and 6.17% in F1 score.
Journal Article
Job Resumes Recommendation using Integration of Fuzzy Discernibility Matrix Feature Selection and Convolutional Neural Network Multi-label Text Classification
2025
The manual analysis of job resumes poses specific challenges, including the time-intensive process and the high likelihood of human error, emphasizing the need for automation in content-based recommendations. Recent advancements in deep learning, particularly Convolutional Neural Networks (CNN) for Multi-label Text Classification (MLTC), offer a promising solution for addressing these challenges through artificial intelligence. While CNN is renowned for its robust feature extraction capabilities, it faces specific challenges such as managing high-dimensional data and poor data interpretability. To address these limitations, this study employs the Fuzzy Discernibility Matrix (FDM) feature selection technique to determine the relevance of skills for each job vacancy. FDM assigns weights to the features of each job category and ranks their relevance based on the highest scores, which are then utilized in the MLTC CNN model. The integration of FDM and MLTC CNN serves as the foundation for generating content-based recommendations derived from key features in job resumes. This study produces a content-based job recommendation system that displays the top three job categories, accompanied by explanations of the skills supporting each selected category. These recommendations also consider cosine similarity values between analyzed items and the integrated results of FDM and MLTC CNN. The application of FDM for feature weighting has been proven to enhance multi-label text classification outcomes by providing better insights during the feature selection process. With a recall of 97.26%, precision of 94.81%, and accuracy of 98.58%, the MLTC model integrating FDM feature selection and CNN demonstrates robust performance characteristics in content-based job recommendation tasks.
Journal Article
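The recommendation step in the abstract above, ranking job categories by cosine similarity between a resume and FDM-weighted skill features, can be sketched like this. The skill names, weights, and category names are made-up illustrations; only the cosine-ranking mechanism reflects the described approach.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

skills = ["python", "sql", "excel", "tensorflow", "negotiation"]

# Assumed FDM-style feature weights per category: each skill's
# relevance score for that job vacancy.
category_weights = {
    "data_engineer": [0.8, 0.9, 0.2, 0.3, 0.0],
    "ml_engineer":   [0.9, 0.4, 0.1, 0.9, 0.0],
    "sales":         [0.0, 0.1, 0.6, 0.0, 0.9],
}

# Binary bag-of-skills vector extracted from a resume.
resume = [1.0, 1.0, 0.0, 1.0, 0.0]

ranked = sorted(category_weights,
                key=lambda c: cosine(resume, category_weights[c]),
                reverse=True)
top3 = ranked[:3]
print(top3)
```

In the full system the category scores would come from the FDM-weighted CNN classifier rather than raw weight vectors; the sketch only shows how cosine similarity turns weighted features into a top-three ranking.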
Hierarchical contrastive learning for multi-label text classification
2025
Multi-label text classification presents a significant challenge within the field of text classification, particularly due to the hierarchical nature of labels, where labels are organized in a tree-like structure that captures parent-child and sibling relationships. This hierarchy reflects semantic dependencies among labels, with higher-level labels representing broader categories and lower-level labels capturing more specific distinctions. Traditional methods often fail to deeply understand and leverage this hierarchical structure, overlooking the subtle semantic differences and correlations that distinguish one label from another. To address this shortcoming, we introduce a novel method called Hierarchical Contrastive Learning for Multi-label Text Classification (HCL-MTC). Our approach leverages the contrastive knowledge embedded within label relationships by constructing a graph representation that explicitly models the hierarchical dependencies among labels. Specifically, we recast multi-label text classification as a multi-task learning problem, incorporating a hierarchical contrastive loss that is computed through a carefully designed sampling process. This unique loss function enables our model to effectively capture both the correlations and distinctions among labels, thereby enhancing the model’s ability to learn the intricacies of the label hierarchy. Experimental results on widely-used datasets, such as RCV1-v2 and WoS, demonstrate that our proposed HCL-MTC model achieves substantial performance gains compared to baseline methods.
Journal Article
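The hierarchy-aware contrastive term described above can be illustrated with a toy InfoNCE-style loss: a (child, parent) label pair acts as the positive, and unrelated labels sampled from elsewhere in the hierarchy act as negatives. The embeddings, the temperature, and the sampling are illustrative assumptions, not the paper's exact loss.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def contrastive_loss(anchor, positive, negatives, tau=0.5):
    """-log( exp(a.p/tau) / (exp(a.p/tau) + sum_n exp(a.n/tau)) )"""
    pos = math.exp(dot(anchor, positive) / tau)
    neg = sum(math.exp(dot(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))

science = [1.0, 0.1]    # parent label embedding
physics = [0.9, 0.2]    # its child: the positive pair
sports  = [-0.8, 0.7]   # unrelated label: sampled negative

# Low loss when the child is pulled toward its parent...
loss_good = contrastive_loss(physics, science, [sports])
# ...high loss if the "positive" were an unrelated label instead.
loss_bad = contrastive_loss(physics, sports, [science])
print(loss_good, loss_bad)
```

Minimizing such a term pulls labels toward their hierarchical neighbors and pushes them away from unrelated ones, which is the contrast between correlation and distinction the abstract emphasizes.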
A New Hybrid Based on Long Short-Term Memory Network with Spotted Hyena Optimization Algorithm for Multi-Label Text Classification
by Gharehchopogh, Farhad Soleimanian; Khataei Maragheh, Hamed; Majidzadeh, Kambiz
in Classification, Datasets, Deep learning
2022
Multi-Label Text Classification (MLTC), which assigns multiple labels to each document, is an essential task in natural language processing. Traditional text classification methods, such as classical machine learning, usually suffer from data scattering and fail to discover relationships between data. With the development of deep learning algorithms, many authors have applied deep learning to MLTC. In this paper, a novel model called Spotted Hyena Optimizer-Long Short-Term Memory (SHO-LSTM) is proposed for MLTC, based on an LSTM network and the SHO algorithm. In the LSTM network, the Skip-gram method is used to embed words into the vector space. The new model uses the SHO algorithm to optimize the initial weights of the LSTM network, since adjusting the weight matrix in an LSTM is a major challenge: the more accurate the neuron weights, the more accurate the output. The SHO algorithm is a population-based meta-heuristic that mimics the pack-hunting behavior of spotted hyenas. Each candidate solution is encoded as a hyena, and the hyenas approach the optimal solution by following the leader hyena. Four datasets (RCV1-v2, EUR-Lex, Reuters-21578, and Bookmarks) are used to evaluate the proposed model. The assessments demonstrate that the proposed model has a higher accuracy rate than LSTM, Genetic Algorithm-LSTM (GA-LSTM), Particle Swarm Optimization-LSTM (PSO-LSTM), Artificial Bee Colony-LSTM (ABC-LSTM), Harmony Search Algorithm-LSTM (HAS-LSTM), and Differential Evolution-LSTM (DE-LSTM). The accuracy improvement of SHO-LSTM over LSTM on the four datasets is 7.52%, 7.12%, 1.92%, and 4.90%, respectively.
Journal Article
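The population-based search step described above, candidates moving toward the current best solution, can be sketched on a toy objective. This is a minimal surrogate in the spirit of the leader-following update, not the paper's SHO equations or its LSTM weight optimization; the step size, noise level, and target function are assumptions.

```python
import random

random.seed(1)

def loss(w):
    # Stand-in for the network loss: squared distance to an
    # unknown optimum (here a fixed 3-dimensional target).
    target = [0.5, -0.3, 0.8]
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))

# Population of candidate weight vectors ("hyenas").
pop = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(10)]

for _ in range(50):
    # The best candidate acts as the leader of the pack.
    leader = min(pop, key=loss)
    # Each hyena moves halfway toward the leader, with small noise
    # so the pack keeps exploring around it.
    pop = [
        [wi + 0.5 * (li - wi) + random.gauss(0, 0.02)
         for wi, li in zip(w, leader)]
        for w in pop
    ]

best = min(pop, key=loss)
print(f"best loss after search: {loss(best):.4f}")
```

In the paper's setting, the candidate vectors would be initial LSTM weights and the loss would come from training performance; the sketch only shows the follow-the-leader dynamic that drives the search.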
Enhancing power equipment defect identification through multi-label classification methods
2024
Accurate identification and classification of equipment defects are essential for assessing the health of power equipment and making informed maintenance decisions. Traditional defect classification methods, which rely on subjective manual records and intricate defect descriptions, have proven to be inefficient. Existing approaches often evaluate equipment status solely based on defect grade. To enhance the precision of power equipment defect identification, we developed a multi-label classification dataset by compiling historical defect records. Furthermore, we assessed the performance of 11 established multi-label classification methods on this dataset, encompassing both traditional machine learning and deep learning methods. Experimental results reveal that methods considering label correlations exhibit significant performance advantages. By employing balanced loss functions, we effectively address the challenge of sample imbalance across various categories, thereby enhancing classification accuracy. Additionally, segmenting the power equipment defect classification task, which involves numerous labels, into label recall and ranking stages can substantially improve classification performance. The dataset we have created is available for further research by other scholars in the field.
Journal Article
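The balanced-loss idea the abstract credits with handling sample imbalance can be sketched as an inverse-frequency-weighted binary cross-entropy, one common choice for this problem. The weighting scheme, label frequencies, and predictions below are illustrative assumptions, not the paper's exact loss.

```python
import math

def balanced_bce(y_true, y_prob, label_freq):
    """Per-example binary cross-entropy where each label's term is
    weighted inversely to its frequency, so rare labels count more."""
    total = sum(label_freq)
    n = len(label_freq)
    weights = [total / (n * f) for f in label_freq]  # rare -> large
    loss = 0.0
    for yt, yp, w in zip(y_true, y_prob, weights):
        loss -= w * (yt * math.log(yp) + (1 - yt) * math.log(1 - yp))
    return loss / n

# Two defect categories: a frequent one (900 samples), a rare one (100).
freq = [900, 100]

# The same prediction error costs more when it falls on the rare label:
freq_err = balanced_bce([1, 0], [0.7, 0.1], freq)  # error on frequent
rare_err = balanced_bce([0, 1], [0.1, 0.7], freq)  # error on rare
print(freq_err, rare_err)
```

Upweighting rare categories this way counteracts the tendency of an unweighted loss to be dominated by the frequent labels.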
Label-specific multi-label text classification based on dynamic graph convolutional networks
2025
Multi-label text classification is a key task in natural language processing, aiming to assign each text to multiple predefined categories simultaneously. Existing neural network models usually learn the same text representation for different labels, which limits the effectiveness of the models in capturing deep semantics and distinguishing between similar labels; moreover, these models tend to ignore inter-label correlation, leading to loss of information. To overcome these limitations, we propose a novel label-specific dynamic graph convolutional network (LDGCN). This network combines convolutional operations and BiLSTM to model text sequences and obtains label-specific text representations through a label attention mechanism. In addition, LDGCN improves the dynamic graph convolutional network by utilizing statistical label co-occurrence and label reconstruction maps to effectively capture inter-label dependencies and adaptive interactions between label-specific semantic components. Extensive experiments on the RCV1, AAPD, and EUR-Lex datasets show that our model achieves 96.92%, 86.30%, and 81.42% on the P@1 metrics, respectively, and demonstrates a significant advantage in dealing with tail labels.
Journal Article
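The statistical label co-occurrence graph mentioned above can be sketched as it is commonly built for label GCNs: count how often labels appear together, convert counts to conditional probabilities, and normalize for graph convolution. The tiny label sets and the exact normalization are illustrative assumptions, not the paper's construction.

```python
import numpy as np

# Toy training data: each example's set of gold labels.
label_sets = [
    {"sports", "health"},
    {"sports", "politics"},
    {"sports", "health"},
    {"finance", "politics"},
]
labels = sorted({l for s in label_sets for l in s})
idx = {l: i for i, l in enumerate(labels)}
n = len(labels)

# Pairwise co-occurrence counts and per-label frequencies.
counts = np.zeros((n, n))
occur = np.zeros(n)
for s in label_sets:
    for a in s:
        occur[idx[a]] += 1
        for b in s:
            if a != b:
                counts[idx[a], idx[b]] += 1

# P(label_j | label_i): row-normalize by each label's frequency.
cond = counts / occur[:, None]

# Add self-loops and normalize, the usual preprocessing before
# feeding an adjacency matrix into graph convolution layers.
adj = cond + np.eye(n)
deg = adj.sum(axis=1)
norm_adj = adj / np.sqrt(deg[:, None] * deg[None, :])

print(labels)
print(np.round(norm_adj, 2))
```

A GCN propagating label embeddings over this matrix lets each label's representation absorb information from the labels it tends to co-occur with, which is the inter-label dependency mechanism the abstract describes.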
When graph convolution meets double attention: online privacy disclosure detection with multi-label text classification
2024
With the rise of Web 2.0 platforms such as online social media, people's private information, such as their location, occupation, and even family details, is often inadvertently disclosed through online discussions. It is therefore important to detect such unwanted privacy disclosures to alert the people affected and the online platform. In this paper, privacy disclosure detection is modeled as a multi-label text classification (MLTC) problem, and a new detection model is proposed to construct an MLTC classifier for online privacy disclosures. The classifier takes an online post as input and outputs multiple labels, each reflecting a possible privacy disclosure. The proposed representation method combines three sources of information: the input text itself, the label-to-text correlation, and the label-to-label correlation. A double-attention mechanism combines the first two sources, and a graph convolutional network extracts the third, which is then used to help fuse the features extracted from the first two. Our extensive experimental results on a public dataset of privacy-disclosing posts on Twitter demonstrate that the proposed method significantly and consistently outperforms other state-of-the-art methods on all key performance indicators.
Journal Article