141 results for "noisy labels"
SplitNet: Learnable Clean-Noisy Label Splitting for Learning with Noisy Labels
Annotating the dataset with high-quality labels is crucial for deep networks’ performance, but in real-world scenarios, the labels are often contaminated by noise. To address this, some methods were recently proposed to automatically split clean and noisy labels among training data, and learn a semi-supervised learner in a Learning with Noisy Labels (LNL) framework. However, they leverage a handcrafted module for clean-noisy label splitting, which induces a confirmation bias in the semi-supervised learning phase and limits the performance. In this paper, for the first time, we present a learnable module for clean-noisy label splitting, dubbed SplitNet, and a novel LNL framework which complementarily trains the SplitNet and main network for the LNL task. We also propose to use a dynamic threshold based on split confidence by SplitNet to optimize the semi-supervised learner better. To enhance SplitNet training, we further present a risk hedging method. Our proposed method performs at a state-of-the-art level, especially in high noise ratio settings on various LNL benchmarks.
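The abstract above contrasts SplitNet's learnable splitting with the handcrafted clean-noisy splitting modules used by prior LNL methods. A minimal sketch of the common handcrafted baseline (the small-loss heuristic, not the authors' SplitNet) illustrates what such a splitting module does; the assumed known noise ratio is an illustrative simplification:

```python
import numpy as np

def small_loss_split(losses, noise_ratio):
    """Split sample indices into presumed-clean and presumed-noisy sets by
    the small-loss heuristic: samples with lower training loss are more
    likely to carry clean labels. `noise_ratio` is the assumed fraction
    of noisy labels in the dataset."""
    n = len(losses)
    n_clean = int(round(n * (1.0 - noise_ratio)))
    order = np.argsort(losses)        # indices sorted by ascending loss
    clean_idx = order[:n_clean]
    noisy_idx = order[n_clean:]
    return clean_idx, noisy_idx

# toy per-sample losses: the two largest (indices 1 and 3) get flagged noisy
losses = np.array([0.1, 2.3, 0.2, 1.8, 0.05, 0.4])
clean, noisy = small_loss_split(losses, noise_ratio=1 / 3)
```

SplitNet's contribution is to replace this fixed rule with a trained module and a dynamic confidence-based threshold, avoiding the confirmation bias such a hard cutoff induces.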
Typicality- and instance-dependent label noise-combating: a novel framework for simulating and combating real-world noisy labels for endoscopic polyp classification
Learning with noisy labels aims to train neural networks with noisy labels. Current models handle instance-independent label noise (IIN) well; however, they fall short with real-world noise. In medical image classification, atypical samples frequently receive incorrect labels, rendering instance-dependent label noise (IDN) an accurate representation of real-world scenarios. However, the current IDN approaches fail to consider the typicality of samples, which hampers their ability to address real-world label noise effectively. To alleviate the issues, we introduce typicality- and instance-dependent label noise (TIDN) to simulate real-world noise and establish a TIDN-combating framework to combat label noise. Specifically, we use the sample’s distance to decision boundaries in the feature space to represent typicality. The TIDN is then generated according to typicality. We establish a TIDN-attention module to combat label noise and learn the transition matrix from latent ground truth to the observed noisy labels. A recursive algorithm that enables the network to make correct predictions with corrections from the learned transition matrix is proposed. Our experiments demonstrate that the TIDN simulates real-world noise more closely than the existing IIN and IDN. Furthermore, the TIDN-combating framework demonstrates superior classification performance when training with simulated TIDN and actual real-world noise.
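The key idea above is that atypical samples (those near decision boundaries) are the ones most likely to be mislabeled. A simplified noise generator along these lines, where flip probability decreases with margin, can be sketched as follows; this is an illustrative simplification, not the paper's exact TIDN generator:

```python
import numpy as np

rng = np.random.default_rng(0)

def typicality_noise(labels, margins, n_classes, max_flip=0.4):
    """Instance-dependent label corruption: atypical samples (small margin,
    i.e. close to the decision boundary) are flipped to a random other
    class with higher probability. Hypothetical simplification of TIDN."""
    margins = np.asarray(margins, dtype=float)
    span = margins.max() - margins.min() + 1e-12
    # smallest margin -> flip prob. max_flip, largest margin -> flip prob. 0
    p_flip = max_flip * (1.0 - (margins - margins.min()) / span)
    noisy = labels.copy()
    for i, p in enumerate(p_flip):
        if rng.random() < p:
            choices = [c for c in range(n_classes) if c != labels[i]]
            noisy[i] = rng.choice(choices)
    return noisy

labels = np.array([0, 1, 0, 1])
margins = [0.05, 4.0, 0.1, 3.5]   # sample 1 is most typical (largest margin)
noisy = typicality_noise(labels, margins, n_classes=2)
```

By construction the most typical sample (largest margin) has zero flip probability, matching the intuition that clearly typical examples rarely receive wrong labels.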
Robust Object Re-identification with Coupled Noisy Labels
In this paper, we reveal and study a new challenging problem faced by object Re-IDentification (ReID), i.e., Coupled Noisy Labels (CNL), which refers to Noisy Annotation (NA) and the accompanying Noisy Correspondence (NC). Specifically, NA refers to the wrongly annotated identity of samples during manual labeling, and NC refers to mismatched training pairs, including false positives and false negatives, whose correspondences are established based on the NA. Clearly, CNL will limit the success of the object ReID paradigm that simultaneously performs identity-aware discrimination learning on the data samples and pairwise similarity learning on the training pairs. To overcome this practical but overlooked problem, we propose a robust object ReID method dubbed Learning with Coupled Noisy Labels (LCNL). In brief, LCNL first estimates the annotation confidences of samples and then adaptively divides the training pairs into four groups using the confidences to rectify the correspondences. After that, LCNL employs a novel objective function to achieve robust object ReID with theoretical guarantees. To verify the effectiveness of LCNL, we conduct extensive experiments on five benchmark datasets in single- and cross-modality object ReID tasks, comparing against 14 algorithms. The code can be accessed at https://github.com/XLearning-SCU/2024-IJCV-LCNL.
Multi-label noisy samples in underwater inspection from the oil and gas industry
Deep learning has shown remarkable success in various machine learning tasks, including multi-label classification. Multi-label classification is a supervised task where an input instance can be associated with multiple labels simultaneously, instead of exclusively one, as in the single-label scenario. When building a multi-label dataset for real-world applications, a recurrent problem is the presence of noisy labels. In this context, noisy labels refer to mislabeled data, which can weaken the performance of supervised models. Although this issue is well explored for single-label noise, it is still an emerging topic for multi-label applications. In this work, a novel deep learning model that handles multi-label noise is proposed, combining the Small Loss Approach Multi-label (SLAM) with a joint loss in order to automatically identify and rectify noisy labels. The model outperforms state-of-the-art (SOTA) models by 2.5% in F1-score on the noisy version of the UcMerced benchmark. A new open noisy version of the TreeSATAI benchmark was developed and is now released, on which the performance gains reached 1.8% in F1-score. Furthermore, the model was able to reduce the presence of noise from 25% to 5% in both sets. In addition, we evaluate the model on a real-world application of underwater inspections, to assist with multi-label classification for an oil and gas company. Our model achieved gains in F1-score of 10% when compared to a standard model (without noise-handling techniques), and up to 2.7% and 1.9% when compared to the SOTA models SLAM and JoCoR, respectively.
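In the multi-label setting described above, noise identification must operate per (sample, label) entry rather than per sample. A hedged sketch of that idea, flagging individual label entries whose binary cross-entropy loss is unusually large (illustrative only; SLAM's exact selection rule may differ):

```python
import numpy as np

def per_label_bce(probs, targets, eps=1e-7):
    """Element-wise binary cross-entropy for multi-label targets."""
    p = np.clip(probs, eps, 1 - eps)
    return -(targets * np.log(p) + (1 - targets) * np.log(1 - p))

def flag_noisy_labels(probs, targets, quantile=0.75):
    """Multi-label variant of the small-loss idea: flag individual
    (sample, label) entries whose loss falls in the top quantile as
    suspected label noise."""
    losses = per_label_bce(probs, targets)
    return losses > np.quantile(losses, quantile)

# sample 0 is annotated positive for label 1, but the model is confident
# it is negative (p = 0.05) -> that single entry stands out as suspect
probs = np.array([[0.9, 0.05], [0.8, 0.1]])
targets = np.array([[1, 1], [1, 0]])
flags = flag_noisy_labels(probs, targets)
```

Flagged entries can then be rectified (e.g., relabeled toward the model prediction) rather than discarding the whole sample, which matters when other labels of the same sample are correct.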
An instance-dependent simulation framework for learning with label noise
We propose a simulation framework for generating instance-dependent noisy labels via a pseudo-labeling paradigm. We show that the distribution of the synthetic noisy labels generated with our framework is closer to human labels compared to independent and class-conditional random flipping. Equipped with controllable label noise, we study the negative impact of noisy labels across a few practical settings to understand when label noise is more problematic. We also benchmark several existing algorithms for learning with noisy labels and compare their behavior on our synthetic datasets and on the datasets with independent random label noise. Additionally, with the availability of annotator information from our simulation framework, we propose a new technique, Label Quality Model (LQM), that leverages annotator features to predict and correct against noisy labels. We show that by adding LQM as a label correction step before applying existing noisy label techniques, we can further improve the models’ performance. The synthetic datasets that we generated in this work are released at https://github.com/deepmind/deepmind-research/tree/master/noisy_label .
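The class-conditional random-flipping baseline that the framework above is compared against is fully specified by a noise transition matrix. A minimal sketch of that baseline (not the paper's instance-dependent generator):

```python
import numpy as np

rng = np.random.default_rng(42)

def flip_with_transition_matrix(labels, T):
    """Class-conditional label noise: each clean label y is replaced by a
    draw from row y of the transition matrix T, where
    T[i, j] = P(noisy label = j | clean label = i)."""
    T = np.asarray(T, dtype=float)
    assert np.allclose(T.sum(axis=1), 1.0), "rows of T must sum to 1"
    return np.array([rng.choice(len(T), p=T[y]) for y in labels])

# 3-class example with 20% symmetric noise
T = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
clean = np.array([0, 1, 2, 0, 1, 2])
noisy = flip_with_transition_matrix(clean, T)
```

Because the flip probability depends only on the class, every instance of a class is equally likely to be corrupted; the paper's point is that human label errors concentrate on particular hard instances, which this model cannot express.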
Variational Rectification Inference for Learning with Noisy Labels
Label noise has been broadly observed in real-world datasets. To mitigate the negative impact of overfitting to label noise in deep models, effective strategies (e.g., re-weighting or loss rectification) have been broadly applied in prevailing approaches, generally learned under the meta-learning scenario. Despite the robustness to noise achieved by probabilistic meta-learning models, they usually suffer from model collapse that degrades generalization performance. In this paper, we propose variational rectification inference (VRI) to formulate the adaptive rectification of loss functions as an amortized variational inference problem and derive the evidence lower bound under the meta-learning framework. Specifically, VRI is constructed as a hierarchical Bayes model by treating the rectifying vector as a latent variable, which can rectify the loss of a noisy sample with extra randomness regularization and is, therefore, more robust to label noise. To achieve inference of the rectifying vector, we approximate its conditional posterior with an amortization meta-network. By introducing the variational term in VRI, the conditional posterior is estimated accurately and avoids collapsing to a Dirac delta function, which significantly improves generalization performance. The elaborated meta-network and prior network adhere to the smoothness assumption, enabling the generation of reliable rectification vectors. Given a set of clean meta-data, VRI can be efficiently meta-learned within bi-level optimization programming. Moreover, theoretical analysis guarantees that the meta-network can be efficiently learned with our algorithm. Comprehensive comparison experiments and analyses validate its effectiveness for robust learning with noisy labels, particularly in the presence of open-set noise.
Nrat: towards adversarial training with inherent label noise
Adversarial training (AT) has been widely recognized as the most effective defense approach against adversarial attacks on deep neural networks and it is formulated as a min-max optimization. Most AT algorithms are geared towards research-oriented datasets such as MNIST, CIFAR10, etc., where the labels are generally correct. However, noisy labels, e.g., mislabelling, are inevitable in real-world datasets. In this paper, we investigate AT with inherent label noise, where the training dataset itself contains mislabeled samples. We first empirically show that the performance of AT typically degrades as the label noise rate increases. Then, we propose a Noisy-Robust Adversarial Training (NRAT) algorithm, which leverages the recent advancements in learning with noisy labels to enhance the performance of AT in the presence of label noise. For experimental comparison, we consider two essential metrics in AT: (i) trade-off between natural and robust accuracy; (ii) robust overfitting. Our experiments show that NRAT’s performance is on par with, or better than, the state-of-the-art AT methods on both evaluation metrics. Our code is publicly available at: https://github.com/TrustAI/NRAT .
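The min-max formulation of AT mentioned above has a closed-form inner maximization for a linear model under an L-infinity perturbation budget: the fast gradient sign method. A self-contained sketch on a logistic model (an illustration of the AT objective, not the NRAT algorithm itself):

```python
import numpy as np

def logistic_loss(w, x, y):
    """Logistic loss for a linear model; y in {-1, +1}."""
    return np.log1p(np.exp(-y * np.dot(w, x)))

def fgsm_example(w, x, y, eps):
    """Inner maximization of the AT min-max problem: for a linear model and
    an L-infinity ball of radius eps, the loss-maximizing perturbation is
    eps times the sign of the loss gradient w.r.t. the input."""
    margin = y * np.dot(w, x)
    grad_x = -y * w / (1.0 + np.exp(margin))   # d loss / d x
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -2.0])
x = np.array([0.5, 0.3])
y = 1
x_adv = fgsm_example(w, x, y, eps=0.1)
# adversarial training would then minimize logistic_loss(w, x_adv, y) over w
```

With mislabeled samples (y itself wrong), this inner step maximizes loss toward the wrong label, which is one intuition for why AT degrades as the noise rate grows.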
Learning with Noisy Correspondence
This paper studies a new learning paradigm for noisy labels, i.e., noisy correspondence (NC). Unlike the well-studied noisy labels that concern errors in the category annotation of a sample, NC refers to errors in the alignment relationship between two data points. Such false positive pairs are common, especially in data harvested from the Internet, yet they are neglected by most existing works. Taking cross-modal retrieval as a showcase, we propose a method called learning with noisy correspondence (LNC). In brief, LNC first roughly separates the clean and noisy subsets from the original data and then rectifies the false positive pairs using a novel adaptive prediction function. Finally, LNC adopts a novel triplet loss with soft margins to endow cross-modal retrieval with robustness to the NC. To verify the effectiveness of the proposed LNC, we conduct experiments on six benchmark datasets in image-text and video-text retrieval tasks. Beyond the effectiveness of LNC, the experimental results show the necessity of an explicit solution to the NC, which is faced not only by the standard model training paradigm but also by the pre-training and fine-tuning paradigms.
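One plausible reading of the "triplet loss with soft margins" mentioned above is a ranking loss whose margin shrinks for pairs suspected of being false positives, so they impose a weaker constraint. A hedged sketch of that idea (an illustrative form only; the paper's exact formulation may differ):

```python
def soft_margin_triplet(sim_pos, sim_neg, confidence, base_margin=0.2):
    """Triplet (ranking) loss with a soft margin: the margin is scaled by
    the estimated confidence that the positive pair is truly matched, so a
    suspected false positive (confidence near 0) barely constrains the
    embedding, while a trusted pair enforces the full margin."""
    margin = base_margin * confidence
    return max(0.0, margin - sim_pos + sim_neg)
```

For similarities sim_pos = 0.9 and sim_neg = 0.8, a fully trusted pair (confidence 1.0) still incurs a positive loss, while a pair judged to be a false positive (confidence 0.0) incurs none.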
A Robust Dynamic Classifier Selection Approach for Hyperspectral Images with Imprecise Label Information
Supervised hyperspectral image (HSI) classification relies on accurate label information. However, it is not always possible to collect perfectly accurate labels for training samples. This motivates the development of classifiers that are sufficiently robust to some reasonable amounts of errors in data labels. Despite the growing importance of this aspect, it has not been sufficiently studied in the literature yet. In this paper, we analyze the effect of erroneous sample labels on probability distributions of the principal components of HSIs, and provide in this way a statistical analysis of the resulting uncertainty in classifiers. Building on the theory of imprecise probabilities, we develop a novel robust dynamic classifier selection (R-DCS) model for data classification with erroneous labels. Particularly, spectral and spatial features are extracted from HSIs to construct two individual classifiers for the dynamic selection, respectively. The proposed R-DCS model is based on the robustness of the classifiers’ predictions: the extent to which a classifier can be altered without changing its prediction. We provide three possible selection strategies for the proposed model with different computational complexities and apply them on three benchmark data sets. Experimental results demonstrate that the proposed model outperforms the individual classifiers it selects from and is more robust to errors in labels compared to widely adopted approaches.
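The selection criterion described above, "the extent to which a classifier can be altered without changing its prediction", can be illustrated with a simple probability-margin proxy: between two classifiers, trust the one whose top prediction is harder to overturn. This is a deliberately simplified stand-in for the paper's imprecise-probability formulation:

```python
import numpy as np

def margin(probs):
    """Gap between the top-1 and top-2 class probabilities: a simple proxy
    for how robust a prediction is to perturbation of the classifier."""
    p = np.sort(np.asarray(probs, dtype=float))[::-1]
    return p[0] - p[1]

def select_prediction(probs_a, probs_b):
    """Dynamic classifier selection: return the predicted class of the
    classifier (e.g., spectral vs. spatial) whose output has the larger
    probability margin."""
    pa, pb = np.asarray(probs_a), np.asarray(probs_b)
    return int(np.argmax(pa)) if margin(pa) >= margin(pb) else int(np.argmax(pb))

# the first classifier is uncertain (0.5 vs 0.4); the second is confident,
# so its prediction (class 1) is selected
pred = select_prediction([0.5, 0.4, 0.1], [0.1, 0.8, 0.1])
```

Selecting per sample in this way is what makes the scheme "dynamic": different classifiers can win on different pixels of the image.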
Where the White Continent Is Blue: Deep Learning Locates Bare Ice in Antarctica
In some areas of Antarctica, blue‐colored bare ice is exposed at the surface. These blue ice areas (BIAs) can trap meteorites or old ice and are vital for understanding the climatic history. By combining multi‐sensor remote sensing data (MODIS, RADARSAT‐2, and TanDEM‐X) in a deep learning framework, we map blue ice across the continent at 200‐m resolution. We use a novel methodology for image segmentation with "noisy" labels to learn an underlying "clean" pattern with a neural network. In total, BIAs cover ca. 140,000 km2 (∼1%) of Antarctica, of which nearly 50% is located within 20 km of the grounding line. There, the low albedo of blue ice enhances melt‐water production, and its mapping is crucial for mass balance studies that determine the stability of the ice sheet. Moreover, the map provides input for fieldwork missions and can act as a constraint for other geophysical mapping efforts. Plain Language Summary: While most of the continent of Antarctica is covered by snow, in some areas ice is exposed at the surface, with a typical blue color. At lower elevations, blue ice enhances melt‐water production, which is important for studying the future of the ice sheet. Moreover, scientific teams frequently visit blue ice areas (BIAs) as they act as traps for meteorites and very old ice. In this study, we map the extent and the exact location of BIAs using various satellite observations. These diverse observations are efficiently combined in an artificial intelligence algorithm. We develop the algorithm so that it can learn to map blue ice even though existing training labels, which teach the algorithm what blue ice looks like, are imperfect. We quantify that the new map scores better on various performance metrics compared to the current most‐used blue ice map. Moreover, for the first time, we estimate uncertainties of the detection of blue ice. The map indicates that ca. 1% of the surface of Antarctica exposes blue ice and will be important for fieldwork missions and understanding surface processes leading to melt and potential sea level rise. Key Points: We map blue ice areas in Antarctica by combining multi‐sensor satellite observations in a convolutional neural network. Blue ice covers ca. 140,000 km2 (∼1%) of Antarctica, of which ca. 50% is located in the grounding zone. Our map will improve mass balance estimates and studies on ice‐shelf stability, and will support searches for meteorites or old ice.