788 result(s) for "adversarial robustness"
Prevalence of neural collapse during the terminal phase of deep learning training
Modern practice for training classification deepnets involves a terminal phase of training (TPT), which begins at the epoch where training error first vanishes. During TPT, the training error stays effectively zero, while training loss is pushed toward zero. Direct measurements of TPT, for three prototypical deepnet architectures and across seven canonical classification datasets, expose a pervasive inductive bias we call neural collapse (NC), involving four deeply interconnected phenomena. (NC1) Cross-example within-class variability of last-layer training activations collapses to zero, as the individual activations themselves collapse to their class means. (NC2) The class means collapse to the vertices of a simplex equiangular tight frame (ETF). (NC3) Up to rescaling, the last-layer classifiers collapse to the class means or in other words, to the simplex ETF (i.e., to a self-dual configuration). (NC4) For a given activation, the classifier’s decision collapses to simply choosing whichever class has the closest train class mean (i.e., the nearest class center [NCC] decision rule). The symmetric and very simple geometry induced by the TPT confers important benefits, including better generalization performance, better robustness, and better interpretability.
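The geometry named in (NC2) and (NC4) can be made concrete with a small sketch. It is an illustration only and is not taken from the paper: the function names `simplex_etf`, `cosine`, and `ncc_predict` are hypothetical, and the ETF is built in K dimensions from the standard construction sqrt(K/(K-1)) * (I - (1/K) * 11^T), under which every vertex has unit norm and every pair of vertices has cosine -1/(K-1).

```python
import math

def simplex_etf(k):
    """Return k unit vectors in R^k forming a simplex equiangular tight frame.

    Row i is sqrt(k/(k-1)) * (e_i - (1/k) * ones); this is one standard
    construction, assumed here for illustration.
    """
    scale = math.sqrt(k / (k - 1))
    return [[scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
            for i in range(k)]

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ncc_predict(x, class_means):
    """Nearest class center rule: index of the closest mean (Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, m)) for m in class_means]
    return dists.index(min(dists))

# Example with 4 classes: the class means sit on the ETF vertices, and an
# activation close to vertex 2 is assigned to class 2.
means = simplex_etf(4)
activation = [v + 0.05 for v in means[2]]
predicted = ncc_predict(activation, means)
```

With four classes, each pair of class means in this construction has cosine -1/3, the largest equal pairwise separation that four unit vectors can have.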
First three years of the international verification of neural networks competition (VNN-COMP)
This paper presents a summary and meta-analysis of the first three iterations of the annual International Verification of Neural Networks Competition (VNN-COMP), held in 2020, 2021, and 2022. In the VNN-COMP, participants submit software tools that analyze whether given neural networks satisfy specifications describing their input-output behavior. These neural networks and specifications cover a variety of problem classes and tasks, corresponding to safety and robustness properties in image classification, neural control, reinforcement learning, and autonomous systems. We summarize the key processes, rules, and results, present trends observed over the last three years, and provide an outlook into possible future developments.
Adversarial machine learning: a review of methods, tools, and critical industry sectors
The rapid advancement of Artificial Intelligence (AI), particularly Machine Learning (ML) and Deep Learning (DL), has produced high-performance models widely used in various applications, ranging from image recognition and chatbots to autonomous driving and smart grid systems. However, security threats arise from the vulnerabilities of ML models to adversarial attacks and data poisoning, posing risks such as system malfunctions and decision errors. Meanwhile, data privacy concerns arise, especially with personal data being used in model training, which can lead to data breaches. This paper surveys the Adversarial Machine Learning (AML) landscape in modern AI systems, while focusing on the dual aspects of robustness and privacy. Initially, we explore adversarial attacks and defenses using comprehensive taxonomies. Subsequently, we investigate robustness benchmarks alongside open-source AML technologies and software tools that ML system stakeholders can use to develop robust AI systems. Lastly, we delve into the landscape of AML in four industry fields, namely automotive, digital healthcare, electrical power and energy systems (EPES), and Large Language Model (LLM)-based Natural Language Processing (NLP) systems, analyzing attacks, defenses, and evaluation concepts, thereby offering a holistic view of the modern AI-reliant industry and promoting enhanced ML robustness and privacy preservation in the future.
AROID: Improving Adversarial Robustness Through Online Instance-Wise Data Augmentation
Deep neural networks are vulnerable to adversarial examples. Adversarial training (AT) is an effective defense against adversarial examples. However, AT is prone to overfitting, which degrades robustness substantially. Recently, data augmentation (DA) was shown to be effective in mitigating robust overfitting if appropriately designed and optimized for AT. This work proposes a new method to automatically learn online, instance-wise DA policies to improve robust generalization for AT. This is the first automated DA method specific to robustness. A novel policy learning objective, consisting of Vulnerability, Affinity and Diversity, is proposed and shown to be sufficiently effective and efficient to be practical for automatic DA generation during AT. Importantly, our method dramatically reduces the cost of policy search from the 5000 h of AutoAugment and the 412 h of IDBH to 9 h, making automated DA more practical to use for adversarial robustness. This allows our method to efficiently explore a large search space for a more effective DA policy and evolve the policy as training progresses. Empirically, our method is shown to outperform all competitive DA methods across various model architectures and datasets. Our DA policy reinforced vanilla AT to surpass several state-of-the-art AT methods regarding both accuracy and robustness. It can also be combined with those advanced AT methods to further boost robustness. Code and pre-trained models are available at: https://github.com/TreeLLi/AROID.
Adversarial robustness improvement for deep neural networks
Deep neural networks (DNNs) are key components for the implementation of autonomy in systems that operate in highly complex and unpredictable environments (self-driving cars, smart traffic systems, smart manufacturing, etc.). It is well known that DNNs are vulnerable to adversarial examples, i.e. minimal and usually imperceptible perturbations, applied to their inputs, leading to false predictions. This threat poses critical challenges, especially when DNNs are deployed in safety or security-critical systems, and makes urgent the need for defences that can improve the trustworthiness of DNN functions. Adversarial training has proven effective in improving the robustness of DNNs against a wide range of adversarial perturbations. However, a general framework for adversarial defences is needed that will extend beyond a single-dimensional assessment of robustness improvement; it is essential to consider simultaneously several distance metrics and adversarial attack strategies. Using such an approach, we report the results from extensive experimentation on adversarial defence methods that could improve DNNs' resilience to adversarial threats. We wrap up by introducing a general adversarial training methodology, which, according to our experimental results, opens prospects for a holistic defence against a range of diverse types of adversarial perturbations.
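To make the notion of an adversarial example and its distance metrics concrete, here is a minimal sketch of a one-step sign-gradient perturbation (the fast gradient sign method) against a tiny logistic-regression model. It is not the methodology of the paper above: the model, its weights, the input, and the names `predict_proba`, `fgsm`, `linf`, and `l2` are all assumptions chosen for illustration. For cross-entropy loss on a logistic model, the gradient with respect to the input is (p - y) * w, which is what the code uses.

```python
import math

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def sign(g):
    """Return -1, 0, or 1 according to the sign of g."""
    return (g > 0) - (g < 0)

def fgsm(w, b, x, y, eps):
    """One-step L-infinity perturbation that increases the loss on (x, y).

    For cross-entropy loss, d(loss)/dx = (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

def linf(u, v):
    """L-infinity distance: the largest per-coordinate change."""
    return max(abs(a - b) for a, b in zip(u, v))

def l2(u, v):
    """Euclidean distance between two inputs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical model and input, chosen only for illustration.
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.3, 0.1, 0.2], 1
x_adv = fgsm(w, b, x, y, eps=0.2)

clean_label = int(predict_proba(w, b, x) > 0.5)    # label on the clean input
adv_label = int(predict_proba(w, b, x_adv) > 0.5)  # label after perturbation
```

The same perturbation measures differently under each metric: here its L-infinity size equals `eps`, while its L2 size is larger because every coordinate moves. This is one reason a robustness assessment tied to a single metric can be misleading.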
On the role of deep learning model complexity in adversarial robustness for medical images
Background: Deep learning (DL) models are highly vulnerable to adversarial attacks for medical image classification. An adversary could modify the input data in imperceptible ways such that a model could be tricked into predicting, say, that an image that actually exhibits a malignant tumor is benign. However, adversarial robustness of DL models for medical images is not adequately studied. DL in medicine is inundated with models of various complexity, particularly very large models. In this work, we investigate the role of model complexity in adversarial settings. Results: Consider a set of DL models that exhibit similar performances for a given task. These models are trained in the usual manner but are not trained to defend against adversarial attacks. We demonstrate that, among those models, simpler models of reduced complexity show a greater level of robustness against adversarial attacks than larger models that often tend to be used in medical applications. On the other hand, we also show that once those models undergo adversarial training, the adversarially trained medical image DL models exhibit a greater degree of robustness than the standard trained models for all model complexities. Conclusion: The above result has significant practical relevance. When medical practitioners lack the expertise or resources to defend against adversarial attacks, we recommend that they select the smallest of the models that exhibit adequate performance. Such a model would be naturally more robust to adversarial attacks than the larger models.
Physics-Aware Spatiotemporal Consistency for Transferable Defense of Autonomous Driving Perception
Autonomous driving perception systems are vulnerable to physical adversarial attacks. Existing defenses largely adopt loosely coupled architectures where visual and kinematic cues are processed in isolation, thus failing to exploit physical spatiotemporal consistency as a structural prior and often struggling to balance adversarial robustness, transferability, accuracy, and efficiency under realistic attacks. We propose a physics-aware trajectory–appearance consistency defense that detects and corrects spatiotemporal inconsistencies by tightly coupling visual semantics with physical dynamics. The module combines a dual-stream spatiotemporal encoder with endogenous feature orchestration and a frequency-domain kinematic embedding, turning tracking artifacts that are usually discarded as noise into discriminative cues. These inconsistencies are quantified by a Trajectory–Appearance Mutual Exclusion (TAME) energy, which supports a physics-aware switching rule to override flawed visual predictions. Operating on detector backbone features, outputs, and tracking states, the defense can be attached as a plug-in module behind diverse object detectors. Experiments on nuScenes, KITTI, and BDD100K show that the proposed defense substantially improves robustness against diverse categories of attacks: on nuScenes, it improves Correction Accuracy (CA) from 86.5% to 92.1% while reducing the computational overhead from 42 ms to 19 ms. Furthermore, the proposed defense maintains over 71.0% CA when transferred to unseen detectors and sustains 72.4% CA under adaptive attackers.
RobEns: Robust Ensemble Adversarial Machine Learning Framework for Securing IoT Traffic
Recently, Machine Learning (ML)-based solutions have been widely adopted to tackle the wide range of security challenges that have affected the progress of the Internet of Things (IoT) in various domains. Despite the reported promising results, ML-based Intrusion Detection Systems (IDSs) have proved to be vulnerable to adversarial examples, which pose an increasing threat. In fact, attackers employ Adversarial Machine Learning (AML) to cause severe performance degradation and thereby evade detection systems. This has prompted the need for reliable defense strategies to handle performance and ensure secure networks. This work introduces RobEns, a robust ensemble framework that aims at: (i) exploiting state-of-the-art ML-based models alongside ensemble models for IDSs in the IoT network; (ii) investigating the impact of evasion AML attacks against the provided models within a black-box scenario; and (iii) evaluating the robustness of the considered models after deploying relevant defense methods. In particular, four typical AML attacks are considered to investigate six ML-based IDSs using three benchmarking datasets. Moreover, multi-class classification scenarios are designed to assess the performance of each attack type. The experiments indicated a drastic drop in detection accuracy for some attempts. To harden the IDS even further, two defense mechanisms were derived from both data-based and model-based methods. Specifically, these methods relied on feature squeezing as well as adversarial training defense strategies. They yielded promising results, enhanced robustness, and maintained standard accuracy in the presence or absence of adversaries. The obtained results proved the efficiency of the proposed framework in robustifying IDS performance within the IoT context. In particular, the accuracy reached 100% for black-box attack scenarios while preserving the accuracy in the absence of attacks as well.
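Feature squeezing, one of the two defenses named in the abstract above, can be sketched as follows. This is a generic illustration and not RobEns code: it assumes features scaled to [0, 1], uses bit-depth reduction as the squeezer, and flags an input when the model's outputs on the raw and squeezed versions disagree by more than a threshold. The names `squeeze`, `looks_adversarial`, and `toy_model`, and the threshold value, are hypothetical.

```python
import math

def squeeze(x, bits):
    """Quantize each feature in [0, 1] to 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    return [math.floor(min(max(v, 0.0), 1.0) * levels + 0.5) / levels
            for v in x]

def l1(u, v):
    """L1 distance between two output vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def looks_adversarial(model, x, bits=1, threshold=0.5):
    """Flag x when raw and squeezed predictions disagree strongly."""
    return l1(model(x), model(squeeze(x, bits))) > threshold

def toy_model(x):
    """Hypothetical two-class scorer that depends only on the first feature."""
    p = 1.0 / (1.0 + math.exp(-10.0 * (x[0] - 0.3)))
    return [p, 1.0 - p]

clean = [0.0, 1.0, 1.0]      # already on the quantization grid
perturbed = [0.4, 1.0, 1.0]  # first feature nudged across the boundary

clean_flag = looks_adversarial(toy_model, clean)
perturbed_flag = looks_adversarial(toy_model, perturbed)
```

Squeezing leaves the clean input unchanged, so its two predictions agree, whereas it rounds away the nudge in the perturbed input and the predictions diverge. In practice the bit depth and threshold would need tuning on held-out traffic.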
Securing online integrity: a hybrid approach to deepfake detection and removal using Explainable AI and Adversarial Robustness Training
As deepfake technology becomes increasingly sophisticated, the proliferation of manipulated images presents a significant threat to online integrity, requiring advanced detection and mitigation strategies. Addressing this critical challenge, our study introduces a pioneering approach that integrates Explainable AI (XAI) with Adversarial Robustness Training (ART) to enhance the detection and removal of deepfake content. The proposed methodology, termed XAI-ART, begins with the creation of a diverse dataset that includes both authentic and manipulated images, followed by comprehensive preprocessing and augmentation. We then employ Adversarial Robustness Training to fortify the deep learning model against adversarial manipulations. By incorporating Explainable AI techniques, our approach not only improves detection accuracy but also provides transparency in model decision-making, offering clear insights into how deepfake content is identified. Our experimental results underscore the effectiveness of XAI-ART, with the model achieving an impressive accuracy of 97.5% in distinguishing between genuine and manipulated images. The recall rate of 96.8% indicates that our model effectively captures the majority of deepfake instances, while the F1-Score of 97.5% demonstrates a well-balanced performance in precision and recall. Importantly, the model maintains high robustness against adversarial attacks, with a minimal accuracy reduction to 96.7% under perturbations.
Adversarial supervised contrastive learning
Contrastive learning is prevalently used in pre-training deep models, followed by fine-tuning in downstream tasks for better performance or faster training. However, pre-trained models from contrastive learning are barely robust against adversarial examples in downstream tasks, since the representations learned by self-supervision may lack robustness and class-wise discrimination. To tackle the above problems, we adapt the contrastive learning scheme to adversarial examples for robustness enhancement, and also extend the self-supervised contrastive approach to the supervised setting for the ability to discriminate between classes. Equipped with our new designs, we propose adversarial supervised contrastive learning (ASCL), a novel framework for robust pre-training. Despite its simplicity, extensive experiments show that ASCL achieves significant margins in adversarial robustness over prior art, whether followed by lightweight standard fine-tuning or adversarial fine-tuning. Moreover, ASCL also shows benefits for robustness to diverse natural corruptions, suggesting wide applicability to practical scenarios. Notably, ASCL demonstrates impressive results in robust transfer learning.