Search Results Heading

MBRLSearchResults

mbrl.module.common.modules.added.book.to.shelf
Title added to your shelf!
View what I already have on My Shelf.
Oops! Something went wrong.
Oops! Something went wrong.
While trying to add the title to your shelf something went wrong :( Kindly try again later!
Are you sure you want to remove the book from the shelf?
Oops! Something went wrong.
Oops! Something went wrong.
While trying to remove the title from your shelf something went wrong :( Kindly try again later!
    Done
    Filters
    Reset
  • Discipline
      Discipline
      Clear All
      Discipline
  • Is Peer Reviewed
      Is Peer Reviewed
      Clear All
      Is Peer Reviewed
  • Item Type
      Item Type
      Clear All
      Item Type
  • Subject
      Subject
      Clear All
      Subject
  • Year
      Year
      Clear All
      From:
      -
      To:
  • More Filters
      More Filters
      Clear All
      More Filters
      Source
    • Language
348 result(s) for "adversarial examples"
Sort by:
Experiments on Adversarial Examples for Deep Learning Model Using Multimodal Sensors
Recently, artificial intelligence (AI) based on IoT sensors has been widely used, which has increased the risk of attacks targeting AI. Adversarial examples are among the most serious types of attacks in which the attacker designs inputs that can cause the machine learning system to generate incorrect outputs. Considering the architecture using multiple sensor devices, hacking even a few sensors can create a significant risk; an attacker can attack the machine learning model through the hacked sensors. Some studies demonstrated the possibility of adversarial examples on the deep neural network (DNN) model based on IoT sensors, but it was assumed that an attacker must access all features. The impact of hacking only a few sensors has not been discussed thus far. Therefore, in this study, we discuss the possibility of attacks on DNN models by hacking only a small number of sensors. In this scenario, the attacker first hacks few sensors in the system, obtains the values of the hacked sensors, and changes them to manipulate the system, but the attacker cannot obtain and change the values of the other sensors. We perform experiments using the human activity recognition model with three sensor devices attached to the chest, wrist, and ankle of a user, and demonstrate that attacks are possible by hacking a small number of sensors.
Revisiting model’s uncertainty and confidences for adversarial example detection
Security-sensitive applications that rely on Deep Neural Networks (DNNs) are vulnerable to small perturbations that are crafted to generate Adversarial Examples. The (AEs) are imperceptible to humans and cause DNN to misclassify them. Many defense and detection techniques have been proposed. Model’s confidences and Dropout, as a popular way to estimate the model’s uncertainty, have been used for AE detection but they showed limited success against black- and gray-box attacks. Moreover, the state-of-the-art detection techniques have been designed for specific attacks or broken by others, need knowledge about the attacks, are not consistent, increase model parameters overhead, are time-consuming, or have latency in inference time. To trade off these factors, we revisit the model’s uncertainty and confidences and propose a novel unsupervised ensemble AE detection mechanism that 1) uses the uncertainty method called SelectiveNet, 2) processes model layers outputs, i.e. feature maps, to generate new confidence probabilities. The detection method is called SFAD. Experimental results show that the proposed approach achieves better performance against black- and gray-box attacks than the state-of-the-art methods and achieves comparable performance against white-box attacks. Moreover, results show that SFAD is fully robust against High Confidence Attacks (HCAs) for MNIST and partially robust for CIFAR10 datasets. 1
Metrics and methods for robustness evaluation of neural networks with generative models
Recent studies have shown that modern deep neural network classifiers are easy to fool, assuming that an adversary is able to slightly modify their inputs. Many papers have proposed adversarial attacks, defenses and methods to measure robustness to such adversarial perturbations. However, most commonly considered adversarial examples are based on perturbations in the input space of the neural network that are unlikely to arise naturally. Recently, especially in computer vision, researchers discovered “natural” perturbations, such as rotations, changes of brightness, or more high-level changes, but these perturbations have not yet been systematically used to measure the performance of classifiers. In this paper, we propose several metrics to measure robustness of classifiers to natural adversarial examples, and methods to evaluate them. These metrics, called latent space performance metrics, are based on the ability of generative models to capture probability distributions. On four image classification case studies, we evaluate the proposed metrics for several classifiers, including ones trained in conventional and robust ways. We find that the latent counterparts of adversarial robustness are associated with the accuracy of the classifier rather than its conventional adversarial robustness, but the latter is still reflected on the properties of found latent perturbations. In addition, our novel method of finding latent adversarial perturbations demonstrates that these perturbations are often perceptually small.
Exploratory Research on Defense against Natural Adversarial Examples in Image Classification
The emergence of adversarial examples has revealed the inadequacies in the robustness of image classification models based on Convolutional Neural Networks (CNNs). Particularly in recent years, the discovery of natural adversarial examples has posed significant challenges, as traditional defense methods against adversarial attacks have proven to be largely ineffective against these natural adversarial examples. This paper explores defenses against these natural adversarial examples from three perspectives: adversarial examples, model architecture, and dataset. First, it employs Class Activation Mapping (CAM) to visualize how models classify natural adversarial examples, identifying several typical attack patterns. Next, various common CNN models are analyzed to evaluate their susceptibility to these attacks, revealing that different architectures exhibit varying defensive capabilities. The study finds that as the depth of a network increases, its defenses against natural adversarial examples strengthen. Lastly, Finally, the impact of dataset class distribution on the defense capability of models is examined, focusing on two aspects: the number of classes in the training set and the number of predicted classes. This study investigates how these factors influence the model’s ability to defend against natural adversarial examples. Results indicate that reducing the number of training classes enhances the model’s defense against natural adversarial examples. Additionally, under a fixed number of training classes, some CNN models show an optimal range of predicted classes for achieving the best defense performance against these adversarial examples.
Advancements in adversarial example defense for deep learning models: a review
Artificial intelligence technology based on deep learning has been widely used in key fields such as automatic driving, medical diagnosis and financial risk control. These applications also bring more and more serious security problems. In particular, as the means of attack continue to evolve, well-designed countermeasures seriously threaten the reliability of the model and the security of the system. In order to deal with this risk, defensive confrontation samples have become the core task of AI security research, playing a key role in improving the security and credibility of the model. Aiming at the problems of unclear concepts and overlapping standards in previous classification methods, this paper proposes a clearer and unified classification framework, combs and defines the contents of existing research, and solves the inconsistencies. The framework systematically divides the existing countermeasures and defense methods into three categories: detection, purification and optimization. This classification will help researchers understand the actual effects of different methods in the face of various attacks more clearly. This paper also analyzes the tradeoffs between accuracy, robustness, operational efficiency and generalization capability of various defense mechanisms, and reveals how they balance the calculation cost and actual deployment requirements. In addition, the paper points out the main challenges facing the current research, and puts forward the future research directions, including developing more efficient, adaptive, and cross modal defense methods to comprehensively improve the security of AI systems. The purpose of this review is to help researchers understand the development process of anti sample defense technology and provide a reference path for building a stable and reliable AI system.
Detecting Audio Adversarial Examples in Automatic Speech Recognition Systems Using Decision Boundary Patterns
Automatic Speech Recognition (ASR) systems are ubiquitous in various commercial applications. These systems typically rely on machine learning techniques for transcribing voice commands into text for further processing. Despite their success in many applications, audio Adversarial Examples (AEs) have emerged as a major security threat to ASR systems. This is because audio AEs are able to fool ASR models into producing incorrect results. While researchers have investigated methods for defending against audio AEs, the intrinsic properties of AEs and benign audio are not well studied. The work in this paper shows that the machine learning decision boundary patterns around audio AEs and benign audio are fundamentally different. Using dimensionality-reduction techniques, this work shows that these different patterns can be visually distinguished in two-dimensional (2D) space. This in turn allows for the detection of audio AEs using anomal- detection methods.
An Intelligent Secure Adversarial Examples Detection Scheme in Heterogeneous Complex Environments
Image-denoising techniques are widely used to defend against Adversarial Examples (AEs). However, denoising alone cannot completely eliminate adversarial perturbations. The remaining perturbations tend to amplify as they propagate through deeper layers of the network, leading to misclassifications. Moreover, image denoising compromises the classification accuracy of original examples. To address these challenges in AE defense through image denoising, this paper proposes a novel AE detection technique. The proposed technique combines multiple traditional image-denoising algorithms and Convolutional Neural Network (CNN) network structures. The used detector model integrates the classification results of different models as the input to the detector and calculates the final output of the detector based on a machine-learning voting algorithm. By analyzing the discrepancy between predictions made by the model on original examples and denoised examples, AEs are detected effectively. This technique reduces computational overhead without modifying the model structure or parameters, effectively avoiding the error amplification caused by denoising. The proposed approach demonstrates excellent detection performance against mainstream AE attacks. Experimental results show outstanding detection performance in well-known AE attacks, including Fast Gradient Sign Method (FGSM), Basic Iteration Method (BIM), DeepFool, and Carlini & Wagner (C&W), achieving a 94% success rate in FGSM detection, while only reducing the accuracy of clean examples by 4%.
Adversarial example detection for DNN models: a review and experimental comparison
Deep learning (DL) has shown great success in many human-related tasks, which has led to its adoption in many computer vision based applications, such as security surveillance systems, autonomous vehicles and healthcare. Such safety-critical applications have to draw their path to success deployment once they have the capability to overcome safety-critical challenges. Among these challenges are the defense against or/and the detection of the adversarial examples (AEs). Adversaries can carefully craft small, often imperceptible, noise called perturbations to be added to the clean image to generate the AE. The aim of AE is to fool the DL model which makes it a potential risk for DL applications. Many test-time evasion attacks and countermeasures, i.e., defense or detection methods, are proposed in the literature. Moreover, few reviews and surveys were published and theoretically showed the taxonomy of the threats and the countermeasure methods with little focus in AE detection methods. In this paper, we focus on image classification task and attempt to provide a survey for detection methods of test-time evasion attacks on neural network classifiers. A detailed discussion for such methods is provided with experimental results for eight state-of-the-art detectors under different scenarios on four datasets. We also provide potential challenges and future perspectives for this research direction.
Random Untargeted Adversarial Example on Deep Neural Network
Deep neural networks (DNNs) have demonstrated remarkable performance in machine learning areas such as image recognition, speech recognition, intrusion detection, and pattern analysis. However, it has been revealed that DNNs have weaknesses in the face of adversarial examples, which are created by adding a little noise to an original sample to cause misclassification by the DNN. Such adversarial examples can lead to fatal accidents in applications such as autonomous vehicles and disease diagnostics. Thus, the generation of adversarial examples has attracted extensive research attention recently. An adversarial example is categorized as targeted or untargeted. In this paper, we focus on the untargeted adversarial example scenario because it has a faster learning time and less distortion compared with the targeted adversarial example. However, there is a pattern vulnerability with untargeted adversarial examples: Because of the similarity between the original class and certain specific classes, it may be possible for the defending system to determine the original class by analyzing the output classes of the untargeted adversarial examples. To overcome this problem, we propose a new method for generating untargeted adversarial examples, one that uses an arbitrary class in the generation process. Moreover, we show that our proposed scheme can be applied to steganography. Through experiments, we show that our proposed scheme can achieve a 100% attack success rate with minimum distortion (1.99 and 42.32 using the MNIST and CIFAR10 datasets, respectively) and without the pattern vulnerability. Using a steganography test, we show that our proposed scheme can be used to fool humans, as demonstrated by the probability of their detecting hidden classes being equal to that of random selection.