38,910 results for "Image detection"
A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection
Remote sensing image change detection (CD) aims to identify significant changes between bitemporal images. Given two co-registered images taken at different times, illumination variations and misregistration errors can overwhelm the real object changes. Exploring the relationships among different spatial–temporal pixels may improve the performance of CD methods. In our work, we propose a novel Siamese-based spatial–temporal attention neural network. In contrast to previous methods that encode the bitemporal images separately, without exploiting any spatial–temporal dependency, we design a CD self-attention mechanism to model the spatial–temporal relationships and integrate it into the feature extraction procedure. Our self-attention module calculates attention weights between any two pixels at different times and positions and uses them to generate more discriminative features. Because objects may appear at different scales, we partition the image into multi-scale subregions and apply self-attention within each subregion. In this way, we capture spatial–temporal dependencies at various scales and generate better representations for objects of various sizes. We also introduce a CD dataset, LEVIR-CD, which is two orders of magnitude larger than other public datasets in this field. LEVIR-CD consists of a large set of bitemporal Google Earth images, with 637 image pairs (1024 × 1024) and over 31,000 independently labeled change instances. Our proposed attention module improves the F1-score of our baseline model from 83.9 to 87.3 with acceptable computational overhead. Experimental results on a public remote sensing image CD dataset show that our method outperforms several other state-of-the-art methods.
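The core idea — attention weights computed between every pair of pixels across both times — can be illustrated with a minimal NumPy sketch. This is not the authors' module (which uses learned query/key/value projections and multi-scale subregions); it only shows the mechanics of letting bitemporal pixels attend to each other:

```python
import numpy as np

def spatial_temporal_attention(f1, f2):
    """Toy scaled dot-product self-attention over the pixels of two
    co-registered feature maps f1, f2, each shaped (H, W, C).
    Every pixel attends to every pixel at both time steps."""
    H, W, C = f1.shape
    # Stack pixels from both times into one token sequence: (2*H*W, C).
    tokens = np.concatenate([f1.reshape(-1, C), f2.reshape(-1, C)], axis=0)
    scores = tokens @ tokens.T / np.sqrt(C)          # (2HW, 2HW) similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # row-wise softmax
    out = weights @ tokens                           # re-weighted features
    return out[:H * W].reshape(H, W, C), out[H * W:].reshape(H, W, C)

rng = np.random.default_rng(0)
a1, a2 = spatial_temporal_attention(rng.normal(size=(4, 4, 8)),
                                    rng.normal(size=(4, 4, 8)))
print(a1.shape, a2.shape)  # (4, 4, 8) (4, 4, 8)
```

Each output pixel is a mixture of features from both images, which is what lets the network suppress illumination and registration noise that appears at only one time step.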
Comic Image Detection Based on MA‐YOLOv8s
In recent years, the plagiarism of comic images has become increasingly prevalent, drawing growing attention to copyright protection within the comic industry. To address the limitations of existing object detection models in capturing the distinctive visual characteristics of comic images, this paper proposes an optimized detection framework, MANGA‐YOLOv8s (MA‐YOLOv8s). Specifically, a large separable kernel attention‐based spatial pyramid pooling (SPPF‐LSKA) module is designed to expand the effective receptive field and enhance multiscale feature aggregation for small‐object detection. The C2f‐DBB module is introduced into the detection head to refine deep feature representation while maintaining lightweight computation. Furthermore, a separated and enhancement attention module (SEAM) is incorporated into the detection heads to improve robustness against scale variation and suppress false detections. Unlike simple combinations of existing modules, these designs form a theoretically motivated and task‐specific integration that adapts the YOLOv8 framework to the structural and stylistic characteristics of comic images. Experiments on the Manga109 dataset demonstrate that MA‐YOLOv8s achieves a 3.7% improvement in mAP and a 3.4% increase in precision compared with YOLOv8s. The proposed method offers both theoretical and practical contributions to the development of efficient detection techniques for comic copyright protection.
A Texture Feature Removal Network for Sonar Image Classification and Detection
Deep neural networks (DNNs) have been applied to sonar image target recognition, but it is very difficult to obtain enough sonar images that contain a target; as a result, training a DNN directly on a small amount of data causes overfitting and other problems. Transfer learning is the most effective way to address such scenarios. However, there is a large domain gap between optical images and sonar images, and common transfer learning methods may not handle it effectively. In this paper, we propose a transfer learning method for sonar image classification and object detection called the texture feature removal network. We regard the texture features of an image as domain-specific features, and we narrow the domain gap by discarding these features, which makes knowledge transfer easier. Our method can be easily embedded into other transfer learning methods, making it applicable to different scenarios. Experimental results show that our method is effective in side-scan sonar image classification tasks and forward-looking sonar image detection tasks. For side-scan sonar classification, the accuracy of our method improves by 4.5% in a supervised learning experiment, and for forward-looking sonar detection, the average precision (AP) is also significantly improved.
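The abstract does not say how texture features are represented, but a classic texture descriptor (an assumption here, not the paper's method) is the Gram matrix of a feature map, which captures channel correlations while discarding spatial arrangement:

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of an (H, W, C) feature map: channel-by-channel
    correlations, a classic texture descriptor (Gatys et al. style)."""
    H, W, C = features.shape
    F = features.reshape(-1, C)      # (HW, C) pixel vectors
    return F.T @ F / (H * W)         # (C, C), position-independent

rng = np.random.default_rng(1)
f = rng.normal(size=(8, 8, 4))
g1 = gram_matrix(f)
# Shuffling pixel positions leaves the Gram matrix unchanged:
shuffled = rng.permutation(f.reshape(-1, 4)).reshape(8, 8, 4)
g2 = gram_matrix(shuffled)
print(np.allclose(g1, g2))  # True — texture statistics ignore pixel positions
```

A descriptor of this kind isolates "how the image looks" from "what is where", which matches the paper's framing of texture as the domain-specific part of sonar vs. optical imagery.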
MFIL-FCOS: A Multi-Scale Fusion and Interactive Learning Method for 2D Object Detection and Remote Sensing Image Detection
Object detection aims to find the objects in an image and estimate their categories and locations. Recent object detection algorithms suffer from a loss of semantic information in the deeper feature maps as the backbone network deepens; for example, with complex backbone networks, existing feature fusion methods cannot effectively fuse information from different layers. In addition, anchor-free object detection methods can fail to predict the same object consistently because the regression and centerness prediction branches follow different learning mechanisms. To address these problems, we propose a multi-scale fusion and interactive learning method for fully convolutional one-stage anchor-free object detection, called MFIL-FCOS. Specifically, we design a multi-scale fusion module to address the loss of local semantic information in high-level feature maps: it strengthens feature extraction by enhancing the local information of low-level features and fusing the rich semantic information of high-level features. Furthermore, we propose an interactive learning module that increases interactivity and yields more accurate predictions by generating a centerness-position weight to adjust the regression task alongside a centerness prediction task. With these improvements, we conduct extensive experiments on the COCO and DIOR datasets, demonstrating superior capability in 2D object detection and remote sensing image detection, even under challenging conditions.
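MFIL-FCOS builds on FCOS, whose standard centerness target (shown below; the paper's weighted variant is not specified in the abstract) scores each location by how centered it is inside its ground-truth box:

```python
import numpy as np

def centerness(l, t, r, b):
    """Standard FCOS centerness target for a location with distances
    l, t, r, b to the left/top/right/bottom sides of its box."""
    return np.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))

print(centerness(5, 5, 5, 5))            # 1.0 — exact box center
print(round(centerness(1, 5, 9, 5), 3))  # 0.333 — far off-center horizontally
```

Locations near a box edge get a low score, down-weighting their low-quality box predictions at inference; the paper's interactive learning module couples this signal with the regression branch so the two heads stop disagreeing.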
Implementing a Parallel Image Edge Detection Algorithm Based on the Otsu-Canny Operator on the Hadoop Platform
The Canny operator is widely used to detect edges in images. However, as the size of the image dataset increases, the edge detection performance of the Canny operator decreases and its runtime becomes excessive. To improve both, in this paper we propose a parallel design and implementation of an Otsu-optimized Canny operator using the MapReduce parallel programming model, running on the Hadoop platform. The Otsu algorithm is used to optimize the Canny operator's dual threshold and improve edge detection performance, while the MapReduce model provides parallel processing for the Canny operator to solve the processing-speed and communication-cost problems that occur when Canny edge detection is applied to big data. For the experiments, we constructed datasets of different scales from the Pascal VOC2012 image database. The proposed parallel Otsu-Canny edge detection algorithm outperforms traditional edge detection algorithms, reducing the running time by approximately 67.2% on a five-node Hadoop cluster with a dataset of 60,000 images. Overall, our approach speeds up processing by approximately 3.4 times on large-scale datasets, demonstrating both better edge detection performance and improved time performance.
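The Otsu step replaces Canny's hand-tuned dual threshold with a data-driven one. A minimal NumPy implementation of Otsu's method (the per-image threshold search; the MapReduce distribution and the exact low/high mapping used in the paper are not sketched here) looks like this:

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method on an 8-bit grayscale image: choose the threshold
    that maximizes the between-class variance of the histogram."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                        # class-0 probability
    mu = np.cumsum(p * np.arange(256))          # class-0 cumulative mean
    mu_t = mu[-1]                               # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    sigma_b = np.nan_to_num(sigma_b)
    return int(np.argmax(sigma_b))

# Two well-separated intensity clusters: the threshold separates them.
img = np.array([[20] * 8] * 4 + [[200] * 8] * 4, dtype=np.uint8)
t = otsu_threshold(img)
print(20 <= t < 200)  # True
```

A common heuristic (an assumption here, not necessarily the paper's exact scheme) is to take the Otsu value as Canny's high threshold and half of it as the low threshold, giving per-image thresholds with no manual tuning.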
Detecting Images in Two-Operator Series Manipulation: A Novel Approach Using Transposed Convolution and Information Fusion
Digital image forensics is a crucial emerging field, as image editing tools can modify images easily. Most recent methods can determine whether a specific operator has edited an image, and they are suitable for high-resolution uncompressed images. In practice, however, more than one operator is often used to modify image contents repeatedly. In this paper, a reliable scheme using information fusion and deep neural networks is presented to recognize manipulation operators and the order of a two-operator series. A transposed convolutional layer improves performance on low-resolution JPEG-compressed images, and a bottleneck technique is utilized to extend the number of transposed convolutional layers. One average pooling layer is employed to preserve the optimal information flow and avoid overfitting between the layers. Moreover, the presented scheme can detect two-operator series with parameter settings that were not included in training. The experimental outcomes of the suggested scheme are encouraging and, backed by sufficient statistical evidence, better than those of existing schemes.
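The transposed convolutional layer mentioned here upsamples low-resolution inputs by scatter-adding scaled kernel copies. A minimal 1-D sketch of that mechanic (the paper's layer is a learned 2-D layer inside a CNN; this just shows how the output is assembled):

```python
import numpy as np

def conv1d_transposed(x, k, stride=2):
    """Minimal 1-D transposed convolution: each input element scatters a
    scaled copy of kernel k into the (stride-upsampled) output."""
    out = np.zeros((len(x) - 1) * stride + len(k))
    for i, v in enumerate(x):
        out[i * stride : i * stride + len(k)] += v * k
    return out

y = conv1d_transposed(np.array([1.0, 2.0]), np.array([1.0, 1.0, 1.0]), stride=2)
print(y)  # [1. 1. 3. 2. 2.]
```

Note the output length `(n - 1) * stride + len(k)` exceeds the input length, which is why such layers can recover resolution lost to JPEG downsampling before the classification stages.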
Enhancing Low-Pass Filtering Detection on Small Digital Images Using Hybrid Deep Learning
Detecting image manipulation is essential for investigating the processing history of digital images. In this paper, a novel scheme is proposed to detect the use of low-pass filters in image processing. A new convolutional neural network of reasonable size was designed to identify three types of low-pass filters, and the learning experiences of three solvers were combined to enhance the detection ability of the proposed approach. Global pooling layers were employed to prevent information loss between the convolutional layers, and a new global variance pooling layer was introduced to improve detection accuracy. The features extracted by the convolutional neural network were mapped to the frequency domain to enrich the feature set. A leaky Rectified Linear Unit (ReLU) layer was found to perform better than the traditional ReLU layer. A tri-layered neural network classifier was employed to classify low-pass filters with various parameters into two, four, and ten classes. Because detecting low-pass filtering is relatively easy on large images, the experimental environment was restricted to small images of 30 × 30 and 60 × 60 pixels. The proposed scheme achieved 80.12% and 90.65% detection accuracy on ten categories of JPEG-compressed images with a quality factor of 75, on 30 × 30 and 60 × 60 images, respectively.
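The "global variance pooling layer" is not defined in the abstract; a plausible reading (an assumption, sketched below) is a pooling step that reduces each channel to the variance of its spatial activations, which is sensitive to the smoothing a low-pass filter introduces:

```python
import numpy as np

def global_variance_pool(feature_maps):
    """Global variance pooling: one variance statistic per channel,
    computed over the spatial dimensions of an (H, W, C) feature map."""
    return feature_maps.var(axis=(0, 1))

f = np.zeros((4, 4, 2))
f[..., 1] = np.arange(16).reshape(4, 4)  # channel 1 varies; channel 0 is flat
v = global_variance_pool(f)
print(v.shape, v[0], v[1] > 0)  # (2,) 0.0 True
```

Like global average pooling, this yields a fixed-size vector regardless of input size, but it summarizes activation spread rather than mean level — useful when the filtering artifact is a reduction in local variation.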
Deep learning-based efficient and robust image forgery detection
Identifying tampered images is a research area of digital forensics that characterizes manipulations in an image. Tampering is easily performed by forgers without the permission of the images' owners, and the misuse of such images in digital content is a significant privacy problem. There are some studies on forgery detection in the literature, but effective and robust solutions are still needed. With this motivation, a deep learning-based architecture is proposed for tampered image detection. It includes feature extraction with a transfer learning architecture, feature selection with particle swarm optimization, and multi-class classification of images with a deep learning architecture built on Gated Recurrent Units. The proposed architecture is validated on a modified CASIA dataset. Various noisy tampered images are also included in the study's dataset to demonstrate the robustness of the method; even these are detected quite accurately. In the experiments, the proposed method achieves 96.25% accuracy, and 80.5% accuracy on images with combined Gaussian and salt-and-pepper noise. Experiments show that these results are significantly higher than those of an SVM classifier. This capability can support experts in digital forensics in detecting composites of an original image combined with other images.
Unmasking AI-created visual content: a review of generated images and deepfake detection technologies
In this era, digital images and videos are ubiquitous in people's lives, and generative models can easily produce high-quality images and videos. These images and videos enrich people's lives and play important roles in various fields. However, maliciously generated images and videos can mislead the public, manipulate public opinion, invade privacy, and even lead to illegal activities. Therefore, detecting AI-created visual content has become a significant research topic in the field of multimedia information security. In recent years, the rapid development of deep learning has greatly accelerated progress in AI-created visual content detection. This survey introduces the detection technologies developed in recent years, divided into two parts: AI-generated image detection and deepfake detection. In the AI-generated image detection section, we introduce current generative models and basic detection frameworks, and review existing detection methods from unimodal and multimodal perspectives. In the deepfake detection section, we provide an overview of existing classifications of deepfake generation techniques and commonly used datasets, followed by common evaluation metrics in the field. We also analyze the technical characteristics of existing methods based on the feature information they utilize, summarizing and categorizing them. Finally, we propose future research directions and conclusions, offering suggestions for the development of AI-created visual content detection technologies.
On Plant Detection of Intact Tomato Fruits Using Image Analysis and Machine Learning Methods
Fully automated yield estimation of intact fruits prior to harvesting provides various benefits to farmers. Until now, several studies have been conducted to estimate fruit yield using image-processing technologies. However, most of these techniques require thresholds for features such as color, shape and size. In addition, their performance strongly depends on the thresholds used, although optimal thresholds tend to vary with images. Furthermore, most of these techniques have attempted to detect only mature and immature fruits, although the number of young fruits is more important for the prediction of long-term fluctuations in yield. In this study, we aimed to develop a method to accurately detect individual intact tomato fruits including mature, immature and young fruits on a plant using a conventional RGB digital camera in conjunction with machine learning approaches. The developed method did not require an adjustment of threshold values for fruit detection from each image because image segmentation was conducted based on classification models generated in accordance with the color, shape, texture and size of the images. The results of fruit detection in the test images showed that the developed method achieved a recall of 0.80, while the precision was 0.88. The recall values of mature, immature and young fruits were 1.00, 0.80 and 0.78, respectively.
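The reported precision of 0.88 and recall of 0.80 follow directly from true-positive, false-positive, and false-negative counts. The counts below are illustrative values chosen to reproduce those figures (an assumption; the paper's actual counts are not given in the abstract):

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical counts: 88 fruits detected correctly, 12 false detections,
# 22 fruits missed.
p, r = precision_recall(88, 12, 22)
print(round(p, 2), round(r, 2))  # 0.88 0.8
```

Recall measures the fraction of actual fruits found (key for yield estimation), while precision measures how many detections are real fruits rather than background clutter.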