Catalogue Search | MBRL

Exploration of MPSO-Two-Stage Classification Optimization Model for Scene Images with Low Quality and Complex Semantics

by Wang, Rong , Liu, Kexin , Song, Xiaoou in Accuracy , Algorithms , Classification

2024

Currently, complex scene classification strategies are limited to high-definition image scene sets, and low-quality scene sets are overlooked. Although a few studies have focused on artificially noisy images or specific image sets, none have involved actual low-resolution scene images. Therefore, designing classification models around practicality is of paramount importance. To solve the above problems, this paper proposes a two-stage classification optimization algorithm model based on MPSO, thus achieving high-precision classification of low-quality scene images. Firstly, to verify the rationality of the proposed model, three groups of internationally recognized scene datasets were used to conduct comparative experiments with the proposed model and 21 existing methods. It was found that the proposed model performs better, especially in the 15-scene dataset, with 1.54% higher accuracy than the best existing method ResNet-ELM. Secondly, to prove the necessity of the pre-reconstruction stage of the proposed model, the same classification architecture was used to conduct comparative experiments between the proposed reconstruction method and six existing preprocessing methods on the seven self-built low-quality news scene frames. The results show that the proposed model has a higher improvement rate for outdoor scenes. Finally, to test the application potential of the proposed model in outdoor environments, an adaptive test experiment was conducted on the two self-built scene sets affected by lighting and weather. The results indicate that the proposed model is suitable for weather-affected scene classification, with an average accuracy improvement of 1.42%.

Journal Article

Share this book

Add to My Shelf

Stable polyp-scene classification via subsampling and residual learning from an imbalanced large dataset

by Itoh, Hayato , Oda, Masahiro , Misawa, Masashi in 3D CNN , Architecture , biological organs

2019

This Letter presents a stable polyp-scene classification method with low false positive (FP) detection. Precise automated polyp detection during colonoscopies is essential for preventing colon-cancer deaths. There is, therefore, a demand for a computer-assisted diagnosis (CAD) system for colonoscopies to assist colonoscopists. A high-performance CAD system with spatiotemporal feature extraction via a three-dimensional convolutional neural network (3D CNN) with a limited dataset achieved about 80% detection accuracy in actual colonoscopic videos. Consequently, further improvement of a 3D CNN with larger training data is feasible. However, the ratio between polyp and non-polyp scenes is quite imbalanced in a large colonoscopic video dataset. This imbalance leads to unstable polyp detection. To circumvent this, the authors propose an efficient and balanced learning technique for deep residual learning. The authors’ method randomly selects a subset of non-polyp scenes whose number is the same number of still images of polyp scenes at the beginning of each epoch of learning. Furthermore, they introduce post-processing for stable polyp-scene classification. This post-processing reduces the FPs that occur in the practical application of polyp-scene classification. They evaluate several residual networks with a large polyp-detection dataset consisting of 1027 colonoscopic videos. In the scene-level evaluation, their proposed method achieves stable polyp-scene classification with 0.86 sensitivity and 0.97 specificity.

Journal Article

Share this book

Add to My Shelf

Remote Sensing Image Scene Classification Using CNN-CapsNet

by Tang, Ping , Zhao, Lijun , Zhang, Wei in Accuracy , Architectural engineering , Artificial neural networks

2019

Remote sensing image scene classification is one of the most challenging problems in understanding high-resolution remote sensing images. Deep learning techniques, especially the convolutional neural network (CNN), have improved the performance of remote sensing image scene classification due to the powerful perspective of feature learning and reasoning. However, several fully connected layers are always added to the end of CNN models, which is not efficient in capturing the hierarchical structure of the entities in the images and does not fully consider the spatial information that is important to classification. Fortunately, capsule network (CapsNet), which is a novel network architecture that uses a group of neurons as a capsule or vector to replace the neuron in the traditional neural network and can encode the properties and spatial information of features in an image to achieve equivariance, has become an active area in the classification field in the past two years. Motivated by this idea, this paper proposes an effective remote sensing image scene classification architecture named CNN-CapsNet to make full use of the merits of these two models: CNN and CapsNet. First, a CNN without fully connected layers is used as an initial feature maps extractor. In detail, a pretrained deep CNN model that was fully trained on the ImageNet dataset is selected as a feature extractor in this paper. Then, the initial feature maps are fed into a newly designed CapsNet to obtain the final classification result. The proposed architecture is extensively evaluated on three public challenging benchmark remote sensing image datasets: the UC Merced Land-Use dataset with 21 scene categories, AID dataset with 30 scene categories, and the NWPU-RESISC45 dataset with 45 challenging scene categories. The experimental results demonstrate that the proposed method can lead to a competitive classification performance compared with the state-of-the-art methods.

Journal Article

Share this book

Add to My Shelf

Transferring Deep Convolutional Neural Networks for the Scene Classification of High-Resolution Remote Sensing Imagery

by Hu, Fan , Zhang, Liangpei , Hu, Jingwen in Classification , Coding , convolutional layer

2015

Learning efficient image representations is at the core of the scene classification task of remote sensing imagery. The existing methods for solving the scene classification task, based on either feature coding approaches with low-level hand-engineered features or unsupervised feature learning, can only generate mid-level image features with limited representative ability, which essentially prevents them from achieving better performance. Recently, the deep convolutional neural networks (CNNs), which are hierarchical architectures trained on large-scale datasets, have shown astounding performance in object recognition and detection. However, it is still not clear how to use these deep convolutional neural networks for high-resolution remote sensing (HRRS) scene classification. In this paper, we investigate how to transfer features from these successfully pre-trained CNNs for HRRS scene classification. We propose two scenarios for generating image features via extracting CNN features from different layers. In the first scenario, the activation vectors extracted from fully-connected layers are regarded as the final image features; in the second scenario, we extract dense features from the last convolutional layer at multiple scales and then encode the dense features into global image features through commonly used feature coding approaches. Extensive experiments on two public scene classification datasets demonstrate that the image features obtained by the two proposed scenarios, even with a simple linear classifier, can result in remarkable performance and improve the state-of-the-art by a significant margin. The results reveal that the features from pre-trained CNNs generalize well to HRRS datasets and are more expressive than the low- and mid-level features. Moreover, we tentatively combine features extracted from different CNN models for better performance.

Journal Article

Share this book

Add to My Shelf

Multi-Branch Deep Learning Framework for Land Scene Classification in Satellite Imagery

by Khan, Sultan Daud , Basalamah, Saleh in aerial imagery , Artificial intelligence , Classification

2023

Land scene classification in satellite imagery has a wide range of applications in remote surveillance, environment monitoring, remote scene analysis, Earth observations and urban planning. Due to immense advantages of the land scene classification task, several methods have been proposed during recent years to automatically classify land scenes in remote sensing images. Most of the work focuses on designing and developing deep networks to identify land scenes from high-resolution satellite images. However, these methods face challenges in identifying different land scenes. Complex texture, cluttered background, extremely small size of objects and large variations in object scale are the common challenges that restrict the models to achieve high performance. To tackle these challenges, we propose a multi-branch deep learning framework that efficiently combines global contextual features with multi-scale features to identify complex land scenes. Generally, the framework consists of two branches. The first branch extracts global contextual information from different regions of the input image, and the second branch exploits a fully convolutional network (FCN) to extract multi-scale local features. The performance of the proposed framework is evaluated on three benchmark datasets, UC-Merced, SIRI-WHU, and EuroSAT. From the experiments, we demonstrate that the framework achieves superior performance compared to other similar models.

Journal Article

Share this book

Add to My Shelf

TRS: Transformers for Remote Sensing Scene Classification

by Zhang, Jianrong , Li, Jiao , Zhao, Hongwei in Accuracy , Artificial neural networks , Classification

2021

Remote sensing scene classification remains challenging due to the complexity and variety of scenes. With the development of attention-based methods, Convolutional Neural Networks (CNNs) have achieved competitive performance in remote sensing scene classification tasks. As an important method of the attention-based model, the Transformer has achieved great success in the field of natural language processing. Recently, the Transformer has been used for computer vision tasks. However, most existing methods divide the original image into multiple patches and encode the patches as the input of the Transformer, which limits the model’s ability to learn the overall features of the image. In this paper, we propose a new remote sensing scene classification method, Remote Sensing Transformer (TRS), a powerful “pure CNNs → Convolution + Transformer → pure Transformers” structure. First, we integrate self-attention into ResNet in a novel way, using our proposed Multi-Head Self-Attention layer instead of 3 × 3 spatial revolutions in the bottleneck. Then we connect multiple pure Transformer encoders to further improve the representation learning performance completely depending on attention. Finally, we use a linear classifier for classification. We train our model on four public remote sensing scene datasets: UC-Merced, AID, NWPU-RESISC45, and OPTIMAL-31. The experimental results show that TRS exceeds the state-of-the-art methods and achieves higher accuracy.

Journal Article

Share this book

Add to My Shelf

An Efficient and Lightweight Convolutional Neural Network for Remote Sensing Image Scene Classification

by Lin, Yuzhun , Yu, Donghang , Li, Daoji in bilinear model , convolutional neural network , MobileNet

2020

Classifying remote sensing images is vital for interpreting image content. Presently, remote sensing image scene classification methods using convolutional neural networks have drawbacks, including excessive parameters and heavy calculation costs. More efficient and lightweight CNNs have fewer parameters and calculations, but their classification performance is generally weaker. We propose a more efficient and lightweight convolutional neural network method to improve classification accuracy with a small training dataset. Inspired by fine-grained visual recognition, this study introduces a bilinear convolutional neural network model for scene classification. First, the lightweight convolutional neural network, MobileNetv2, is used to extract deep and abstract image features. Each feature is then transformed into two features with two different convolutional layers. The transformed features are subjected to Hadamard product operation to obtain an enhanced bilinear feature. Finally, the bilinear feature after pooling and normalization is used for classification. Experiments are performed on three widely used datasets: UC Merced, AID, and NWPU-RESISC45. Compared with other state-of-art methods, the proposed method has fewer parameters and calculations, while achieving higher accuracy. By including feature fusion with bilinear pooling, performance and accuracy for remote scene classification can greatly improve. This could be applied to any remote sensing image classification task.

Journal Article

Share this book

Add to My Shelf

Deep Learning for Remote Sensing Image Scene Classification: A Review and Meta-Analysis

by Neupane, Bipul , Thapa, Aakash , Horanont, Teerayut in Accuracy , Artificial neural networks , Classification

2023

Remote sensing image scene classification with deep learning (DL) is a rapidly growing field that has gained significant attention in the past few years. While previous review papers in this domain have been confined to 2020, an up-to-date review to show the progression of research extending into the present phase is lacking. In this review, we explore the recent articles, providing a thorough classification of approaches into three main categories: Convolutional Neural Network (CNN)-based, Vision Transformer (ViT)-based, and Generative Adversarial Network (GAN)-based architectures. Notably, within the CNN-based category, we further refine the classification based on specific methodologies and techniques employed. In addition, a novel and rigorous meta-analysis is performed to synthesize and analyze the findings from 50 peer-reviewed journal articles to provide valuable insights in this domain, surpassing the scope of existing review articles. Our meta-analysis shows that the most adopted remote sensing scene datasets are AID (41 articles) and NWPU-RESISC45 (40). A notable paradigm shift is seen towards the use of transformer-based models (6) starting from 2021. Furthermore, we critically discuss the findings from the review and meta-analysis, identifying challenges and future opportunities for improvement in this domain. Our up-to-date study serves as an invaluable resource for researchers seeking to contribute to this growing area of research.

Journal Article

Share this book

Add to My Shelf

Convolutional Neural Network for Remote-Sensing Scene Classification: Transfer Learning Analysis

by Marfurt, Kurt , Pires de Lima, Rafael in Aerial photography , artificial intelligence , Artificial neural networks

2020

Remote-sensing image scene classification can provide significant value, ranging from forest fire monitoring to land-use and land-cover classification. Beginning with the first aerial photographs of the early 20th century to the satellite imagery of today, the amount of remote-sensing data has increased geometrically with a higher resolution. The need to analyze these modern digital data motivated research to accelerate remote-sensing image classification. Fortunately, great advances have been made by the computer vision community to classify natural images or photographs taken with an ordinary camera. Natural image datasets can range up to millions of samples and are, therefore, amenable to deep-learning techniques. Many fields of science, remote sensing included, were able to exploit the success of natural image classification by convolutional neural network models using a technique commonly called transfer learning. We provide a systematic review of transfer learning application for scene classification using different datasets and different deep-learning models. We evaluate how the specialization of convolutional neural network models affects the transfer learning process by splitting original models in different points. As expected, we find the choice of hyperparameters used to train the model has a significant influence on the final performance of the models. Curiously, we find transfer learning from models trained on larger, more generic natural images datasets outperformed transfer learning from models trained directly on smaller remotely sensed datasets. Nonetheless, results show that transfer learning provides a powerful tool for remote-sensing scene classification.

Journal Article

Share this book

Add to My Shelf

Pre-Trained AlexNet Architecture with Pyramid Pooling and Supervision for High Spatial Resolution Remote Sensing Image Scene Classification

by Han, Xiaobing , Cao, Liqin , Zhang, Liangpei in Architectural engineering , Artificial neural networks , Big Data

2017

The rapid development of high spatial resolution (HSR) remote sensing imagery techniques not only provide a considerable amount of datasets for scene classification tasks but also request an appropriate scene classification choice when facing with finite labeled samples. AlexNet, as a relatively simple convolutional neural network (CNN) architecture, has obtained great success in scene classification tasks and has been proven to be an excellent foundational hierarchical and automatic scene classification technique. However, current HSR remote sensing imagery scene classification datasets always have the characteristics of small quantities and simple categories, where the limited annotated labeling samples easily cause non-convergence. For HSR remote sensing imagery, multi-scale information of the same scenes can represent the scene semantics to a certain extent but lacks an efficient fusion expression manner. Meanwhile, the current pre-trained AlexNet architecture lacks a kind of appropriate supervision for enhancing the performance of this model, which easily causes overfitting. In this paper, an improved pre-trained AlexNet architecture named pre-trained AlexNet-SPP-SS has been proposed, which incorporates the scale pooling—spatial pyramid pooling (SPP) and side supervision (SS) to improve the above two situations. Extensive experimental results conducted on the UC Merced dataset and the Google Image dataset of SIRI-WHU have demonstrated that the proposed pre-trained AlexNet-SPP-SS model is superior to the original AlexNet architecture as well as the traditional scene classification methods.

Journal Article

Share this book

Add to My Shelf

Language Selector

MBRLGlobalSearch

Language Selector

Catalogue Search | MBRL

Search Results Heading

Explore the vast range of titles available.

MBRLSearchResults

MBRLHappinessMeter