Catalogue Search | MBRL

Land cover classification from remote sensing images based on multi-scale fully convolutional network

by Zhang, Ce; Zheng, Shunyi; Wang, Libo in Ablation, Artificial neural networks, Datasets

2022

Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we extend it to three dimensions using a three-dimensional (3D) CNN, capable of harnessing each land cover category's time series interaction from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves a mean intersection over union (mIoU) of 60.366% on the WHDLD dataset and 75.127% on the GID dataset, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN. Code will be available at https://github.com/lironui/MSFCN.
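As a rough illustration of the mIoU metric quoted in this abstract, the sketch below computes per-class intersection over union on flat label lists and averages it. The function name `mean_iou` and the choice to skip classes absent from both label sets are assumptions made for this example; they are not taken from the MSFCN code.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Average the per-class IoU over classes present in y_true or y_pred."""
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both ground truth and prediction.
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # Pixels labelled c in either ground truth or prediction.
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:  # assumption: classes absent from both are skipped
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class from range(num_classes) appears in the labels")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    print(mean_iou(truth, pred, num_classes=2))
```

In the toy call, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.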

Journal Article

A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection

by Shi, Zhenwei; Chen, Hao in Algorithms, attention mechanism, Change detection

2020

Remote sensing image change detection (CD) aims to identify significant changes of interest between bitemporal images. Given two co-registered images taken at different times, the illumination variations and misregistration errors overwhelm the real object changes. Exploring the relationships among different spatial–temporal pixels may improve the performance of CD methods. In our work, we propose a novel Siamese-based spatial–temporal attention neural network. In contrast to previous methods that separately encode the bitemporal images without referring to any useful spatial–temporal dependency, we design a CD self-attention mechanism to model the spatial–temporal relationships. We integrate a new CD self-attention module into the feature extraction procedure. Our self-attention module calculates the attention weights between any two pixels at different times and positions and uses them to generate more discriminative features. Considering that objects may have different scales, we partition the image into multi-scale subregions and introduce the self-attention in each subregion. In this way, we can capture spatial–temporal dependencies at various scales, thereby generating better representations to accommodate objects of various sizes. We also introduce a CD dataset, LEVIR-CD, which is two orders of magnitude larger than other public datasets in this field. LEVIR-CD consists of a large set of bitemporal Google Earth images, with 637 image pairs (1024 × 1024) and over 31k independently labeled change instances. Our proposed attention module improves the F1-score of our baseline model from 83.9 to 87.3 with acceptable computational overhead. Experimental results on a public remote sensing image CD dataset show that our method outperforms several other state-of-the-art methods.
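The idea of attention weights between any two pixels at different times and positions can be sketched with plain scaled dot-product self-attention over the pixel features of both dates. This is only an illustrative assumption: the abstract does not specify the scoring function, and the learned query/key/value projections and multi-scale subregions of the actual module are omitted. The name `spatiotemporal_self_attention` is hypothetical.

```python
import math


def spatiotemporal_self_attention(feats_t1, feats_t2):
    """Attend over all pixel features from both dates at once.

    feats_t1, feats_t2: lists of equal-length float vectors, one per pixel.
    Returns (outputs, weights), where weights[i][j] is the attention that
    token i pays to token j. Assumption: no learned projections are used.
    """
    tokens = feats_t1 + feats_t2  # tokens from both times share one sequence
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        # Scaled dot-product score against every token, including itself.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        # Numerically stable softmax over the scores.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        # Output is the attention-weighted average of all token features.
        outputs.append([sum(w * tok[j] for w, tok in zip(row, tokens))
                        for j in range(dim)])
    return outputs, weights


if __name__ == "__main__":
    out, att = spatiotemporal_self_attention([[1.0, 0.0]], [[0.0, 1.0]])
    print(att)
```

Each row of the returned weight matrix sums to 1, so every output feature is a weighted average of features drawn from both images.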

Journal Article

Efficient Patch-Wise Semantic Segmentation for Large-Scale Remote Sensing Images

by Liu, Yan; Ding, Meng; Geng, Jiahui in fully convolutional network, image segmentation, multi-scale

2018

Efficient and accurate semantic segmentation is a key technique for automatic remote sensing image analysis. While there have been many segmentation methods based on traditional hand-crafted feature extractors, it is still challenging to process high-resolution and large-scale remote sensing images. In this work, a novel patch-wise semantic segmentation method with a new training strategy based on fully convolutional networks is presented to segment common land resources. First, to handle high-resolution images, the images are split into local patches and a patch-wise network is built. Second, the training data is preprocessed in several ways to account for the specific characteristics of remote sensing images, i.e., color imbalance, object rotation variations and lens distortion. Third, a multi-scale training strategy is developed to address the severe scale variation problem. In addition, a conditional random field (CRF) is studied as a means of improving precision. The proposed method was evaluated on a dataset collected from a capital city in West China with the Gaofen-2 satellite. The dataset contains ten common land resources (Grassland, Road, etc.). The experimental results show that the proposed algorithm achieves 54.96% in terms of mean intersection over union (MIoU) and outperforms other state-of-the-art methods in remote sensing image segmentation.
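The first step described in this abstract, splitting a large image into local patches, might look like the minimal sketch below. The function name `split_into_patches`, the optional `stride` parameter, and the decision to drop incomplete edge patches are assumptions for illustration; the abstract does not say how the authors handle image borders or overlap.

```python
def split_into_patches(image, patch_size, stride=None):
    """Cut a 2D image (list of pixel rows) into square patches.

    Returns a list of (row, col, patch) tuples, where (row, col) is the
    top-left corner of the patch. Assumption: partial patches at the right
    and bottom edges are dropped rather than padded.
    """
    if stride is None:
        stride = patch_size  # non-overlapping patches by default
    height, width = len(image), len(image[0])
    patches = []
    for r in range(0, height - patch_size + 1, stride):
        for c in range(0, width - patch_size + 1, stride):
            patch = [row[c:c + patch_size] for row in image[r:r + patch_size]]
            patches.append((r, c, patch))
    return patches


if __name__ == "__main__":
    # A 4x4 toy image whose pixel value encodes its position.
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    for r, c, patch in split_into_patches(image, patch_size=2):
        print((r, c), patch)
```

Keeping the top-left coordinates alongside each patch makes it possible to stitch patch-level predictions back into a full-size map.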

Journal Article

Building Footprint Extraction from High-Resolution Images via Spatial Residual Inception Convolutional Neural Network

by Liu, Mengxi; Yang, Jinxing; Liu, Penghua in Artificial neural networks, Benchmarks, building footprints extraction

2019

The rapid development in deep learning and computer vision has introduced new opportunities and paradigms for building extraction from remote sensing images. In this paper, we propose a novel fully convolutional network (FCN), in which a spatial residual inception (SRI) module is proposed to capture and aggregate multi-scale contexts for semantic understanding by successively fusing multi-level features. The proposed SRI-Net is capable of accurately detecting large buildings that might be easily omitted while retaining global morphological characteristics and local details. On the other hand, to improve computational efficiency, depthwise separable convolutions and convolution factorization are introduced to significantly decrease the number of model parameters. The proposed model is evaluated on the Inria Aerial Image Labeling Dataset and the Wuhan University (WHU) Aerial Building Dataset. The experimental results show that the proposed methods exhibit significant improvements compared with several state-of-the-art FCNs, including SegNet, U-Net, RefineNet, and DeepLab v3+. The proposed model shows promising potential for building detection from remote sensing images on a large scale.
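To make the parameter saving from depthwise separable convolutions concrete, the sketch below compares weight counts for a standard convolution and a depthwise-plus-pointwise pair. The layer sizes in the example are arbitrary assumptions, biases are ignored, and nothing here reflects the actual SRI-Net configuration.

```python
def standard_conv_params(kernel_size, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel_size * kernel_size * c_in * c_out


def depthwise_separable_params(kernel_size, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = kernel_size * kernel_size * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out                      # 1 x 1 conv mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
    full = standard_conv_params(3, 64, 128)
    separable = depthwise_separable_params(3, 64, 128)
    print(full, separable, round(full / separable, 2))
```

For this hypothetical layer the standard convolution needs 73,728 weights and the separable version 8,768, roughly an 8.4-fold reduction.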

Journal Article

Encoding Time Series as Multi-Scale Signed Recurrence Plots for Classification Using Fully Convolutional Networks

by Ouyang, Kewei; Hou, Yi; Zhang, Ye in Accuracy, Algorithms, Classification

2020

Recent advances in time series classification (TSC) have exploited deep neural networks (DNNs) to improve performance. One promising approach encodes time series as recurrence plot (RP) images in order to leverage state-of-the-art DNNs and improve accuracy. Such an approach has been shown to achieve impressive results, raising the interest of the community. However, it remains unsolved how to handle not only the variability in the distinctive region scale and the length of sequences but also the tendency confusion problem. In this paper, we tackle the problem using Multi-scale Signed Recurrence Plots (MS-RP), an improvement of RP, and propose a novel method based on MS-RP images and Fully Convolutional Networks (FCN) for TSC. This method first introduces the phase space dimension and time delay embedding of RP to produce multi-scale RP images; then, with the use of an asymmetrical structure, the constructed RP images can represent very long sequences (>700 points). Next, MS-RP images are obtained by multiplying by designed sign masks in order to remove the tendency confusion. Finally, an FCN is trained with MS-RP images to perform classification. Experimental results on 45 benchmark datasets demonstrate that our method improves on the state of the art in terms of classification accuracy and visualization evaluation.
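For readers unfamiliar with recurrence plots, the sketch below builds a plain thresholded RP from a time-delay embedding. It is a generic textbook construction, not the MS-RP method: the sign masks, asymmetrical structure, and multi-scale settings from the abstract are not reproduced, and the names `delay_embed`, `recurrence_plot`, and the threshold `eps` are assumptions for this example.

```python
import math


def delay_embed(series, dim, delay):
    """Map a 1D series to phase-space vectors (x[i], x[i+delay], ...)."""
    count = len(series) - (dim - 1) * delay
    return [[series[i + j * delay] for j in range(dim)] for i in range(count)]


def recurrence_plot(series, dim=2, delay=1, eps=0.5):
    """Binary matrix R where R[i][j] = 1 if states i and j lie within eps."""
    states = delay_embed(series, dim, delay)

    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return [[1 if distance(a, b) <= eps else 0 for b in states] for a in states]


if __name__ == "__main__":
    # A strictly alternating series revisits the same two states.
    for row in recurrence_plot([0, 1, 0, 1, 0]):
        print(row)
```

Changing `dim` and `delay` changes which states are compared, which is the kind of knob a multi-scale variant would vary.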

Journal Article

FMnet: Iris Segmentation and Recognition by Using Fully and Multi-Scale CNN for Biometric Security

by Ayoub, Naeem; Tobji, Rachida; Di, Wu in Algorithms, Artificial intelligence, Biometrics

2019

In deep learning, recent works show that neural networks have high potential in the field of biometric security. The advantage of using this type of architecture, in addition to being robust, is that the network learns the characteristic vectors by automatically creating intelligent filters, thanks to the convolution layers. In this paper, we propose an algorithm, “FMnet”, for iris recognition using a Fully Convolutional Network (FCN) and a Multi-scale Convolutional Neural Network (MCNN). By taking into consideration the ability of Convolutional Neural Networks to learn and work at different resolutions, our proposed iris recognition method overcomes the existing issues in classical methods, which only use handcrafted feature extraction, by performing feature extraction and classification together. Our proposed algorithm shows better classification results than other state-of-the-art iris recognition approaches.

Journal Article

DMINet: dense multi-scale inference network for salient object detection

by Ge, Bin; Gao, Xiuju; Sun, Yanguang in Artificial Intelligence, Computer Graphics, Computer Science

2022

Although salient object detection (SOD) methods based on fully convolutional networks have made extraordinary achievements, accurately detecting salient objects with complicated structures in cluttered real-world scenes remains a challenge, because these methods rarely consider the effectiveness and correlation of the context captured at different scales, or how to make complementary information interact efficiently. Motivated by this, in this paper, a novel Dense Multi-scale Inference Network (DMINet) is proposed for accurate SOD, which mainly consists of a dual-stream multi-receptive field module and a residual multi-mode interaction strategy. The former uses well-designed convolution operations with different receptive fields and dense guidance connections to efficiently capture and utilize multi-scale contextual features for better inference of salient objects, while the latter adopts diverse interaction schemes to adequately combine complementary information from multi-level features, generating powerful feature representations for predicting high-quality saliency maps. Quantitative and qualitative comparison results on five SOD datasets convincingly demonstrate that our DMINet performs favorably compared with 17 state-of-the-art SOD methods under different evaluation metrics.

Journal Article

Mapping and Discriminating Rural Settlements Using Gaofen-2 Images and a Fully Convolutional Network

by Huang, Lu; Ye, Ziran; Wang, Ke in China, Classification, Decision Making

2020

Ongoing rural construction has resulted in an extensive mixture of new settlements with old ones in the rural areas of China. Understanding the spatial characteristics of these rural settlements is of crucial importance as it provides essential information for land management and decision-making. Despite great advances in High Spatial Resolution (HSR) satellite images and deep learning techniques, mapping rural settlements accurately remains a challenging task because of their irregular morphology and distribution pattern. In this study, we propose a novel framework to map rural settlements by leveraging the merits of Gaofen-2 HSR images and the representation learning of deep learning. We combine a dilated residual convolutional network (Dilated-ResNet) and a multi-scale context subnetwork into an end-to-end architecture in order to learn high-resolution feature representations from HSR images and to aggregate and refine the multi-scale features extracted by the aforementioned network. Our experiment in Tongxiang city showed that the proposed framework effectively mapped and discriminated rural settlements with an overall accuracy of 98% and a Kappa coefficient of 85%, achieving performance comparable to or better than other existing methods. Our results bring tangible benefits to support other convolutional neural network (CNN)-based methods in accurate and timely rural settlement mapping, particularly when up-to-date ground truth is absent. The proposed method not only offers an effective way to extract rural settlements from HSR images but also opens a new opportunity to obtain a spatially explicit understanding of rural settlements.
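The two evaluation measures reported in this abstract, overall accuracy and the Kappa coefficient, can both be derived from a confusion matrix. The sketch below uses the standard definition of Cohen's kappa; the function name `accuracy_and_kappa` and the toy matrix are assumptions for illustration and are unrelated to the Tongxiang results.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(size)
    ) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Toy two-class matrix with balanced classes.
    print(accuracy_and_kappa([[45, 5], [5, 45]]))
```

Because kappa discounts agreement expected by chance, it can sit well below overall accuracy when one class dominates the map.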

Journal Article

A neural network ensemble method for effective crack segmentation using fully convolutional networks and multi-scale structured forests

by Zhang, Yinghui; Liu, Xiaoqin; Wang, Sen in Communications Engineering, Computer Science, Convolution

2020

Crack image segmentation has recently become a major research topic in nondestructive inspection. However, existing image segmentation methods are not robust to variations in illumination, weather, and noise, and their segmentation accuracy cannot meet the requirements of practical applications. Therefore, a neural network ensemble method is proposed for effective crack segmentation in this paper, which consists of fully convolutional networks (FCN) and multi-scale structured forests for edge detection (SFD). In order to improve the accuracy of crack segmentation and reduce mislabeling against complex backgrounds, a new network model based on the FCN model is proposed to address the loss of local information and of partial refinement capacity, problems frequently encountered when the FCN model is applied to crack segmentation. In addition, SFD is combined with the half-reconstruction method of the anti-symmetrical bi-orthogonal wavelet to overcome the limitations of crack edge detection. Finally, the two resulting maps are merged after resizing to the original image dimensions. Qualitative and quantitative evaluations of the proposed methods are performed, showing that they can obtain better results than certain existing methods for crack segmentation.

Journal Article

Semantic image segmentation using fully convolutional neural networks with multi-scale images and multi-scale dilated convolutions

by Sang-Woong, Lee; Vo, Duc My in Accuracy, Artificial neural networks, Computer architecture

2018

In this work, we investigate the effects of the cascade architecture of dilated convolutions and the deep network architecture of multi-resolution input images on the accuracy of semantic segmentation. We show that a cascade of dilated convolutions is not only able to efficiently capture larger context without increasing computational costs, but can also improve the localization performance. In addition, the deep network architecture for multi-resolution input images increases the accuracy of semantic segmentation by aggregating multi-scale contextual information. Furthermore, our fully convolutional neural network is coupled with a model of fully connected conditional random fields to further remove isolated false positives and improve the prediction along object boundaries. We present several experiments on two challenging image segmentation datasets, showing substantial improvements over strong baselines.
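The claim that a cascade of dilated convolutions captures larger context at no extra cost can be checked with the standard receptive-field formula for stacked stride-1 convolutions. The sketch below is a generic calculation; the dilation rates in the example are assumptions and are not the rates used in the paper.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions with given dilations.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


if __name__ == "__main__":
    # Three 3 x 3 layers: undilated versus a hypothetical 1, 2, 4 cascade.
    print(receptive_field(3, [1, 1, 1]))
    print(receptive_field(3, [1, 2, 4]))
```

Both stacks have the same number of weights per layer, yet the dilated cascade sees a 15-pixel-wide window where the undilated stack sees 7.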

Journal Article
