Catalogue Search | MBRL

Automatic Building Segmentation of Aerial Imagery Using Multi-Constraint Fully Convolutional Networks

by Xu, Yongwei , Shibasaki, Ryosuke , Guo, Zhiling in aerial imagery , building detection , convolutional neural network

2018

Automatic building segmentation from aerial imagery is an important and challenging task because of the variety of backgrounds, building textures and imaging conditions. Currently, research using variant types of fully convolutional networks (FCNs) has largely improved the performance of this task. However, pursuing more accurate segmentation results is still critical for further applications such as automatic mapping. In this study, a multi-constraint fully convolutional network (MC–FCN) model is proposed to perform end-to-end building segmentation. Our MC–FCN model consists of a bottom-up/top-down fully convolutional architecture and multi-constraints that are computed between the binary cross entropy of prediction and the corresponding ground truth. Since more constraints are applied to optimize the parameters of the intermediate layers, the multi-scale feature representation of the model is further enhanced, and hence higher performance can be achieved. The experiments on a very-high-resolution aerial image dataset covering 18 km 2 and more than 17,000 buildings indicate that our method performs well in the building segmentation task. The proposed MC–FCN method significantly outperforms the classic FCN method and the adaptive boosting method using features extracted by the histogram of oriented gradients. Compared with the state-of-the-art U–Net model, MC–FCN gains 3.2% (0.833 vs. 0.807) and 2.2% (0.893 vs. 0.874) relative improvements of Jaccard index and kappa coefficient with the cost of only 1.8% increment of the model-training time. In addition, the sensitivity analysis demonstrates that constraints at different positions have inconsistent impact on the performance of the MC–FCN.

Journal Article

Share this book

Add to My Shelf

A Novel Deep Fully Convolutional Network for PolSAR Image Classification

by Chen, Yanqiao , Liu, Guangyuan , Jiao, Licheng in Artificial intelligence , Artificial neural networks , Classification

2018

Polarimetric synthetic aperture radar (PolSAR) image classification has become more and more popular in recent years. As we all know, PolSAR image classification is actually a dense prediction problem. Fortunately, the recently proposed fully convolutional network (FCN) model can be used to solve the dense prediction problem, which means that FCN has great potential in PolSAR image classification. However, there are some problems to be solved in PolSAR image classification by FCN. Therefore, we propose sliding window fully convolutional network and sparse coding (SFCN-SC) for PolSAR image classification. The merit of our method is twofold: (1) Compared with convolutional neural network (CNN), SFCN-SC can avoid repeated calculation and memory occupation; (2) Sparse coding is used to reduce the computation burden and memory occupation, and meanwhile the image integrity can be maintained in the maximum extent. We use three PolSAR images to test the performance of SFCN-SC. Compared with several state-of-the-art methods, SFCN-SC achieves promising results in PolSAR image classification.

Journal Article

Share this book

Add to My Shelf

Sustainable Urban Green Blue Space (UGBS) and Public Participation: Integrating Multisensory Landscape Perception from Online Reviews

by Danqing Li , Shuguang Ning , Jiao Zhang in Agriculture , Cities , Citizen participation

2023

The integration of multisensory-based public subjective perception into planning, management, and policymaking is of great significance for the sustainable development and protection of UGBS. Online reviews are a suitable data source for this issue, which includes information about public sentiment, perception of the physical environment, and sensory description. This study adopts the deep learning method to obtain effective information from online reviews and found that in 105 major sites of Tokyo (23 districts), the public overall perception level is not balanced. Rich multi-sense will promote the perception level, especially hearing and somatosensory senses that have a higher positive prediction effect than vision, and overall perception can start improving by optimizing these two senses. Even if only one adverse sense exists, it will seriously affect the perception level, such as bad smell and noise. Optimizing the physical environment by adding natural elements for different senses is conducive to overall perception. Sensory maps can help to quickly find areas that require improvement. This study provides a new method for rapid multisensory analysis and complementary public participation for specific situations, which helps to increase the well-being of UGBS and give play to its multi-functionality.

Journal Article

Share this book

Add to My Shelf

Adversarial Reconstruction-Classification Networks for PolSAR Image Classification

by Chen, Yanqiao , Shang, Ronghua , Jiao, Licheng in adversarial reconstruction-classification networks (ARCN) , Classification , Decomposition

2019

Polarimetric synthetic aperture radar (PolSAR) image classification has become more and more widely used in recent years. It is well known that PolSAR image classification is a dense prediction problem. The recently proposed fully convolutional networks (FCN) model, which is very good at dealing with the dense prediction problem, has great potential in resolving the task of PolSAR image classification. Nevertheless, for FCN, there are some problems to solve in PolSAR image classification. Fortunately, Li et al. proposed the sliding window fully convolutional networks (SFCN) model to tackle the problems of FCN in PolSAR image classification. However, only when the labeled training sample is sufficient, can SFCN achieve good classification results. To address the above mentioned problem, we propose adversarial reconstruction-classification networks (ARCN), which is based on SFCN and introduces reconstruction-classification networks (RCN) and adversarial training. The merit of our method is threefold: (i) A single composite representation that encodes information for supervised image classification and unsupervised image reconstruction can be constructed; (ii) By introducing adversarial training, the higher-order inconsistencies between the true image and reconstructed image can be detected and revised. Our method can achieve impressive performance in PolSAR image classification with fewer labeled training samples. We have validated its performance by comparing it against several state-of-the-art methods. Experimental results obtained by classifying three PolSAR images demonstrate the efficiency of the proposed method.

Journal Article

Share this book

Add to My Shelf

A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection

by Shi, Zhenwei , Chen, Hao in Algorithms , attention mechanism , Change detection

2020

Remote sensing image change detection (CD) is done to identify desired significant changes between bitemporal images. Given two co-registered images taken at different times, the illumination variations and misregistration errors overwhelm the real object changes. Exploring the relationships among different spatial–temporal pixels may improve the performances of CD methods. In our work, we propose a novel Siamese-based spatial–temporal attention neural network. In contrast to previous methods that separately encode the bitemporal images without referring to any useful spatial–temporal dependency, we design a CD self-attention mechanism to model the spatial–temporal relationships. We integrate a new CD self-attention module in the procedure of feature extraction. Our self-attention module calculates the attention weights between any two pixels at different times and positions and uses them to generate more discriminative features. Considering that the object may have different scales, we partition the image into multi-scale subregions and introduce the self-attention in each subregion. In this way, we could capture spatial–temporal dependencies at various scales, thereby generating better representations to accommodate objects of various sizes. We also introduce a CD dataset LEVIR-CD, which is two orders of magnitude larger than other public datasets of this field. LEVIR-CD consists of a large set of bitemporal Google Earth images, with 637 image pairs (1024 × 1024) and over 31 k independently labeled change instances. Our proposed attention module improves the F1-score of our baseline model from 83.9 to 87.3 with acceptable computational overhead. Experimental results on a public remote sensing image CD dataset show our method outperforms several other state-of-the-art methods.

Journal Article

Share this book

Add to My Shelf

Deep Convolutional Neural Network for Flood Extent Mapping Using Unmanned Aerial Vehicles Data

by Hashemi-Beni, Leila , Thompson, Gary , Langan, Thomas E. in convolutional neural networks , floodplain mapping , fully convolutional network

2019

Flooding is one of the leading threats of natural disasters to human life and property, especially in densely populated urban areas. Rapid and precise extraction of the flooded areas is key to supporting emergency-response planning and providing damage assessment in both spatial and temporal measurements. Unmanned Aerial Vehicles (UAV) technology has recently been recognized as an efficient photogrammetry data acquisition platform to quickly deliver high-resolution imagery because of its cost-effectiveness, ability to fly at lower altitudes, and ability to enter a hazardous area. Different image classification methods including SVM (Support Vector Machine) have been used for flood extent mapping. In recent years, there has been a significant improvement in remote sensing image classification using Convolutional Neural Networks (CNNs). CNNs have demonstrated excellent performance on various tasks including image classification, feature extraction, and segmentation. CNNs can learn features automatically from large datasets through the organization of multi-layers of neurons and have the ability to implement nonlinear decision functions. This study investigates the potential of CNN approaches to extract flooded areas from UAV imagery. A VGG-based fully convolutional network (FCN-16s) was used in this research. The model was fine-tuned and a k-fold cross-validation was applied to estimate the performance of the model on the new UAV imagery dataset. This approach allowed FCN-16s to be trained on the datasets that contained only one hundred training samples, and resulted in a highly accurate classification. Confusion matrix was calculated to estimate the accuracy of the proposed method. The image segmentation results obtained from FCN-16s were compared from the results obtained from FCN-8s, FCN-32s and SVMs. Experimental results showed that the FCNs could extract flooded areas precisely from UAV images compared to the traditional classifiers such as SVMs. The classification accuracy achieved by FCN-16s, FCN-8s, FCN-32s, and SVM for the water class was 97.52%, 97.8%, 94.20% and 89%, respectively.

Journal Article

Share this book

Add to My Shelf

Enhanced Millimeter-Wave 3-D Imaging via Complex-Valued Fully Convolutional Neural Network

by Cui, Xiaoxi , Miao, Ke , Jing, Handan in Algorithms , Artificial neural networks , Back propagation networks

2022

To solve the problems of high computational complexity and unstable image quality inherent in the compressive sensing (CS) method, we propose a complex-valued fully convolutional neural network (CVFCNN)-based method for near-field enhanced millimeter-wave (MMW) three-dimensional (3-D) imaging. A generalized form of the complex parametric rectified linear unit (CPReLU) activation function with independent and learnable parameters is presented to improve the performance of CVFCNN. The CVFCNN structure is designed, and the formulas of the complex-valued back-propagation algorithm are derived in detail, in response to the lack of a machine learning library for a complex-valued neural network (CVNN). Compared with a real-valued fully convolutional neural network (RVFCNN), the proposed CVFCNN offers better performance while needing fewer parameters. In addition, it outperforms the CVFCNN that was used in radar imaging with different activation functions. Numerical simulations and experiments are provided to verify the efficacy of the proposed network, in comparison with state-of-the-art networks and the CS method for enhanced MMW imaging.

Journal Article

Share this book

Add to My Shelf

Automated cardiovascular magnetic resonance image analysis with fully convolutional networks

by Tarroni, Giacomo , Sanghvi, Mihir M. , Kainz, Bernhard in Aged , Algorithms , Angiology

2018

Background Cardiovascular resonance (CMR) imaging is a standard imaging modality for assessing cardiovascular diseases (CVDs), the leading cause of death globally. CMR enables accurate quantification of the cardiac chamber volume, ejection fraction and myocardial mass, providing information for diagnosis and monitoring of CVDs. However, for years, clinicians have been relying on manual approaches for CMR image analysis, which is time consuming and prone to subjective errors. It is a major clinical challenge to automatically derive quantitative and clinically relevant information from CMR images. Methods Deep neural networks have shown a great potential in image pattern recognition and segmentation for a variety of tasks. Here we demonstrate an automated analysis method for CMR images, which is based on a fully convolutional network (FCN). The network is trained and evaluated on a large-scale dataset from the UK Biobank, consisting of 4,875 subjects with 93,500 pixelwise annotated images. The performance of the method has been evaluated using a number of technical metrics, including the Dice metric, mean contour distance and Hausdorff distance, as well as clinically relevant measures, including left ventricle (LV) end-diastolic volume (LVEDV) and end-systolic volume (LVESV), LV mass (LVM); right ventricle (RV) end-diastolic volume (RVEDV) and end-systolic volume (RVESV). Results By combining FCN with a large-scale annotated dataset, the proposed automated method achieves a high performance in segmenting the LV and RV on short-axis CMR images and the left atrium (LA) and right atrium (RA) on long-axis CMR images. On a short-axis image test set of 600 subjects, it achieves an average Dice metric of 0.94 for the LV cavity, 0.88 for the LV myocardium and 0.90 for the RV cavity. The mean absolute difference between automated measurement and manual measurement is 6.1 mL for LVEDV, 5.3 mL for LVESV, 6.9 gram for LVM, 8.5 mL for RVEDV and 7.2 mL for RVESV. On long-axis image test sets, the average Dice metric is 0.93 for the LA cavity (2-chamber view), 0.95 for the LA cavity (4-chamber view) and 0.96 for the RA cavity (4-chamber view). The performance is comparable to human inter-observer variability. Conclusions We show that an automated method achieves a performance on par with human experts in analysing CMR images and deriving clinically relevant measures.

Journal Article

Share this book

Add to My Shelf

SDFCNv2: An Improved FCN Framework for Remote Sensing Images Semantic Segmentation

by Guo, Beibei , Tan, Xiaoliang , Zhang, Xiaodong in Accuracy , Algorithms , Computer vision

2021

Semantic segmentation is a fundamental task in remote sensing image analysis (RSIA). Fully convolutional networks (FCNs) have achieved state-of-the-art performance in the task of semantic segmentation of natural scene images. However, due to distinctive differences between natural scene images and remotely-sensed (RS) images, FCN-based semantic segmentation methods from the field of computer vision cannot achieve promising performances on RS images without modifications. In previous work, we proposed an RS image semantic segmentation framework SDFCNv1, combined with a majority voting postprocessing method. Nevertheless, it still has some drawbacks, such as small receptive field and large number of parameters. In this paper, we propose an improved semantic segmentation framework SDFCNv2 based on SDFCNv1, to conduct optimal semantic segmentation on RS images. We first construct a novel FCN model with hybrid basic convolutional (HBC) blocks and spatial-channel-fusion squeeze-and-excitation (SCFSE) modules, which occupies a larger receptive field and fewer network model parameters. We also put forward a data augmentation method based on spectral-specific stochastic-gamma-transform-based (SSSGT-based) during the model training process to improve generalizability of our model. Besides, we design a mask-weighted voting decision fusion postprocessing algorithm for image segmentation on overlarge RS images. We conducted several comparative experiments on two public datasets and a real surveying and mapping dataset. Extensive experimental results demonstrate that compared with the SDFCNv1 framework, our SDFCNv2 framework can increase the mIoU metric by up to 5.22% while only using about half of parameters.

Journal Article

Share this book

Add to My Shelf

Ship Detection in Gaofen-3 SAR Images Based on Sea Clutter Distribution Analysis and Deep Convolutional Neural Network

by An, Quanzhi , Pan, Zongxu , You, Hongjian in deep convolutional neural network , fully convolutional network , Gaofen-3

2018

Target detection is one of the important applications in the field of remote sensing. The Gaofen-3 (GF-3) Synthetic Aperture Radar (SAR) satellite launched by China is a powerful tool for maritime monitoring. This work aims at detecting ships in GF-3 SAR images using a new land masking strategy, the appropriate model for sea clutter and a neural network as the discrimination scheme. Firstly, the fully convolutional network (FCN) is applied to separate the sea from the land. Then, by analyzing the sea clutter distribution in GF-3 SAR images, we choose the probability distribution model of Constant False Alarm Rate (CFAR) detector from K-distribution, Gamma distribution and Rayleigh distribution based on a tradeoff between the sea clutter modeling accuracy and the computational complexity. Furthermore, in order to better implement CFAR detection, we also use truncated statistic (TS) as a preprocessing scheme and iterative censoring scheme (ICS) for boosting the performance of detector. Finally, we employ a neural network to re-examine the results as the discrimination stage. Experiment results on three GF-3 SAR images verify the effectiveness and efficiency of this approach.

Journal Article

Share this book

Add to My Shelf

Language Selector

MBRLGlobalSearch

Language Selector

Catalogue Search | MBRL

Search Results Heading

Explore the vast range of titles available.

MBRLSearchResults

MBRLHappinessMeter