Catalogue Search | MBRL
4,679 result(s) for "building detection"
Damaged Building Extraction Using Modified Mask R-CNN Model Using Post-Event Aerial Images of the 2016 Kumamoto Earthquake
2022
Remote sensing is an effective method of evaluating building damage after a large-scale natural disaster, such as an earthquake or a typhoon. In recent years, with the development of computer vision technology, deep learning algorithms have been used for damage assessment from aerial images. In April 2016, a series of earthquakes hit the Kyushu region, Japan, and caused severe damage in the Kumamoto and Oita Prefectures. Numerous buildings collapsed because of the strong and continuous shaking. In this study, a deep learning model called Mask R-CNN was modified to extract residential buildings and estimate their damage levels from post-event aerial images. Our Mask R-CNN model employs an improved feature pyramid network and online hard example mining. Furthermore, a non-maximum suppression algorithm across multiple classes was also applied to improve prediction. The aerial images captured on 29 April 2016 (two weeks after the main shock) in Mashiki Town, Kumamoto Prefecture, were used as the training and test sets. Compared with the field survey results, our model achieved approximately 95% accuracy for building extraction and over 92% accuracy for the detection of severely damaged buildings. The overall classification accuracy for the four damage classes was approximately 88%, demonstrating acceptable performance.
Journal Article
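The abstract above mentions applying non-maximum suppression across multiple damage classes so that one building is not reported under several classes at once. A minimal, hypothetical sketch of class-agnostic NMS over axis-aligned boxes (not the authors' implementation; box values and class names are illustrative):

```python
def iou(a, b):
    # boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def cross_class_nms(dets, thr=0.5):
    """dets: list of (box, score, cls). Suppression ignores cls,
    so an overlapping building keeps only its highest-scoring class."""
    dets = sorted(dets, key=lambda d: d[1], reverse=True)
    kept = []
    for box, score, cls in dets:
        if all(iou(box, k[0]) < thr for k in kept):
            kept.append((box, score, cls))
    return kept
```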
Village Building Identification Based on Ensemble Convolutional Neural Networks
by Chen, Qi; Xu, Yongwei; Shibasaki, Ryosuke
in building detection, Ensemble Convolutional Neural Networks, Identification
2017
In this study, we present the Ensemble Convolutional Neural Network (ECNN), a CNN framework built by ensembling state-of-the-art CNN models, to identify village buildings from open high-resolution remote sensing (HRRS) images. First, to exploit the capability of CNNs for village mapping and to ensure compatibility with our classification targets, several state-of-the-art models were carefully optimized and enhanced through a series of rigorous analyses and evaluations. Second, rather than applying these models to building identification directly, we combined their feature-extractor parts into a stronger model, the ECNN, using a multiscale feature learning method. Finally, the resulting ECNN was embedded in a pixel-level classification framework to perform object identification. The proposed method can serve as a viable tool for village building identification with high accuracy and efficiency. Experimental results from the test area in Savannakhet Province, Laos, show that the proposed ECNN model significantly outperforms existing methods, improving overall accuracy from 96.64% to 99.26% and kappa from 0.57 to 0.86.
Journal Article
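The ECNN above fuses the feature-extractor parts of several CNNs. As a much-simplified, hypothetical illustration of ensembling, the sketch below averages per-pixel class probabilities from several models rather than fusing their features (a sketch, not the paper's method):

```python
def ensemble_predict(prob_maps, weights=None):
    """Weighted average of class-probability vectors from several CNNs.
    prob_maps: list of equally sized probability lists, one per model,
    e.g. [p_building, p_background]. Defaults to equal weights."""
    n = len(prob_maps)
    weights = weights or [1.0 / n] * n
    fused = [0.0] * len(prob_maps[0])
    for w, pm in zip(weights, prob_maps):
        for i, p in enumerate(pm):
            fused[i] += w * p
    return fused
```

The averaged vector can then be thresholded or arg-maxed per pixel to produce the final building mask.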
Civil Infrastructure Damage and Corrosion Detection: An Application of Machine Learning
by Munawar, Hafiz; Qayyum, Siddra; Akram, Junaid
in Accuracy, Artificial intelligence, Artificial neural networks
2022
Automatic detection of corrosion and associated damage to civil infrastructure such as bridges, buildings, and roads from aerial images captured by an Unmanned Aerial Vehicle (UAV) helps overcome the objectivity and reliability shortcomings of manual inspection methods. Deep learning methods have been widely reported in the literature for civil infrastructure corrosion detection. Among them, convolutional neural networks (CNNs) are promising for the automatic detection of image features and are comparatively robust to image noise. In the current study, we therefore propose a modified deep hierarchical CNN architecture, based on 16 convolution layers and a cycle generative adversarial network (CycleGAN), to predict pixel-wise segmentation in an end-to-end manner using images of the Bolte Bridge and sky rail areas in Victoria (Melbourne). The proposed network learns and aggregates multi-scale and multilevel features while moving from the low convolutional layers to the high-level layers, with the CycleGAN reducing consistency loss in the images. Whereas standard approaches use only the last convolutional layer, our architecture uses multiple layers. Moreover, guided filtering and Conditional Random Fields (CRFs) are used to refine the prediction results. The effectiveness of the proposed architecture was assessed on a benchmark dataset of 600 images of civil infrastructure. Overall, the results show that the 16-layer deep hierarchical CNN architecture outperformed the compared methods, including the baseline, PSPNet, DeepLab, and SegNet.
The extended method achieved Global Accuracy (GA), Class Average Accuracy (CAC), mean Intersection over Union (IoU), Precision (P), Recall (R), and F-score values of 0.989, 0.931, 0.878, 0.849, 0.818, and 0.833, respectively.
Journal Article
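The F-score quoted above is consistent with the stated precision and recall; as a quick check, the harmonic mean can be computed as follows (a minimal sketch, not the authors' evaluation code):

```python
def f_score(precision, recall):
    # F1: harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# round(f_score(0.849, 0.818), 3) -> 0.833, matching the reported value
```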
Automatic Building Segmentation of Aerial Imagery Using Multi-Constraint Fully Convolutional Networks
by Xu, Yongwei; Shibasaki, Ryosuke; Guo, Zhiling
in aerial imagery, building detection, convolutional neural network
2018
Automatic building segmentation from aerial imagery is an important and challenging task because of the variety of backgrounds, building textures, and imaging conditions. Research using variants of fully convolutional networks (FCNs) has substantially improved performance on this task, but more accurate segmentation remains critical for further applications such as automatic mapping. In this study, a multi-constraint fully convolutional network (MC–FCN) model is proposed to perform end-to-end building segmentation. Our MC–FCN model consists of a bottom-up/top-down fully convolutional architecture and multiple constraints, each computed as the binary cross entropy between a prediction and the corresponding ground truth. Since additional constraints are applied to optimize the parameters of the intermediate layers, the multi-scale feature representation of the model is further enhanced, and higher performance can be achieved. Experiments on a very-high-resolution aerial image dataset covering 18 km² and more than 17,000 buildings indicate that our method performs well on the building segmentation task. The proposed MC–FCN significantly outperforms the classic FCN and an adaptive boosting method using histogram-of-oriented-gradients features. Compared with the state-of-the-art U–Net model, MC–FCN gains relative improvements of 3.2% (0.833 vs. 0.807) in Jaccard index and 2.2% (0.893 vs. 0.874) in kappa coefficient, at the cost of only a 1.8% increase in model-training time. In addition, a sensitivity analysis demonstrates that constraints at different positions have different impacts on the performance of the MC–FCN.
Journal Article
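The Jaccard index and kappa coefficient quoted above can be computed for binary building masks as follows (a minimal sketch on flat 0/1 lists, not the authors' evaluation code):

```python
def jaccard(pred, truth):
    """Jaccard index (IoU) of two binary masks given as flat 0/1 lists."""
    inter = sum(p & t for p, t in zip(pred, truth))
    union = sum(p | t for p, t in zip(pred, truth))
    return inter / union if union else 1.0

def kappa(pred, truth):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(pred)
    po = sum(p == t for p, t in zip(pred, truth)) / n       # observed
    p1p, p1t = sum(pred) / n, sum(truth) / n
    pe = p1p * p1t + (1 - p1p) * (1 - p1t)                  # expected by chance
    return (po - pe) / (1 - pe) if pe != 1 else 1.0
```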
Remote Sensing Image-Based Building Change Detection: A Case Study of the Qinling Mountains in China
2025
With the widespread application of deep learning in Earth observation, remote sensing image-based building change detection has achieved numerous groundbreaking advancements. However, differences across time periods caused by temporal variations in land cover, as well as the complex spatial structures in remote sensing scenes, significantly constrain the performance of change detection. To address these challenges, a change detection algorithm based on spatio-spectral information aggregation is proposed, consisting of two key modules: the Cross-Scale Heterogeneous Convolution module (CSHConv) and the Spatio-Spectral Information Fusion module (SSIF). CSHConv mitigates the information loss caused by scale heterogeneity, thereby improving the utilization of multi-scale features. SSIF models spatial and spectral information jointly, capturing interactions across different spatial scales and spectral domains. The approach is illustrated with a case study on QL-CD (Qinling change detection), a real-world dataset constructed as part of this work from 12,724 image pairs captured by the Gaofen-1 satellite over the Qinling region of China. Experimental results demonstrate that the proposed approach outperforms a wide range of state-of-the-art algorithms.
Journal Article
Automatic Building Change Detection on Aerial Images using Convolutional Neural Networks and Handcrafted Features
by Quispe, Diego Alonso Javier; Sulla-Torres, Jose
in Algorithms, Artificial neural networks, Buildings
2020
In this article, we present a new framework for building change detection that uses a convolutional neural network (CNN) for the building detection step and a set of handcrafted features for the change detection step. Buildings are extracted with Mask R-CNN, a neural network for object-based instance segmentation that has been tested in different case studies on various types of objects with good results. Buildings are detected in bitemporal images, and three comparison metrics, MSE, PSNR, and SSIM, computed on the Hue, Saturation, and Brightness representation of the images, are used to determine whether buildings have changed. Finally, the features are classified by two algorithms, Support Vector Machine and Random Forest, so that their results can be compared. The experiments were performed on the large WHU building dataset, which contains very high-resolution (VHR) aerial images. The results obtained are comparable to the state of the art.
Journal Article
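Of the three comparison metrics named above, MSE and PSNR are straightforward to sketch for a pair of bitemporal patches; the decision threshold below is illustrative, not from the paper:

```python
import math

def mse(a, b):
    """Mean squared error between two equal-sized grayscale patches (flat lists, 0-255)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means more similar (less change)."""
    m = mse(a, b)
    return float("inf") if m == 0 else 10 * math.log10(peak * peak / m)

def changed(a, b, psnr_thr=30.0):
    # flag a building patch as changed when similarity drops below a threshold
    return psnr(a, b) < psnr_thr
```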
Method of Building Detection in Optical Remote Sensing Images Based on SegFormer
2023
An appropriate detection network is required to extract building information from remote sensing images and to mitigate the poor detection performance that results from a deficiency of detailed features. Firstly, building on the SegFormer network, we embed in the decoder a transposed-convolution sampling module that fuses multiple normalization and activation layers. Through dilation and padding, this alleviates missing feature semantics, while the cascaded normalization and activation layers restrain over-fitting and stabilize feature classification. Secondly, an atrous spatial pyramid pooling decoding module is fused in to exploit multi-scale contextual information and to counter the loss of detailed information on local buildings and the lack of long-distance information. Ablation and comparison experiments are performed on the AISD, MBD, and WHU remote sensing datasets. The robustness and validity of the improved mechanism are demonstrated by the ablation control groups. In comparative experiments with the HRNet, PSPNet, U-Net, and DeepLabv3+ networks and the original detection algorithm, the mIoU on the AISD, MBD, and WHU datasets is enhanced by 17.68%, 30.44%, and 15.26%, respectively. The experimental results show that the proposed method is superior to comparative methods such as U-Net, better preserves the integrity of building edges, and reduces missed and false detections.
Journal Article
MAR-YOLO: multi-scale feature adaptive selection and asymptotic pyramid for oriented building detection in remote sensing images
2025
Building detection in remote sensing imagery confronts three interdependent challenges: extreme scale variance under dense spatial distributions, orientation instability in off-nadir imagery, and semantic gaps during multi-scale feature fusion. Existing methods address these challenges in isolation, so performance degrades when they co-occur. MAR-YOLO establishes an integrated rotated-object detection framework extending YOLOv11-OBB through three synergistic innovations. The Multi-scale Feature Adaptive Selection (MFAS) module adaptively filters P2-P5 features through dual-domain weighting, enhancing small-building perception while suppressing redundancy. The adapted Adaptive Feature Pyramid Network (AFPN) employs progressive fusion with scale-matched kernels and learned spatial weights, eliminating the semantic inconsistencies inherent in direct multi-scale concatenation. The RepVGG-based Enhanced Rotated Detection Head (RRD-Head) applies branch-specialized structural reparameterization to address angle-regression instability. Validation on BONAI demonstrates 87.2% mAP50 and 65.3% mAP50-95, improvements of 2.9% and 2.6% over YOLOv11s-OBB, at 95 FPS. Cross-dataset experiments on DOTA, DIOR-R, and HRSC2016 confirm architectural robustness across diverse detection scenarios.
Journal Article
Stacked Autoencoders Driven by Semi-Supervised Learning for Building Extraction from near Infrared Remote Sensing Imagery
by Doulamis, Anastasios; Maltezos, Evangelos; Protopapadakis, Eftychios
in building detection, color, data collection
2021
In this paper, we propose a Stacked Autoencoder (SAE)-driven, Semi-Supervised Learning (SSL)-based Deep Neural Network (DNN) to extract buildings from relatively low-cost satellite near-infrared images. The novelty of our scheme is that we employ only an extremely small portion of labeled data for training the deep model, less than 0.08% of the total data. This significantly reduces the manual effort needed for annotation and thus the time required to create a reliable labeled dataset. Instead, we apply semi-supervised techniques to estimate soft labels (targets) for the vast amount of unlabeled data and then use these soft estimates to improve model training. Four SSL schemes are employed: the Anchor Graph, Safe Semi-Supervised Regression (SAFER), Squared-loss Mutual Information Regularization (SMIR), and an equal-importance Weighted Average of them (WeiAve). To retain only the most meaningful information of the input data, both labeled and unlabeled, we also employ a Stacked Autoencoder (SAE) trained in an unsupervised manner. In this way we handle noise in the input signals, attributed to dimensionality redundancy, without sacrificing meaningful information. Experimental results on the benchmark dataset of Vaihingen city in Germany indicate that our approach outperforms all state-of-the-art methods in the field that use the same type of color orthoimages, despite using a limited dataset (10 times less data or better than other approaches), while our performance is close to that achieved with far more expensive and precise input information, such as that derived from Light Detection and Ranging (LiDAR) sensors. In addition, the proposed approach can easily be expanded to handle any number of classes, including buildings, vegetation, and ground.
Journal Article
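The WeiAve scheme above is an equal-importance weighted average of the soft labels produced by the individual SSL schemes; a minimal sketch (not the authors' code):

```python
def weiave(soft_label_sets):
    """Equal-importance average of soft labels from several SSL schemes.
    soft_label_sets: one list of per-sample soft labels per scheme,
    e.g. [anchor_graph_labels, safer_labels, smir_labels]."""
    n = len(soft_label_sets)
    return [sum(vals) / n for vals in zip(*soft_label_sets)]
```

The averaged soft labels then serve as training targets for the unlabeled samples.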
2D Image-To-3D Model: Knowledge-Based 3D Building Reconstruction (3DBR) Using Single Aerial Images and Convolutional Neural Networks (CNNs)
by Alidoost, Fatemeh; Arefi, Hossein; Tombari, Federico
in aerial photography, Algorithms, Artificial neural networks
2019
In this study, a deep learning (DL)-based approach is proposed for the detection and reconstruction of buildings from a single aerial image. The knowledge required to reconstruct the 3D shapes of buildings, including the height data as well as the linear elements of individual roofs, is derived from the RGB image using an optimized multi-scale convolutional–deconvolutional network (MSCDN). The proposed network is composed of two feature extraction levels that first predict coarse features and then automatically refine them. The predicted features include normalized digital surface models (nDSMs) and linear roof elements in three classes: eave, ridge, and hip lines. Prismatic building models are then generated by analyzing the eave lines, and parametric models of individual roofs are reconstructed using the predicted ridge and hip lines. The experiments show that, even in the presence of noise in the height values, the proposed method performs well on the 3D reconstruction of buildings of different shapes and complexities. The average root mean square error (RMSE) and normalized median absolute deviation (NMAD) are about 3.43 m and 1.13 m, respectively, for the predicted nDSM. Moreover, the quality of the extracted linear elements is about 91.31% and 83.69% for the Potsdam and Zeebrugge test data, respectively. Unlike state-of-the-art methods, the proposed approach does not need any additional or auxiliary data: it uses a single image to reconstruct 3D building models with a competitive precision of about 1.2 m and 0.8 m for the horizontal and vertical RMSEs over the Potsdam data and about 3.9 m and 2.4 m over the Zeebrugge test data.
Journal Article
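The RMSE and NMAD metrics reported in the entry above can be sketched for a list of height residuals as follows (a minimal sketch; the 1.4826 factor is the standard normalization that makes the MAD comparable to the standard deviation of a normal distribution):

```python
import math
import statistics

def rmse(errors):
    """Root mean square error of height residuals (predicted - reference, in metres)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def nmad(errors):
    """Normalized median absolute deviation, a robust spread estimate
    that is less sensitive to outliers than RMSE."""
    med = statistics.median(errors)
    return 1.4826 * statistics.median(abs(e - med) for e in errors)
```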