Catalogue Search | MBRL
55 result(s) for "fully convolutional network (FCN)"
Classification for High Resolution Remote Sensing Imagery Using a Fully Convolutional Network
2017
As a variant of Convolutional Neural Networks (CNNs) in Deep Learning, the Fully Convolutional Network (FCN) model achieved state-of-the-art performance for natural image semantic segmentation. In this paper, an accurate classification approach for high resolution remote sensing imagery based on the improved FCN model is proposed. Firstly, we improve the density of output class maps by introducing Atrous convolution, and secondly, we design a multi-scale network architecture by adding a skip-layer structure to make it capable of multi-resolution image classification. Finally, we further refine the output class map using Conditional Random Fields (CRFs) post-processing. Our classification model is trained on 70 GF-2 true color images, and tested on the other 4 GF-2 images and 3 IKONOS true color images. We also employ object-oriented classification, patch-based CNN classification, and the FCN-8s approach on the same images for comparison. The experiments show that compared with the existing approaches, our approach has an obvious improvement in accuracy. The average precision, recall, and Kappa coefficient of our approach are 0.81, 0.78, and 0.83, respectively. The experiments also prove that our approach has strong applicability for multi-resolution image classification.
Journal Article
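As an illustration of the atrous (dilated) convolution mentioned in the abstract above, here is a minimal 1-D sketch in plain Python. It is not the paper's implementation: the function name `atrous_conv1d`, the toy signal, and the kernel are assumptions chosen for illustration. It only shows how a dilation rate widens the span of input covered by a kernel without adding weights.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """Valid-mode 1-D atrous convolution (cross-correlation form).

    `rate` is the spacing between kernel taps; rate=1 is an ordinary
    convolution. All names here are illustrative assumptions.
    """
    # Number of input samples covered by one output value.
    span = (len(kernel) - 1) * rate + 1
    out = []
    for start in range(len(signal) - span + 1):
        # Sample the input every `rate` positions under the kernel taps.
        out.append(sum(k * signal[start + i * rate]
                       for i, k in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]   # toy input (assumption)
kernel = [1, 0, -1]              # toy three-tap kernel (assumption)

dense = atrous_conv1d(signal, kernel, rate=1)    # [-2, -2, -2, -2, -2]
dilated = atrous_conv1d(signal, kernel, rate=2)  # [-4, -4, -4]
print(dense, dilated)
```

With rate 2, the same three-tap kernel covers five input samples instead of three, which is the property that lets a network enlarge its receptive field without extra downsampling.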
A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection
2020
Remote sensing image change detection (CD) is done to identify desired significant changes between bitemporal images. Given two co-registered images taken at different times, the illumination variations and misregistration errors overwhelm the real object changes. Exploring the relationships among different spatial–temporal pixels may improve the performances of CD methods. In our work, we propose a novel Siamese-based spatial–temporal attention neural network. In contrast to previous methods that separately encode the bitemporal images without referring to any useful spatial–temporal dependency, we design a CD self-attention mechanism to model the spatial–temporal relationships. We integrate a new CD self-attention module in the procedure of feature extraction. Our self-attention module calculates the attention weights between any two pixels at different times and positions and uses them to generate more discriminative features. Considering that the object may have different scales, we partition the image into multi-scale subregions and introduce the self-attention in each subregion. In this way, we could capture spatial–temporal dependencies at various scales, thereby generating better representations to accommodate objects of various sizes. We also introduce a CD dataset LEVIR-CD, which is two orders of magnitude larger than other public datasets of this field. LEVIR-CD consists of a large set of bitemporal Google Earth images, with 637 image pairs (1024 × 1024) and over 31 k independently labeled change instances. Our proposed attention module improves the F1-score of our baseline model from 83.9 to 87.3 with acceptable computational overhead. Experimental results on a public remote sensing image CD dataset show our method outperforms several other state-of-the-art methods.
Journal Article
FCN attention enhancing asphalt pavement crack detection through attention mechanisms and fully convolutional networks
2025
This paper presents an innovative approach to detecting cracks in asphalt pavement using an FCN-attention model, which integrates attention mechanisms into a fully convolutional network (FCN) for enhanced pixel-level segmentation. The model employs a ResNet-50-based encoder and incorporates channel-wise and spatial attention modules to refine feature extraction and focus on the most relevant image regions. The results demonstrate that the FCN-attention model outperforms traditional models such as VGG-16, AlexNet, MobileNet, and GoogleNet across multiple evaluation metrics. Specifically, the FCN-attention model achieves a global accuracy rate of 90.79%, with a precision of 92.3%, recall of 89.5%, and an F1-score of 90.9%. Additionally, the model achieves an average intersection-over-union (IoU) ratio of 69.7% and a test duration of 109.1 ms per image. The proposed method also excels in crack length and width calculation, providing real-world dimensions for the detected cracks. The model’s effectiveness is further validated through an ablation study, which highlights the significant impact of the attention mechanism on model performance.
Journal Article
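The metrics quoted in the abstract above can be related with a short sketch. Assuming the standard definitions (F1 as the harmonic mean of precision and recall, IoU as intersection over union of binary masks), the reported precision of 92.3% and recall of 89.5% are consistent with the reported F1-score of 90.9%. The function names and the toy masks below are illustrative assumptions, not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def iou(pred, truth):
    """Intersection over union of two flat binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return inter / union if union else 1.0


# Values quoted in the abstract: precision 92.3%, recall 89.5%.
print(round(f1_score(0.923, 0.895), 3))  # 0.909

# Toy masks (assumption): one overlapping pixel out of three marked pixels.
print(iou([1, 1, 0, 0], [1, 0, 1, 0]))
```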
Advanced Global Prototypical Segmentation Framework for Few-Shot Hyperspectral Image Classification
by Xia, Mengen; Zhou, Hao; Xia, Kunming
in Classification, contrastive learning (CL), Deep learning
2024
With the advancement of deep learning, related networks have shown strong performance for Hyperspectral Image (HSI) classification. However, these methods face two main challenges in HSI classification: (1) the inability to capture global information of HSI due to the restriction of patch input and (2) insufficient utilization of information from limited labeled samples. To overcome these challenges, we propose an Advanced Global Prototypical Segmentation (AGPS) framework. Within the AGPS framework, we design a patch-free feature extractor segmentation network (SegNet) based on a fully convolutional network (FCN), which processes the entire HSI to capture global information. To enrich the global information extracted by SegNet, we propose a Fusion of Lateral Connection (FLC) structure that fuses the low-level detailed features of the encoder output with the high-level features of the decoder output. Additionally, we propose an Atrous Spatial Pyramid Pooling-Position Attention (ASPP-PA) module to capture multi-scale spatial positional information. Finally, to explore more valuable information from limited labeled samples, we propose an advanced global prototypical representation learning strategy. Building upon the dual constraints of the global prototypical representation learning strategy, we introduce supervised contrastive learning (CL), which optimizes our network with three different constraints. The experimental results of three public datasets demonstrate that our method outperforms the existing state-of-the-art methods.
Journal Article
Time Series Classification with InceptionFCN
by Ibrokhimov, Bunyodbek; Baydadaev, Shokhrukh; Kwon, Jangwoo
in Algorithms, Archives & records, Benchmarking
2021
Deep neural networks (DNN) have proven to be efficient in computer vision and data classification with an increasing number of successful applications. Time series classification (TSC) has been one of the challenging problems in data mining in the last decade, and significant research has been proposed with various solutions, including algorithm-based approaches as well as machine and deep learning approaches. This paper focuses on combining the two well-known deep learning techniques, namely the Inception module and the Fully Convolutional Network. The proposed method proved to be more efficient than the previous state-of-the-art InceptionTime method. We tested our model on the univariate TSC benchmark (the UCR/UEA archive), which includes 85 time-series datasets, and proved that our network outperforms the InceptionTime in terms of the training time and overall accuracy on the UCR archive.
Journal Article
A Novel Deep Fully Convolutional Network for PolSAR Image Classification
by Chen, Yanqiao; Liu, Guangyuan; Jiao, Licheng
in Artificial intelligence, Artificial neural networks, Classification
2018
Polarimetric synthetic aperture radar (PolSAR) image classification has become increasingly popular in recent years. PolSAR image classification is a dense prediction problem, and the recently proposed fully convolutional network (FCN) model can be used to solve dense prediction problems, which means that FCN has great potential in PolSAR image classification. However, some problems remain to be solved when applying FCN to PolSAR image classification. Therefore, we propose sliding window fully convolutional network and sparse coding (SFCN-SC) for PolSAR image classification. The merit of our method is twofold: (1) compared with a convolutional neural network (CNN), SFCN-SC can avoid repeated calculation and memory occupation; (2) sparse coding is used to reduce the computation burden and memory occupation, while the image integrity is maintained to the maximum extent. We use three PolSAR images to test the performance of SFCN-SC. Compared with several state-of-the-art methods, SFCN-SC achieves promising results in PolSAR image classification.
Journal Article
Class-Wise Fully Convolutional Network for Semantic Segmentation of Remote Sensing Images
2021
Semantic segmentation is a fundamental task in remote sensing image interpretation, which aims to assign a semantic label for every pixel in the given image. Accurate semantic segmentation is still challenging due to the complex distributions of various ground objects. With the development of deep learning, a series of segmentation networks represented by fully convolutional network (FCN) has made remarkable progress on this problem, but the segmentation accuracy is still far from expectations. This paper focuses on the importance of class-specific features of different land cover objects, and presents a novel end-to-end class-wise processing framework for segmentation. The proposed class-wise FCN (C-FCN) is shaped in the form of an encoder-decoder structure with skip-connections, in which the encoder is shared to produce general features for all categories and the decoder is class-wise to process class-specific features. To be detailed, class-wise transition (CT), class-wise up-sampling (CU), class-wise supervision (CS), and class-wise classification (CC) modules are designed to achieve the class-wise transfer, recover the resolution of class-wise feature maps, bridge the encoder and modified decoder, and implement class-wise classifications, respectively. Class-wise and group convolutions are adopted in the architecture with regard to the control of parameter numbers. The method is tested on the public ISPRS 2D semantic labeling benchmark datasets. Experimental results show that the proposed C-FCN significantly improves the segmentation performances compared with many state-of-the-art FCN-based networks, revealing its potentials on accurate segmentation of complex remote sensing images.
Journal Article
Analysis of depth variation of U-NET architecture for brain tumor segmentation
by Jena, Biswajit; Nayak, Gopal Krishna; Saxena, Sanjay
in Brain cancer, Complexity, Computer Communication Networks
2023
U-NET is a fully convolutional network (FCN) architecture designed for the segmentation of biomedical images. The depth of the U-NET is one of the major constraints of this model when computing performance. A larger depth means that its computational complexity is higher as well. In certain cases, this large depth, as in the original model, is not justified for biomedical imaging modalities. In this paper, we analyze the depth variation of the U-NET architecture, i.e., the effect of removing different layers. For the analysis, the datasets BraTS-2017 and BraTS-2019, which consist of High-Grade Glioma (HGG) and Low-Grade Glioma (LGG) MR scans, have been used for tumor segmentation. We achieved a Dice coefficient of at least 0.8866 and as high as 0.8887 on the discovery cohort, and at least 0.8895 and as high as 0.8911 on the cross-validation replication cohort. The results show only minimal changes in the performance parameters when moving from the higher to the lower depth of the model. Hence, we show that the large depth of U-NET, which costs more in terms of computational complexity, is not always required. Moreover, U-NET models with reduced depth, and therefore lower computational complexity, can achieve nearly the same results as the original U-NET.
Journal Article
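For readers unfamiliar with the Dice coefficient reported in the abstract above, the following is a minimal sketch of the standard definition on binary masks, 2|A ∩ B| / (|A| + |B|). The function name and the toy masks are assumptions for illustration; the paper's own evaluation code is not shown in this listing.

```python
def dice_coefficient(pred, truth):
    """Dice coefficient of two flat binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return 2 * inter / total if total else 1.0


# Toy masks (assumption): each marks two pixels, one of which overlaps.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```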
Integration of Deep Q-Learning with a Grasp Quality Network for Robot Grasping in Cluttered Environments
2024
During the movement of a robotic arm, collisions can easily occur if the arm directly grasps at multiple tightly stacked objects, thereby leading to grasp failures or machine damage. Grasp success can be improved through the rearrangement or movement of objects to clear space for grasping. This paper presents a high-performance deep Q-learning framework that can help robotic arms to learn synchronized push and grasp tasks. In this framework, a grasp quality network is used for precisely identifying stable grasp positions on objects to expedite model convergence and solve the problem of sparse rewards caused during training because of grasp failures. Furthermore, a novel reward function is proposed for effectively evaluating whether a pushing action is effective. The proposed framework achieved grasp success rates of 92% and 89% in simulations and real-world experiments, respectively. Furthermore, only 200 training steps were required to achieve a grasp success rate of 80%, which indicates the suitability of the proposed framework for rapid deployment in industrial settings.
Journal Article
Deep Learning for Land Cover Change Detection
by Riese, Felix M.; Keller, Sina; Sefrin, Oliver
in artificial intelligence, Artificial neural networks, Change detection
2021
Land cover and its change are crucial for many environmental applications. This study focuses on the land cover classification and change detection with multitemporal and multispectral Sentinel-2 satellite data. To address the challenging land cover change detection task, we rely on two different deep learning architectures and selected pre-processing steps. For example, we define an excluded class and deal with temporal water shoreline changes in the pre-processing. We employ a fully convolutional neural network (FCN), and we combine the FCN with long short-term memory (LSTM) networks. The FCN can only handle monotemporal input data, while the FCN combined with LSTM can use sequential information (multitemporal). Besides, we provided fixed and variable sequences as training sequences for the combined FCN and LSTM approach. The former refers to using six defined satellite images, while the latter consists of image sequences from an extended training pool of ten images. Further, we propose measures for the robustness concerning the selection of Sentinel-2 image data as evaluation metrics. We can distinguish between actual land cover changes and misclassifications of the deep learning approaches with these metrics. According to the provided metrics, both multitemporal LSTM approaches outperform the monotemporal FCN approach by about 3 to 5 percentage points (p.p.). The LSTM approach trained on the variable sequences detects 3 p.p. more land cover changes than the LSTM approach trained on the fixed sequences. Besides, applying our selected pre-processing improves the water classification and avoids reducing the dataset effectively by 17.6%. The presented LSTM approaches can be modified to provide applicability for a variable number of image sequences since we published the code of the deep learning models. The Sentinel-2 data and the ground truth are also freely available.
Journal Article