14 result(s) for "Xia, Runliang"
SSCNet: A Spectrum-Space Collaborative Network for Semantic Segmentation of Remote Sensing Images
Semantic segmentation plays a pivotal role in the intelligent interpretation of remote sensing images (RSIs). However, conventional methods predominantly focus on learning representations within the spatial domain, often resulting in suboptimal discriminative capabilities. Given the intrinsic spectral characteristics of RSIs, it becomes imperative to enhance the discriminative potential of these representations by integrating spectral context alongside spatial information. In this paper, we introduce the spectrum-space collaborative network (SSCNet), which is designed to capture both spectral and spatial dependencies, thereby elevating the quality of semantic segmentation in RSIs. Our innovative approach features a joint spectral–spatial attention module (JSSA) that concurrently employs spectral attention (SpeA) and spatial attention (SpaA). Instead of feature-level aggregation, we propose the fusion of attention maps to gather spectral and spatial contexts from their respective branches. Within SpeA, we calculate the position-wise spectral similarity using the complex spectral Euclidean distance (CSED) of the real and imaginary components of projected feature maps in the frequency domain. To comprehensively calculate both spectral and spatial losses, we introduce edge loss, Dice loss, and cross-entropy loss, subsequently merging them with appropriate weighting. Extensive experiments on the ISPRS Potsdam and LoveDA datasets underscore SSCNet’s superior performance compared with several state-of-the-art methods. Furthermore, an ablation study confirms the efficacy of SpeA.
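The CSED measure in SpeA can be illustrated with a minimal numpy sketch, assuming a plain FFT with no learned projection (the paper projects the feature maps first, which is omitted here): the distance is taken jointly over the real and imaginary components in the frequency domain.

```python
import numpy as np

def csed(feat_a, feat_b):
    """Complex spectral Euclidean distance (CSED) between two feature
    vectors: move to the frequency domain, then take the Euclidean
    distance over real and imaginary components jointly. Sketch only;
    the paper's learned projection and normalization are omitted."""
    fa = np.fft.fft(feat_a)
    fb = np.fft.fft(feat_b)
    return np.sqrt(np.sum((fa.real - fb.real) ** 2 + (fa.imag - fb.imag) ** 2))

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.0, 3.0, 5.0])
d_same = csed(x, x)  # identical spectra -> zero distance
d_diff = csed(x, y)  # differs in one position -> positive distance
```

By Parseval's theorem this quantity equals sqrt(N) times the spatial-domain Euclidean distance, so `d_diff` above works out to sqrt(4)·1 = 2.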
A Spatial–Frequency Combined Transformer for Cloud Removal of Optical Remote Sensing Images
Cloud removal is a vital preprocessing step in optical remote sensing images (RSIs), directly enhancing image quality and providing a high-quality data foundation for downstream tasks, such as water body extraction and land cover classification. Existing methods attempt to combine spatial and frequency features for cloud removal, but they rely on shallow feature concatenation or simplistic addition operations, which fail to establish effective cross-domain synergistic mechanisms. These approaches lead to edge blurring and noticeable color distortions. To address this issue, we propose a spatial–frequency collaborative enhancement Transformer network named SFCRFormer, which significantly improves cloud removal performance. The core of SFCRFormer is the spatial–frequency combined Transformer (SFCT) block, which implements cross-domain feature reinforcement through a dual-branch spatial attention (DBSA) module and frequency self-attention (FreSA) module to effectively capture global context information. The DBSA module enhances the representation of spatial features by decoupling spatial-channel dependencies via parallelized feature refinement paths, surpassing the performance of traditional single-branch attention mechanisms in maintaining the overall structure of the image. FreSA leverages fast Fourier transform to convert features into the frequency domain, using frequency differences between object and cloud regions to achieve precise cloud detection and fine-grained removal. In order to further enhance the features extracted by DBSA and FreSA, we design the dual-domain feed-forward network (DDFFN), which effectively improves the detail fidelity of the restored image by multi-scale convolution for local refinement and frequency transformation for global structural optimization. 
A composite loss function, incorporating Charbonnier loss and Structural Similarity Index (SSIM) loss, is employed to optimize model training and balance pixel-level accuracy with structural fidelity. Experimental evaluations on public datasets demonstrate that SFCRFormer outperforms state-of-the-art methods across various quantitative metrics, including PSNR and SSIM, while delivering superior visual results.
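The composite loss described above can be sketched as follows; `global_ssim` uses a single global window rather than the sliding Gaussian window of standard SSIM, and the weight `alpha` is an illustrative assumption, not the paper's setting.

```python
import numpy as np

def charbonnier(pred, target, eps=1e-3):
    """Charbonnier loss: a smooth, differentiable variant of L1."""
    return np.mean(np.sqrt((pred - target) ** 2 + eps ** 2))

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over one global window (standard SSIM slides a Gaussian
    window; this simplification keeps the sketch short)."""
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (x.var() + y.var() + c2)
    return num / den

def composite_loss(pred, target, alpha=0.84):
    """Weighted mix of Charbonnier and (1 - SSIM)."""
    return alpha * charbonnier(pred, target) + (1 - alpha) * (1 - global_ssim(pred, target))

img = np.random.default_rng(0).random((32, 32))
loss_perfect = composite_loss(img, img)  # near zero for a perfect restoration
```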
GPRNet: A Geometric Prior-Refined Semantic Segmentation Network for Land Use and Land Cover Mapping
What are the main findings?
* We propose GPRNet, a geometry-aware semantic segmentation framework that integrates a Geometric Prior-Refined Block (GPRB) and a Mutual Calibrated Fusion Module (MCFM) to enhance boundary sensitivity and cross-stage semantic consistency.
* GPRB leverages learnable directional derivatives to construct structure-aware strength and orientation maps, enabling more accurate spatial localization in complex scenes.
* MCFM introduces geometric alignment and semantic enhancement mechanisms that effectively reduce the encoder–decoder feature gap.
* GPRNet achieves consistent performance gains on ISPRS Potsdam and LoveDA, improving mIoU by up to 1.7% and 1.3%, respectively, over strong CNN-, attention-, and transformer-based baselines.
What are the implications of the main findings?
* Incorporating geometric priors through learnable gradient-based features improves the model's ability to capture structural patterns and preserve fine boundaries in high-resolution remote sensing imagery.
* The mutual calibration mechanism demonstrates an effective design for encoder–decoder interaction, showing potential for broader applicability across segmentation architectures and modalities.
* The empirical evidence indicates that geometry-informed representation learning can serve as a general principle for enhancing land-cover mapping in diverse and structurally complex environments.
Semantic segmentation of high-resolution remote sensing images remains a challenging task due to the intricate spatial structures, scale variability, and semantic ambiguity among ground objects. Moreover, the reliable delineation of fine-grained boundaries continues to challenge existing CNN- and transformer-based models, particularly in heterogeneous urban and rural environments. In this study, we propose GPRNet, a novel geometry-aware segmentation framework that leverages geometric priors and cross-stage semantic alignment for more precise land-cover classification. Central to our approach is the Geometric Prior-Refined Block (GPRB), which learns directional derivative filters, initialized with Sobel-like operators, to generate edge-aware strength and orientation maps that explicitly encode structural cues. These maps guide structure-aware attention modulation, enabling refined spatial localization. Additionally, we introduce the Mutual Calibrated Fusion Module (MCFM) to mitigate the semantic gap between encoder and decoder features by incorporating cross-stage geometric alignment and semantic enhancement mechanisms.
Extensive experiments conducted on the ISPRS Potsdam and LoveDA datasets validate the effectiveness of the proposed method, with GPRNet achieving improvements of up to 1.7% mIoU on Potsdam and 1.3% mIoU on LoveDA over strong recent baselines. Furthermore, the model maintains competitive inference efficiency, suggesting a favorable balance between accuracy and computational cost. These results demonstrate the promising potential of geometric-prior integration and mutual calibration in advancing semantic segmentation in complex environments.
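The strength and orientation maps produced by GPRB can be illustrated with fixed Sobel kernels; in GPRNet the directional derivative filters are learnable and merely initialized this way, so the sketch below shows only the initialization behaviour.

```python
import numpy as np

SOBEL_X = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
SOBEL_Y = SOBEL_X.T

def conv2d_valid(img, kernel):
    """Naive 'valid' 2-D correlation, enough for this illustration."""
    h, w = img.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def strength_orientation(img):
    """Edge-strength and orientation maps from directional derivatives."""
    gx = conv2d_valid(img, SOBEL_X)
    gy = conv2d_valid(img, SOBEL_Y)
    return np.hypot(gx, gy), np.arctan2(gy, gx)

# A vertical step edge: strong strength at the edge, orientation 0 (x-axis).
img = np.zeros((8, 8))
img[:, 4:] = 1.0
strength, orientation = strength_orientation(img)
```

Flat regions yield zero strength, while the step edge responds with the kernel's full weight and an orientation aligned with the x axis.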
Attentively Learning Edge Distributions for Semantic Segmentation of Remote Sensing Imagery
Semantic segmentation has been a fundamental task in interpreting remote sensing imagery (RSI) for various downstream applications. Due to high intra-class variance and inter-class similarity, inflexibly transferring networks designed for natural images to RSI is inadvisable. To enhance the distinguishability of learnt representations, attention modules have been developed and applied to RSI, yielding satisfactory improvements. However, these designs capture contextual information by treating all pixels equally, regardless of whether they lie near edges. Consequently, blurry boundaries are generated, raising high uncertainty in classifying the many adjacent pixels. We therefore propose an edge distribution attention module (EDA) that highlights the edge distributions of learnt feature maps in a self-attentive fashion. In this module, we first formulate and model column-wise and row-wise edge attention maps based on covariance matrix analysis. Furthermore, a hybrid attention module (HAM) that emphasizes both edge distributions and position-wise dependencies is devised by combining EDA with a non-local block. Consequently, an end-to-end neural network, termed EDENet, is proposed that integrates HAM hierarchically to strengthen multi-level representations in detail. EDENet implicitly learns representative and discriminative features, providing available and reasonable cues for dense prediction. Experimental results on the ISPRS Vaihingen, Potsdam and DeepGlobe datasets show its efficacy and superiority over state-of-the-art methods in overall accuracy (OA) and mean intersection over union (mIoU). In addition, an ablation study further validates the effects of EDA.
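A rough sketch of the covariance-based row-wise and column-wise attention maps that EDA formulates; the paper's exact formulation (and any learned projections) may differ, so this only conveys the idea that rows or columns whose intensity profiles co-vary strongly attend to each other.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def edge_attention_maps(feat):
    """Row-wise and column-wise attention maps built from covariance
    matrices of a 2-D feature map."""
    r = feat - feat.mean(axis=1, keepdims=True)
    row_cov = r @ r.T / feat.shape[1]          # H x H row covariance
    c = feat - feat.mean(axis=0, keepdims=True)
    col_cov = c.T @ c / feat.shape[0]          # W x W column covariance
    return softmax(row_cov), softmax(col_cov)

feat = np.random.default_rng(1).random((6, 8))
row_att, col_att = edge_attention_maps(feat)   # (6, 6) and (8, 8)
```

Each attention row is a probability distribution, so it can reweight the corresponding rows or columns of the feature map directly.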
Hybridizing Cross-Level Contextual and Attentive Representations for Remote Sensing Imagery Semantic Segmentation
Semantic segmentation of remote sensing imagery is a fundamental task in intelligent interpretation. Since deep convolutional neural networks (DCNNs) have shown considerable ability to learn implicit representations from data, numerous recent works have transferred DCNN-based models to remote sensing data analysis. However, wide-range observation areas, complex and diverse objects, and varying illumination and imaging angles make pixels easily confused, leading to undesirable results. Therefore, a remote sensing imagery semantic segmentation network, named HCANet, is proposed to generate representative and discriminative representations for dense prediction. HCANet hybridizes cross-level contextual and attentive representations to emphasize the distinguishability of learned features. First, a cross-level contextual representation module (CCRM) is devised to exploit and harness superpixel contextual information. Moreover, a hybrid representation enhancement module (HREM) is designed to flexibly fuse cross-level contextual and self-attentive representations. Furthermore, the decoder incorporates a DUpsampling operation to boost efficiency losslessly. Extensive experiments are conducted on the Vaihingen and Potsdam benchmarks. The results indicate that HCANet achieves excellent performance in overall accuracy and mean intersection over union, and an ablation study further verifies the contribution of CCRM.
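The DUpsampling operation mentioned above recovers resolution by a learned linear projection followed by a lossless depth-to-space rearrangement; the sketch below shows only the rearrangement half, so the projection weights are not modelled.

```python
import numpy as np

def depth_to_space(x, r):
    """Rearrange a (C*r*r, H, W) tensor to (C, H*r, W*r) without losing
    information; only the rearrangement step of DUpsampling."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

x = np.arange(16.0).reshape(4, 2, 2)  # C=1, r=2: 4 channels of a 2x2 map
y = depth_to_space(x, 2)              # one 4x4 map, every value preserved
```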
Encoding Contextual Information by Interlacing Transformer and Convolution for Remote Sensing Imagery Semantic Segmentation
Contextual information plays a pivotal role in the semantic segmentation of remote sensing imagery (RSI) due to imbalanced distributions and ubiquitous intra-class variance. The emergence of the transformer has revolutionized vision tasks with its impressive scalability in establishing long-range dependencies. However, local patterns, such as inherent structures and spatial details, are broken by the tokenization of the transformer. Therefore, ICTNet is devised to confront these deficiencies. ICTNet inherits the encoder–decoder architecture. In the encoder stage, Swin Transformer blocks (STBs) and convolution blocks (CBs) are deployed in an interlaced fashion, accompanied by encoded feature aggregation modules (EFAs). This design allows the network to learn local patterns, distant dependencies, and their interactions simultaneously. Moreover, multiple DUpsampling operations (DUPs) followed by decoded feature aggregation modules (DFAs) form the decoder of ICTNet, shrinking the transformation and upsampling losses while recovering features. Together, the devised encoder and decoder capture well-rounded context that contributes most to the inference. Extensive experiments are conducted on the ISPRS Vaihingen, Potsdam and DeepGlobe benchmarks. Quantitative and qualitative evaluations exhibit the competitive performance of ICTNet compared with mainstream and state-of-the-art methods. Additionally, an ablation study of DFA and DUP validates their effects.
CCAM: China Catchment Attributes and Meteorology dataset
The absence of a compiled large-scale catchment characteristics dataset is a key obstacle limiting the development of large-sample hydrology research in China. We introduce the first large-scale catchment attribute dataset in China. We compiled diverse data sources, including soil, land cover, climate, topography, and geology, to develop the dataset. The dataset also includes catchment-scale 31-year meteorological time series from 1990 to 2020 for each basin. Potential evapotranspiration time series based on Penman's equation are derived for each basin. The 4911 catchments included in the dataset cover all of China. We introduced several new indicators that describe the catchment geography and the underlying surface differently from previously proposed datasets. The resulting dataset has a total of 125 catchment attributes and includes a separate HydroMLYR (hydrology dataset for machine learning in the Yellow River Basin) dataset containing standardized weekly averaged streamflow for 102 basins in the Yellow River Basin. The standardized streamflow data should be able to support machine learning hydrology research in the Yellow River Basin. The dataset is freely available at https://doi.org/10.5281/zenodo.5729444 (Zhen et al., 2021). In addition, the accompanying code used to generate the dataset is freely available at https://github.com/haozhen315/CCAM-China-Catchment-Attributes-and-Meteorology-dataset (last access: 26 November 2021) and supports the generation of catchment characteristics for any custom basin boundaries. Compiled data for the 4911 basins covering all of China and the open-source code should be able to support the study of any selected basins rather than being limited to only a few basins.
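A likely form of the dataset's "standardized" weekly streamflow is a per-basin z-score; this is an assumption for illustration, since the abstract does not spell out the normalization scheme.

```python
import numpy as np

def standardize(series):
    """Per-basin z-score standardization of a streamflow series
    (assumed normalization; the dataset's exact scheme may differ)."""
    s = np.asarray(series, dtype=float)
    return (s - s.mean()) / s.std()

weekly_flow = np.array([120.0, 95.0, 80.0, 150.0, 110.0])  # made-up values
z = standardize(weekly_flow)  # zero mean, unit standard deviation
```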
Study on the changes in vegetation structural coverage and its response mechanism to hydrology
Vegetation is an important factor affecting the hydrological processes of a watershed. In recent years, vegetation in the hilly and gully regions of the Loess Plateau has undergone significant changes, which have greatly altered the relationships between rainfall, runoff and sediment in the region. A single vegetation cover index cannot represent the important effect of vegetation grade on the effectiveness of soil and water conservation, so it is of great scientific significance to study how changes in vegetation structure in the hilly and gully area influence the hydrological processes of a watershed. In this article, a typical watershed in the loess hilly and gully area is taken as the research object, and field sampling experiments combined with remote sensing inversion are used to establish a vegetation index remote sensing model reflecting both vegetation canopy cover and litter. The impact of changes in vegetation structure on hydrological processes is then quantitatively assessed. The results show that the greater the annual precipitation in the basin, the more sensitive the runoff coefficient is to changes in the structural vegetation index, while the greater the rainfall intensity, the weaker the sensitivity of the sediment yield coefficient to such changes. Retrieving the underlying surface vegetation from remote sensing data still suffers from scale effects, and field observations at different scales using remote sensing data with higher spectral resolution are needed to improve the applicability of this method in a wider range of watersheds.
Attributing trend in naturalized streamflow to temporally explicit vegetation change and climate variation in the Yellow River basin of China
The naturalized streamflow, i.e., streamflow without water management effects, in the Yellow River basin (YRB) decreased significantly at a rate of −3.71×10⁸ m³ yr⁻¹ during 1982–2018, although annual precipitation showed an insignificant positive trend. Explicit detection and attribution of naturalized streamflow changes are critical for managing limited water resources for the sustainable development of ecosystems and socio-economic systems. The effects of temporally explicit changes in climate variables and underlying surfaces on the streamflow trend were assessed using the variable infiltration capacity (VIC) model prescribed with continuously dynamic leaf area index (LAI) and land cover. The results show a sharply increasing LAI trend and land use change in the form of conversion of cropland to forest and grassland in the basin. The decrease in naturalized streamflow can primarily be attributed to vegetation changes, including an interannual LAI increase and a change in the intra-annual LAI temporal pattern, which account for streamflow reductions of 1.99×10⁸ and 0.45×10⁸ m³ yr⁻¹, respectively. The impacts of the LAI change are largest in the Longmen–Huayuankou subregion, where the LAI increasing trend is high and land use change is substantial. Attribution based on simulations with multiyear average LAI markedly underestimates the impacts of the interannual and intra-annual LAI changes on the natural streamflow trend. Overall, the effect of climate variation on streamflow is slight because the positive effects of precipitation and wind speed changes were offset by the negative effect of increasing temperature. Although climate variation is usually decisive for streamflow change, this study suggests that change in underlying surfaces has imposed a substantial trend on naturalized streamflow. This study improves the understanding of the spatiotemporal patterns and underlying mechanisms of natural streamflow reduction across the YRB between 1982 and 2018.
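A trend rate like the one reported is the kind of quantity obtained from a least-squares linear fit of annual streamflow against year. The sketch below recovers the slope from a synthetic series constructed to decline at the reported rate; the values are made up, and the study's actual trend-estimation method is not stated in the abstract.

```python
import numpy as np

def annual_trend(years, values):
    """Least-squares linear trend: slope in (value units) per year."""
    slope, _intercept = np.polyfit(years, values, 1)
    return slope

# Synthetic series declining by 3.71e8 m^3 each year, mimicking the
# magnitude of the reported trend (illustrative numbers only).
years = np.arange(1982, 2019)
flow = 5.0e10 - 3.71e8 * (years - 1982)
trend = annual_trend(years, flow)  # recovers roughly -3.71e8
```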
A Parallel Computation and Web Visualization Framework for Rapid Large-scale Flood Risk Mapping
For rapid flood risk mapping, a key aspect is transferring the results of flood simulations to web visualization. The challenges here include: (1) large-scale and complicated modelling; (2) the need for fast computation to solve flood numerical models; and (3) effective web visualization. This paper tackles these challenges by introducing a web-based framework that transfers the results of a parallel flood simulation to a web visualization application. Flood depth, velocity and flood arrival time are calculated using a parallel shallow water model with a predicted input flow, and are automatically converted into shapefile data using ArcObjects components. Other data relating to the flood risk mapping are managed and served through a REST (Representational State Transfer) interface. The web visualization is performed using the ArcGIS API for JavaScript. We applied the framework to the floodplain of the lower Yellow River in China. The results show basic geo-information about the domain, the flood depth classified by color, and economic loss through inundation of villages and farmland. Although the simulated domain was large and the boundary conditions complicated, the whole process from flood risk simulation to web visualization took less than 3 hours, leaving enough time to increase flood preparedness.
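Classifying simulated flood depth into color bins for web display can be sketched as below; the bin edges and color names are illustrative assumptions, not those used in the paper.

```python
def classify_depth(depth_m):
    """Map a simulated flood depth (metres) to a display color class.
    Bin edges and colors are illustrative, not taken from the paper."""
    bins = [(0.5, "light blue"), (1.5, "blue"), (3.0, "dark blue")]
    for edge, color in bins:
        if depth_m < edge:
            return color
    return "purple"

shallow = classify_depth(0.2)  # "light blue"
deep = classify_depth(5.0)     # "purple"
```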