Catalogue Search | MBRL

End-to-End Change Detection for High Resolution Satellite Images Using Improved UNet

by Zhang, Yongjun , Peng, Daifeng , Guan, Haiyan in Architectural engineering , Change detection , Coders

2019

Change detection (CD) is essential to the accurate understanding of land surface changes using available Earth observation data. Owing to its advantages in deep feature representation and nonlinear problem modeling, deep learning is becoming increasingly popular for solving CD tasks in the remote-sensing community. However, most existing deep learning-based CD methods either generate difference images from deep features or learn change relations between pixel patches, which leads to error accumulation because many intermediate processing steps are needed to obtain the final change maps. To address these issues, a novel end-to-end CD method is proposed based on UNet++, an effective encoder-decoder architecture for semantic segmentation, in which change maps can be learned from scratch using available annotated datasets. First, co-registered image pairs are concatenated as the input to the improved UNet++ network, where both global and fine-grained information can be used to generate feature maps with high spatial accuracy. Then, a fusion strategy over multiple side outputs is adopted to combine change maps from different semantic levels, thereby generating a final change map with high accuracy. The effectiveness and reliability of the proposed CD method are verified on very-high-resolution (VHR) satellite image datasets. Extensive experimental results show that the proposed approach outperforms other state-of-the-art CD methods.
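The two data-handling steps named in the abstract above (stacking a co-registered image pair into a single network input, then fusing change maps from several side outputs) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `concat_pair` and `fuse_side_outputs`, the nested-list image layout, and the fixed equal fusion weights are assumptions made for illustration, and in the paper the fusion is part of the trained network.

```python
def concat_pair(img_t1, img_t2):
    """Stack two co-registered H x W x C images along the channel axis."""
    if len(img_t1) != len(img_t2) or len(img_t1[0]) != len(img_t2[0]):
        raise ValueError("images must be co-registered (same height and width)")
    return [
        [px1 + px2 for px1, px2 in zip(row1, row2)]  # list concatenation: C + C channels
        for row1, row2 in zip(img_t1, img_t2)
    ]


def fuse_side_outputs(side_maps, weights):
    """Weighted average of per-pixel change probabilities from several side outputs."""
    total = sum(weights)
    height, width = len(side_maps[0]), len(side_maps[0][0])
    return [
        [
            sum(w * m[y][x] for w, m in zip(weights, side_maps)) / total
            for x in range(width)
        ]
        for y in range(height)
    ]


# Toy 1 x 2 RGB images at two dates (hypothetical values).
before = [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]]
after = [[[0.1, 0.2, 0.3], [0.9, 0.8, 0.7]]]
stacked = concat_pair(before, after)  # 1 x 2 pixels, 6 channels each

# Three hypothetical side-output change maps for the same 1 x 2 scene.
side_maps = [[[0.0, 0.8]], [[0.2, 0.9]], [[0.1, 1.0]]]
fused = fuse_side_outputs(side_maps, weights=[1.0, 1.0, 1.0])

# Threshold the fused probabilities to get a binary change mask.
change_mask = [[p > 0.5 for p in row] for row in fused]
```

Only the second pixel differs between the two dates, and it is the only one flagged in `change_mask`.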

Journal Article

Neural Machine Translation: A Review

by Stahlberg, Felix in Artificial intelligence , Coders , Encoders-Decoders

2020

The field of machine translation (MT), the automatic translation of written text from one natural language into another, has experienced a major paradigm shift in recent years. Statistical MT, which mainly relies on various count-based models and which used to dominate MT research for decades, has largely been superseded by neural machine translation (NMT), which tackles translation with a single neural network. In this work we will trace back the origins of modern NMT architectures to word and sentence embeddings and earlier examples of the encoder-decoder network family. We will conclude with a short survey of more recent trends in the field.

Journal Article

Automatic Crack Detection on Road Pavements Using Encoder-Decoder Architecture

by Wei, Jiahong , Di Mascio, Paola , Chen, Xiaopeng in Aircraft , Algorithms , Artificial intelligence

2020

Automatic crack detection from images is an important task for ensuring the safety and durability of Portland cement concrete (PCC) and asphalt concrete (AC) pavements. Pavement failure has a number of causes, including water intrusion, stress from heavy loads, and climate effects. Cracks are generally the first distress to appear on road surfaces, so proper monitoring and maintenance to prevent cracks from forming or spreading is important. Conventional algorithms for identifying cracks on road pavements are extremely time-consuming and costly. Many cracks show complicated topological structures, oil stains, poor continuity, and low contrast, which make crack features difficult to define. An automated crack detection algorithm is therefore a key tool for improving results. Inspired by the development of deep learning in computer vision and object detection, the proposed algorithm uses an encoder-decoder architecture with hierarchical feature learning and dilated convolution, named U-Hierarchical Dilated Network (U-HDN), to perform crack detection end to end. Crack characteristics with multiple context information are learned automatically. A multi-dilation module embedded in the encoder-decoder architecture is then proposed: crack features of multiple context sizes are integrated in this module through dilated convolutions with different dilation rates, which captures much more crack information. Finally, a hierarchical feature learning module is designed to obtain multi-scale features from the high- to low-level convolutional layers, which are integrated to produce pixel-wise crack predictions. Experiments were performed on public crack databases using 118 images, and the results were compared with those obtained by other methods on the same images. The results show that the proposed U-HDN method achieves high performance because, unlike other algorithms, it can extract and fuse different context sizes and different levels of feature maps.
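The idea behind a multi-dilation module, applying the same small kernel at several dilation rates so that each branch sees a different context size, can be sketched in one dimension. This is a simplified illustration and not the U-HDN code: the paper works on 2-D feature maps with learned kernels, whereas the function `dilated_conv1d`, the fixed kernel, and the rates 1, 2, and 4 below are assumptions chosen for the example.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D convolution (cross-correlation form) with `dilation` samples between taps."""
    span = (len(kernel) - 1) * dilation  # receptive field minus one
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


# Toy signal with two isolated "crack" responses (hypothetical values).
signal = [0, 0, 1, 0, 0, 0, 1, 0, 0]
kernel = [1, 1, 1]

# One branch per dilation rate; the receptive fields are 3, 5, and 9 samples.
multi_scale = {rate: dilated_conv1d(signal, kernel, rate) for rate in (1, 2, 4)}
receptive_fields = {rate: (len(kernel) - 1) * rate + 1 for rate in (1, 2, 4)}
```

The kernel has three taps in every branch, so the parameter count stays the same while the context covered by each output grows with the dilation rate.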

Journal Article

Streamflow modelling and forecasting for Canadian watersheds using LSTM networks with attention mechanism

by Girihagama, Lakshika , Roy, René , Naveed Khaliq, Muhammad in Accuracy , Artificial Intelligence , Atmospheric models

2022

This study investigates the capability of sequence-to-sequence machine learning (ML) architectures in an effort to develop streamflow forecasting tools for Canadian watersheds. Such tools are useful to inform local and region-specific water management and flood forecasting related activities. Two powerful deep-learning variants of the Recurrent Neural Network were investigated, namely the standard and attention-based encoder-decoder long short-term memory (LSTM) models. Both models were forced with past hydro-meteorological states and daily meteorological data with a look-back time window of several days. These models were tested for 10 different watersheds from the Ottawa River watershed, located within the Great Lakes Saint-Lawrence region of Canada, an economic powerhouse of the country. The results of training and testing phases suggest that both models are able to simulate overall hydrograph patterns well when compared to observational records. Between the two models, the attention model significantly outperforms the standard model in all watersheds, suggesting the importance and usefulness of the attention mechanism in ML architectures, not well explored for hydrological applications. The mean performance accuracy of the attention model on unseen data, when assessed in terms of mean Nash–Sutcliffe Efficiency and Kling-Gupta Efficiency is, respectively, found to be 0.985 and 0.954 for these watersheds. Streamflow forecasts with lead times of up to 5 days with the attention model demonstrate overall skillful performance with well above the benchmark accuracy of 70%. The results of the study suggest that the encoder–decoder LSTM, with attention mechanism, is a powerful modelling choice for developing streamflow forecasting systems for Canadian watersheds.
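The two skill scores quoted in the abstract above can be computed as in the following minimal sketch. The Nash–Sutcliffe Efficiency follows its standard definition; for the Kling-Gupta Efficiency the abstract does not say which variant was used, so the original three-component form (correlation, variability ratio, bias ratio) is assumed here. The function names and the sample flows are hypothetical.

```python
import math
from statistics import mean, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus squared error relative to the variance of the observations."""
    mu = mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mu) ** 2 for o in obs)
    return 1.0 - num / den


def kge(obs, sim):
    """Kling-Gupta Efficiency (assumed original form: r, alpha = sd ratio, beta = mean ratio)."""
    mu_o, mu_s = mean(obs), mean(sim)
    sd_o, sd_s = pstdev(obs), pstdev(sim)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / len(obs)
    r = cov / (sd_o * sd_s)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical daily streamflow values (m^3/s).
observed = [10.0, 12.0, 15.0, 11.0, 9.0]
simulated = [10.5, 11.5, 14.0, 11.5, 9.5]

print(f"NSE = {nse(observed, simulated):.3f}, KGE = {kge(observed, simulated):.3f}")
```

Both scores equal 1 for a perfect simulation. An NSE of 0 means the model is no better than predicting the observed mean.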

Journal Article

An Improved Encoder-Decoder Network Based on Strip Pool Method Applied to Segmentation of Farmland Vacancy Field

by Qin, Yilang , Zhang, Xixin , Ning, Xin in Agricultural land , Algorithms , Coders

2021

In the research of green vegetation coverage in the field of remote sensing image segmentation, crop planting area is often obtained by semantic segmentation of images taken from high altitude. This method can be used to obtain the rate of cultivated land in a region (such as a country), but it does not reflect the real situation of a particular farmland. Therefore, this paper takes low-altitude images of farmland to build a dataset. After comparing several mainstream semantic segmentation algorithms, a new method that is more suitable for farmland vacancy segmentation is proposed. Additionally, the Strip Pooling module (SPM) and the Mixed Pooling module (MPM), with strip pooling as their core, are designed and fused into the semantic segmentation network structure to better extract the vacancy features. Considering the high cost of manual data annotation, this paper uses an improved ResNet network as the backbone of signal transmission, and meanwhile uses data augmentation to improve the performance and robustness of the model. As a result, the accuracy of the proposed method in the test set is 95.6%, mIoU is 77.6%, and the error rate is 7%. Compared to the existing model, the mIoU value is improved by nearly 4%, reaching the level of practical application.
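The gap between pixel accuracy and mIoU reported in the abstract above is typical of segmentation metrics, and a small sketch shows why. The definitions below are the usual confusion-matrix ones and are an assumption, since the abstract does not spell out how its figures were computed. The function names and the toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels with true class t that were predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class matches the label."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN)."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(len(cm))) - tp
        fn = sum(cm[c]) - tp
        union = tp + fp + fn
        if union:  # skip classes absent from both labels and predictions
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Flattened toy masks: 0 = planted, 1 = vacancy (hypothetical labels).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=2)
acc = pixel_accuracy(cm)
miou = mean_iou(cm)
```

Each misclassified pixel counts once against accuracy but enters the IoU denominator of two classes, so mIoU is usually the lower and stricter figure.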

Journal Article

Magnetotelluric Time Series Denoising Using Encoder-Decoder Consisted of LSTM Cells

by Wang, Sihao , Li, Liang , He, Lanfang in Audio data , Data acquisition , Denoising

2023

Electromagnetic signals in geophysics are frequently disturbed by various kinds of interference during field data acquisition. Denoising data from passive electromagnetic methods such as magnetotelluric (MT) or audio magnetotelluric (AMT) is important for improving data quality and, ultimately, the imaging of geoelectrical structure. Conventionally, most denoising methods are applied in the frequency domain and few in the time domain. However, a great deal of irregular noise in electromagnetic time series proves difficult to remove. We propose a denoising method using an Encoder-Decoder consisting of Long Short-Term Memory cells (ED-LSTM) to reduce the effect of step noise and random-impulsive noise. Supervised learning and transductive learning are used to denoise the step noise and the random-impulsive noise, respectively. Our results indicate that step and random-impulsive noise can be successfully removed from the raw time series, and that ED-LSTM could potentially be widely used in electromagnetic time series denoising to improve data quality.

Journal Article

Flight trajectory prediction enabled by time-frequency wavelet transform

by Zhang, Zheng , Guo, Dongyue , Lin, Yi in 639/166/984 , 639/166/987 , Accuracy

2023

Accurate flight trajectory prediction is a crucial and challenging task in air traffic control, especially for maneuver operations. Modern data-driven methods are typically formulated as a time series forecasting task and fail to retain high accuracy. Meanwhile, as the primary modeling method for time series forecasting, frequency-domain analysis is underutilized in the flight trajectory prediction task. In this work, an innovative wavelet transform-based framework is proposed to perform time-frequency analysis of flight patterns to support trajectory forecasting. An encoder-decoder neural architecture is developed to estimate wavelet components, focusing on the effective modeling of global flight trends and local motion details. A real-world dataset is constructed to validate the proposed approach, and the experimental results demonstrate that the proposed framework exhibits higher accuracy than other comparative baselines, obtaining improved prediction performance in terms of four measurements, especially in the climb and descent phases with maneuver control. Most importantly, the time-frequency analysis is confirmed to be effective for the flight trajectory prediction task. Accurate flight trajectory prediction can be a challenging task in air traffic control, especially for maneuver operations. Here, the authors develop a time-frequency analysis based on an encoder-decoder neural architecture to estimate wavelet components and model global flight trends and local motion details.
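The split into "global trends" and "local details" that the abstract above attributes to wavelet components can be illustrated with a single-level Haar transform. This is only a sketch of the general technique: the abstract does not state which wavelet or how many decomposition levels the authors use, so the Haar basis, the function names, and the altitude samples below are assumptions.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar transform: (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from its two coefficient sets."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out


# Hypothetical altitude samples (m): level flight followed by the start of a climb.
altitude = [1000.0, 1000.0, 1010.0, 1030.0]

trend, motion_detail = haar_dwt(altitude)      # low-pass trend, high-pass local changes
reconstructed = haar_idwt(trend, motion_detail)
```

The detail coefficient is zero over the level segment and non-zero once the climb begins, while the approximation coefficients carry the overall altitude trend. A predictor that estimates both sets can recover the trajectory through the inverse transform.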

Journal Article

Sharp U-Net: Depthwise convolutional network for biomedical image segmentation

by Ben Hamza, A. , Zunair, Hasib in Coders , Computer architecture , Coronaviruses

2021

The U-Net architecture, built upon the fully convolutional network, has proven to be effective in biomedical image segmentation. However, U-Net applies skip connections to merge semantically different low- and high-level convolutional features, resulting in not only blurred feature maps, but also over- and under-segmented target regions. To address these limitations, we propose a simple, yet effective end-to-end depthwise encoder-decoder fully convolutional network architecture, called Sharp U-Net, for binary and multi-class biomedical image segmentation. The key rationale of Sharp U-Net is that instead of applying a plain skip connection, a depthwise convolution of the encoder feature map with a sharpening kernel filter is employed prior to merging the encoder and decoder features, thereby producing a sharpened intermediate feature map of the same size as the encoder map. Using this sharpening filter layer, we are able to not only fuse semantically less dissimilar features, but also to smooth out artifacts throughout the network layers during the early stages of training. Our extensive experiments on six datasets show that the proposed Sharp U-Net model consistently outperforms or matches the recent state-of-the-art baselines in both binary and multi-class segmentation tasks, while adding no extra learnable parameters. Furthermore, Sharp U-Net outperforms baselines that have more than three times the number of learnable parameters.
• We introduce a novel Sharp U-Net architecture by designing new connections between the encoder and decoder subnetworks using a depthwise convolution of the encoder feature maps with a sharpening spatial filter to address the semantic gap issue between the encoder and decoder features.
• We show that the Sharp U-Net architecture can be scaled for improved performance, outperforming baselines that have three times the number of learnable parameters.
• We demonstrate through extensive experiments the ability of the proposed model to learn efficient representations for both binary and multi-class segmentation tasks on a variety of medical images from different modalities.
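The sharpened skip connection described above amounts to filtering every encoder channel independently with a fixed kernel. The sketch below illustrates that operation under stated assumptions: the abstract does not give the kernel, so a commonly used 3 x 3 sharpening kernel whose entries sum to 1 is assumed, and replicate-edge padding is assumed to keep the output the same size as the input. The function names are hypothetical.

```python
# Assumed sharpening kernel (entries sum to 1, so flat regions pass through unchanged).
SHARPEN = [[-1, -1, -1],
           [-1,  9, -1],
           [-1, -1, -1]]


def sharpen_channel(channel, kernel=SHARPEN):
    """Filter one H x W channel with a fixed 3 x 3 kernel, using replicate-edge padding."""
    h, w = len(channel), len(channel[0])

    def at(y, x):  # clamp coordinates so border pixels reuse their nearest neighbour
        return channel[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [
        [
            sum(kernel[dy + 1][dx + 1] * at(y + dy, x + dx)
                for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            for x in range(w)
        ]
        for y in range(h)
    ]


def depthwise_sharpen(feature_map):
    """Apply the same fixed kernel to each channel separately (no cross-channel mixing)."""
    return [sharpen_channel(ch) for ch in feature_map]


# Two-channel toy encoder feature map: one flat channel, one with a vertical step edge.
flat = [[2.0] * 4 for _ in range(3)]
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
sharpened = depthwise_sharpen([flat, edge])
```

Because the kernel is fixed and not trained, the layer adds no learnable parameters, which matches the claim in the abstract. The flat channel is unchanged and the step edge is accentuated.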

Journal Article

Transformers in Time-Series Analysis: A Tutorial

by Rasool, Ghulam , Tripathi, Aakash , Siddiqui, Shamoon in Architecture , Best practice , Computer vision

2023

Transformer architectures have widespread applications, particularly in Natural Language Processing and Computer Vision. Recently, Transformers have been employed in various aspects of time-series analysis. This tutorial provides an overview of the Transformer architecture, its applications, and a collection of examples from recent research in time-series analysis. We delve into an explanation of the core components of the Transformer, including the self-attention mechanism, positional encoding, multi-head, and encoder/decoder. Several enhancements to the initial Transformer architecture are highlighted to tackle time-series tasks. The tutorial also provides best practices and techniques to overcome the challenge of effectively training Transformers for time-series analysis.
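Two of the core components listed in the abstract above, sinusoidal positional encoding and scaled dot-product self-attention, are compact enough to sketch directly. The version below is a teaching sketch and not code from the tutorial: it omits the learned query, key, and value projections and the multi-head split, and the function names and toy series are assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def positional_encoding(seq_len, d_model):
    """Sinusoidal encoding: sine on even dimensions, cosine on odd dimensions."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) where weights[i] is the attention of query i over all keys."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
        all_weights.append(weights)
    return outputs, all_weights


# Toy time series: 3 time steps already embedded into 4 dimensions (hypothetical values).
series = [[0.5, 0.1, 0.0, 0.2],
          [0.6, 0.0, 0.1, 0.2],
          [0.9, 0.3, 0.2, 0.1]]
pe = positional_encoding(seq_len=3, d_model=4)
tokens = [[x + p for x, p in zip(step, enc)] for step, enc in zip(series, pe)]

# Self-attention: the same tokens act as queries, keys, and values.
attended, weights = scaled_dot_product_attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1 and shows how strongly one time step attends to the others. Attention itself is order-agnostic, so the positional encoding is what tells the model where each step sits in the series.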

Journal Article

Research on user granularity-level personalized social text generation technology

by Gao, J T , Ma, R , Yang, L D in Coders , Customization , Decoding

2022

With the introduction of large-scale pre-trained language models, breakthroughs have been made in text generation research. On this basis, to assist users in personalized writing, this paper proposes a user-level, fine-grained controllable generation model. First, we design an Encoder-Decoder framework based on the GPT2 structure, and model and encode the user's static personalized information on the Encoder side. We then add a bidirectional independent attention module on the Decoder side to receive the personalized feature vector. The attention module in the original GPT2 structure captures the dynamic personalized features in the user's text, namely writing style, manner of expression, and so on. Next, the scores of the attention modules are weighted and fused, and take part in the subsequent decoding to automatically generate social texts constrained by the user's personalized feature attributes. However, the semantic sparsity of the user's basic information causes occasional conflicts between the generated text and some personalized features. Therefore, we use an Alignment module to perform a secondary enhancement that aligns the output of the Decoder with the user's personalized features, and finally realize personalized social text generation. Experiments show that, compared with the GPT2 baseline model, the fluency of the model is improved by 0.3%-0.6%, and, with no loss of language fluency, the social text generated by the model has significant user personalization characteristics: the two evaluation indicators, personalization and consistency, increased significantly by 8.4% and 9%, respectively.

Journal Article
