7,225 result(s) for "Encoders-Decoders"
End-to-End Change Detection for High Resolution Satellite Images Using Improved UNet++
Change detection (CD) is essential to the accurate understanding of land surface changes using available Earth observation data. Due to the great advantages in deep feature representation and nonlinear problem modeling, deep learning is becoming increasingly popular to solve CD tasks in the remote-sensing community. However, most existing deep learning-based CD methods are implemented by either generating difference images using deep features or learning change relations between pixel patches, which leads to error accumulation problems since many intermediate processing steps are needed to obtain final change maps. To address the above-mentioned issues, a novel end-to-end CD method is proposed based on an effective encoder-decoder architecture for semantic segmentation named UNet++, where change maps could be learned from scratch using available annotated datasets. Firstly, co-registered image pairs are concatenated as an input for the improved UNet++ network, where both global and fine-grained information can be utilized to generate feature maps with high spatial accuracy. Then, the fusion strategy of multiple side outputs is adopted to combine change maps from different semantic levels, thereby generating a final change map with high accuracy. The effectiveness and reliability of our proposed CD method are verified on very-high-resolution (VHR) satellite image datasets. Extensive experimental results have shown that our proposed approach outperforms the other state-of-the-art CD methods.
Neural Machine Translation: A Review
The field of machine translation (MT), the automatic translation of written text from one natural language into another, has experienced a major paradigm shift in recent years. Statistical MT, which mainly relies on various count-based models and which used to dominate MT research for decades, has largely been superseded by neural machine translation (NMT), which tackles translation with a single neural network. In this work we will trace back the origins of modern NMT architectures to word and sentence embeddings and earlier examples of the encoder-decoder network family. We will conclude with a short survey of more recent trends in the field.
Automatic Crack Detection on Road Pavements Using Encoder-Decoder Architecture
Automatic crack detection from images is an important task that is adopted to ensure road safety and durability for Portland cement concrete (PCC) and asphalt concrete (AC) pavement. Pavement failure depends on a number of causes including water intrusion, stress from heavy loads, and all the climate effects. Generally, cracks are the first distress that arises on road surfaces, and proper monitoring and maintenance to prevent cracks from spreading or forming is important. Conventional algorithms to identify cracks on road pavements are extremely time-consuming and costly. Many cracks show complicated topological structures, oil stains, poor continuity, and low contrast, which make crack features difficult to define. Therefore, the automated crack detection algorithm is a key tool to improve the results. Inspired by the development of deep learning in computer vision and object detection, the proposed algorithm considers an encoder-decoder architecture with hierarchical feature learning and dilated convolution, named U-Hierarchical Dilated Network (U-HDN), to perform crack detection in an end-to-end manner. Crack characteristics with multi-context information are learned automatically, enabling end-to-end crack detection. Then, a multi-dilation module embedded in an encoder-decoder architecture is proposed. The crack features of multiple context sizes can be integrated into the multi-dilation module by dilated convolution with different dilation rates, which captures much more crack information. Finally, the hierarchical feature learning module is designed to obtain multi-scale features from the high- to low-level convolutional layers, which are integrated for pixel-wise crack prediction. Some experiments on public crack databases using 118 images were performed and the results were compared with those obtained with other methods on the same images. The results show that the proposed U-HDN method achieves high performance because, unlike other algorithms, it can extract and fuse feature maps of different context sizes and different levels.
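To illustrate the dilated convolution mentioned in this abstract, here is a minimal one-dimensional sketch in plain Python. It is not the U-HDN implementation: the function name `dilated_conv1d`, the kernel, and the sample signal are hypothetical, and the real network operates on 2D feature maps with learned weights. The sketch only shows how a larger dilation rate widens the context a fixed-size kernel sees.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D convolution with a dilation rate.

    Kernel taps are spaced `dilation` samples apart, so a kernel of
    size k covers (k - 1) * dilation + 1 input samples.
    """
    span = (len(kernel) - 1) * dilation  # distance from first to last tap
    out = []
    for start in range(len(signal) - span):
        acc = 0.0
        for j, w in enumerate(kernel):
            acc += w * signal[start + j * dilation]
        out.append(acc)
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical input row
kernel = [1.0, 1.0, 1.0]             # hypothetical (unlearned) weights

narrow = dilated_conv1d(signal, kernel, dilation=1)  # each output sees 3 samples
wide = dilated_conv1d(signal, kernel, dilation=2)    # each output sees 5 samples
```

Running several such convolutions with different rates over the same input and combining their outputs is one plausible reading of a "multi-dilation module"; the abstract does not specify how the branches are combined.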
Streamflow modelling and forecasting for Canadian watersheds using LSTM networks with attention mechanism
This study investigates the capability of sequence-to-sequence machine learning (ML) architectures in an effort to develop streamflow forecasting tools for Canadian watersheds. Such tools are useful to inform local and region-specific water management and flood forecasting related activities. Two powerful deep-learning variants of the Recurrent Neural Network were investigated, namely the standard and attention-based encoder-decoder long short-term memory (LSTM) models. Both models were forced with past hydro-meteorological states and daily meteorological data with a look-back time window of several days. These models were tested for 10 different watersheds from the Ottawa River watershed, located within the Great Lakes Saint-Lawrence region of Canada, an economic powerhouse of the country. The results of training and testing phases suggest that both models are able to simulate overall hydrograph patterns well when compared to observational records. Between the two models, the attention model significantly outperforms the standard model in all watersheds, suggesting the importance and usefulness of the attention mechanism in ML architectures, not well explored for hydrological applications. The mean performance accuracy of the attention model on unseen data, when assessed in terms of mean Nash–Sutcliffe Efficiency and Kling-Gupta Efficiency is, respectively, found to be 0.985 and 0.954 for these watersheds. Streamflow forecasts with lead times of up to 5 days with the attention model demonstrate overall skillful performance with well above the benchmark accuracy of 70%. The results of the study suggest that the encoder–decoder LSTM, with attention mechanism, is a powerful modelling choice for developing streamflow forecasting systems for Canadian watersheds.
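The two skill scores reported in this abstract can be sketched as follows. The abstract does not say which formulation of the Kling-Gupta Efficiency was used, so this sketch assumes the commonly cited form based on correlation, variability ratio, and bias ratio; the function names and sample values are hypothetical.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the
    skill of always predicting the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def kge(obs, sim):
    """Kling-Gupta Efficiency (assumed form):
    1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


observed = [1.0, 2.0, 3.0, 4.0]     # hypothetical daily flows
mean_only = [2.5, 2.5, 2.5, 2.5]    # forecast that always predicts the mean

nse_perfect = nse(observed, observed)
nse_mean = nse(observed, mean_only)
kge_perfect = kge(observed, observed)
```

Both scores have an upper bound of 1, which is why values such as 0.985 and 0.954 indicate a close match to observations.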
A comprehensive survey on image captioning: from handcrafted to deep learning-based techniques, a taxonomy and open research issues
Image captioning is a relatively recent area at the convergence of computer vision and natural language processing and is widely used in a range of applications such as multi-modal search, robotics, security, remote sensing, medicine, and visual aid. The image captioning techniques have witnessed a paradigm shift from classical machine-learning-based approaches to the most contemporary deep learning-based techniques. We present an in-depth investigation of image captioning methodologies in this survey using our proposed taxonomy. Furthermore, the study investigates several eras of image captioning advancements, including template-based, retrieval-based, and encoder-decoder-based models. We also explore captioning in languages other than English. A thorough investigation of benchmark image captioning datasets and assessment measures is also discussed. The effectiveness of real-time image captioning is a severe barrier that prevents its use in sensitive applications such as visual aid, security, and medicine. Another observation from our research is the scarcity of personalized domain datasets that limits its adoption into more advanced issues. Despite influential contributions from several academics, further efforts are required to construct substantially robust and reliable image captioning models.
Flight trajectory prediction enabled by time-frequency wavelet transform
Accurate flight trajectory prediction is a crucial and challenging task in air traffic control, especially for maneuver operations. Modern data-driven methods are typically formulated as a time series forecasting task and fail to retain high accuracy. Meanwhile, as the primary modeling method for time series forecasting, frequency-domain analysis is underutilized in the flight trajectory prediction task. In this work, an innovative wavelet transform-based framework is proposed to perform time-frequency analysis of flight patterns to support trajectory forecasting. An encoder-decoder neural architecture is developed to estimate wavelet components, focusing on the effective modeling of global flight trends and local motion details. A real-world dataset is constructed to validate the proposed approach, and the experimental results demonstrate that the proposed framework exhibits higher accuracy than other comparative baselines, obtaining improved prediction performance in terms of four measurements, especially in the climb and descent phase with maneuver control. Most importantly, the time-frequency analysis is confirmed to be effective for the flight trajectory prediction task. Accurate flight trajectory prediction can be a challenging task in air traffic control, especially for maneuver operations. Here, the authors develop a time-frequency analysis based on an encoder-decoder neural architecture to estimate wavelet components and model global flight trends and local motion details.
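As a rough illustration of how a wavelet transform separates a trend from local detail, the sketch below applies one level of the Haar transform in plain Python. The abstract does not name the wavelet family used, so Haar is an assumption chosen for simplicity, and the function names and altitude samples are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail): scaled pairwise averages capture
    the trend, scaled pairwise differences capture local changes.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original signal from one level of Haar coefficients."""
    scale = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / scale)
        out.append((a - d) / scale)
    return out


altitudes = [4.0, 4.0, 2.0, 0.0]  # hypothetical altitude samples
approx, detail = haar_step(altitudes)
rebuilt = haar_inverse(approx, detail)
```

The first pair is flat, so its detail coefficient is zero, while the descending second pair produces a non-zero detail. A model that predicts such coefficients can be inverted back to a trajectory in the same way `haar_inverse` does here.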
CED-Net: Crops and Weeds Segmentation for Smart Farming Using a Small Cascaded Encoder-Decoder Architecture
Convolutional neural networks (CNNs) have achieved state-of-the-art performance in numerous aspects of human life and the agricultural sector is no exception. One of the main objectives of deep learning for smart farming is to identify the precise location of weeds and crops on farmland. In this paper, we propose a semantic segmentation method based on a cascaded encoder-decoder network, namely CED-Net, to differentiate weeds from crops. The existing architectures for weeds and crops segmentation are quite deep, with millions of parameters that require longer training time. To overcome such limitations, we propose an idea of training small networks in cascade to obtain coarse-to-fine predictions, which are then combined to produce the final results. Evaluation of the proposed network and comparison with other state-of-the-art networks are conducted using four publicly available datasets: rice seeding and weed dataset, BoniRob dataset, carrot crop vs. weed dataset, and a paddy–millet dataset. The experimental results and their comparisons proclaim that the proposed network outperforms state-of-the-art architectures, such as U-Net, SegNet, FCN-8s, and DeepLabv3, over intersection over union (IoU), F1-score, sensitivity, true detection rate, and average precision comparison metrics by utilizing only (1/5.74 × U-Net), (1/5.77 × SegNet), (1/3.04 × FCN-8s), and (1/3.24 × DeepLabv3) fractions of total parameters.
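Two of the comparison metrics named in this abstract, intersection over union and F1-score, can be computed from binary masks as sketched below. This is a generic illustration on flattened masks, not the paper's evaluation code; the function name and the sample masks are hypothetical.

```python
def iou_and_f1(pred, truth):
    """Compute IoU and F1 for two flattened binary masks (1 = weed or crop pixel)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)  # false negatives
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


predicted = [1, 1, 0, 0]   # hypothetical predicted mask
reference = [1, 0, 1, 0]   # hypothetical ground-truth mask

iou, f1 = iou_and_f1(predicted, reference)
```

Because both metrics ignore true negatives, they stay informative when the target class covers only a small part of the image. The sketch does not guard against the case where the class is absent from both masks, which would divide by zero.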
Magnetotelluric Time Series Denoising Using Encoder-Decoder Consisted of LSTM Cells
Electromagnetic signals in geophysics are frequently disturbed by various kinds of interference during field data acquisition. Denoising data from passive electromagnetic methods such as magnetotelluric (MT) or audio magnetotelluric (AMT) is important for improving data quality and, ultimately, imaging of the geoelectrical structure. Conventionally, most denoising methods are employed in the frequency domain and few of them are applied in the time domain. However, a great deal of irregular noise in the electromagnetic time series proves difficult to remove. We propose a denoising method, using an encoder-decoder consisting of Long Short-Term Memory cells (ED-LSTM), to reduce the effect of the step noise and the random-impulsive noise. Supervised learning and transductive learning are used for the denoising of the step noise and the random-impulsive noise, respectively. Our results indicate that the step and random-impulsive noise could be successfully removed from the raw time series. The results indicate that ED-LSTM could potentially be widely used in electromagnetic time series denoising to improve data quality.
An Improved Encoder-Decoder Network Based on Strip Pool Method Applied to Segmentation of Farmland Vacancy Field
In the research of green vegetation coverage in the field of remote sensing image segmentation, crop planting area is often obtained by semantic segmentation of images taken from high altitude. This method can be used to obtain the rate of cultivated land in a region (such as a country), but it does not reflect the real situation of a particular farmland. Therefore, this paper takes low-altitude images of farmland to build a dataset. After comparing several mainstream semantic segmentation algorithms, a new method that is more suitable for farmland vacancy segmentation is proposed. Additionally, the Strip Pooling module (SPM) and the Mixed Pooling module (MPM), with strip pooling as their core, are designed and fused into the semantic segmentation network structure to better extract the vacancy features. Considering the high cost of manual data annotation, this paper uses an improved ResNet network as the backbone of signal transmission, and meanwhile uses data augmentation to improve the performance and robustness of the model. As a result, the accuracy of the proposed method in the test set is 95.6%, mIoU is 77.6%, and the error rate is 7%. Compared to the existing model, the mIoU value is improved by nearly 4%, reaching the level of practical application.
Transformers in Time-Series Analysis: A Tutorial
Transformer architectures have widespread applications, particularly in Natural Language Processing and Computer Vision. Recently, Transformers have been employed in various aspects of time-series analysis. This tutorial provides an overview of the Transformer architecture, its applications, and a collection of examples from recent research in time-series analysis. We delve into an explanation of the core components of the Transformer, including the self-attention mechanism, positional encoding, multi-head, and encoder/decoder. Several enhancements to the initial Transformer architecture are highlighted to tackle time-series tasks. The tutorial also provides best practices and techniques to overcome the challenge of effectively training Transformers for time-series analysis.
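Two of the components listed in this abstract, positional encoding and self-attention, can be sketched in plain Python. This is a deliberately simplified illustration and not the tutorial's code: it uses the sinusoidal encoding from the original Transformer, a single head, and no learned projections (queries, keys, and values are all the raw input), and every function name is hypothetical.

```python
import math


def positional_encoding(seq_len, d_model):
    """Sinusoidal encoding: sine on even dimensions, cosine on odd ones."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (simplification)."""
    d = len(x[0])
    output = []
    for query in x:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x]
        weights = softmax(scores)  # how much each time step attends to the others
        output.append([sum(w * value[c] for w, value in zip(weights, x)) for c in range(d)])
    return output


encoding = positional_encoding(seq_len=3, d_model=4)
attended = self_attention([[1.0, 0.0], [1.0, 0.0]])  # two identical time steps
```

In a full Transformer the encoding is added to the input embeddings before attention, and separate learned matrices produce the queries, keys, and values for each head.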