Search Results

627 results for "loss function optimization"
Deep-Learning-Based Multispectral Image Reconstruction from Single Natural Color RGB Image—Enhancing UAV-Based Phenotyping
Multispectral images (MSIs) are valuable for precision agriculture due to the extra spectral information acquired compared to natural color RGB (ncRGB) images. In this paper, we thus aim to generate high-spatial-resolution MSIs through a robust, deep-learning-based reconstruction method using ncRGB images. Using data from an agronomic research trial for maize and a breeding research trial for rice, we first reproduced ncRGB images from MSIs through a rendering model, Model-True to natural color image (Model-TN), which was built using a benchmark hyperspectral image dataset. Subsequently, an MSI reconstruction model, Model-Natural color to Multispectral image (Model-NM), was trained on prepared ncRGB (ncRGB-Con) image and MSI pairs, ensuring the model can use widely available ncRGB images as input. The integrated loss function combining mean relative absolute error (MRAEloss) and spectral information divergence (SIDloss) was most effective during the building of both models, while models using the MRAEloss function alone were more robust to variability between growing seasons and species. The reliability of the reconstructed MSIs was demonstrated by high coefficients of determination against ground-truth values, using the Normalized Difference Vegetation Index (NDVI) as an example. The advantage of using “reconstructed” NDVI over the Triangular Greenness Index (TGI), calculated directly from RGB images, was illustrated by its higher capability to differentiate three levels of irrigation treatment on maize plants. This study emphasizes that the performance of MSI reconstruction models can benefit from an optimized loss function and the intermediate step of ncRGB image preparation. The ability of the developed models to reconstruct high-quality MSIs from low-cost ncRGB images will, in particular, promote their application to plant phenotyping in precision agriculture.
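
The abstract names the two loss components but not how they are combined. A minimal PyTorch sketch of an MRAE + SID composite loss, assuming a simple weighted sum (the weight `w_sid` and function names are assumptions, not from the paper):

```python
import torch

def mrae_loss(pred, target, eps=1e-6):
    # Mean relative absolute error: per-pixel absolute error
    # normalized by the target reflectance.
    return torch.mean(torch.abs(pred - target) / (target.abs() + eps))

def sid_loss(pred, target, eps=1e-6):
    # Spectral information divergence: treat each pixel's band vector
    # (dim 1 of a B x C x H x W tensor) as a probability distribution
    # and take a symmetric KL divergence between prediction and target.
    p = pred.clamp_min(eps)
    q = target.clamp_min(eps)
    p = p / p.sum(dim=1, keepdim=True)
    q = q / q.sum(dim=1, keepdim=True)
    kl_pq = (p * (p / q).log()).sum(dim=1)
    kl_qp = (q * (q / p).log()).sum(dim=1)
    return (kl_pq + kl_qp).mean()

def integrated_loss(pred, target, w_sid=0.1):
    # Assumed weighted sum; the paper's mixing scheme is not stated here.
    return mrae_loss(pred, target) + w_sid * sid_loss(pred, target)
```
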
Deep steganographic approach for reliable data hiding using convolutional neural networks and adaptive loss optimization
This paper proposes a deep steganography framework using a three-layered Convolutional Neural Network (CNN) architecture—preparation, hiding, and revealing networks—for robust data hiding. The preparation network employs 50, 10, and 5 filters (3×3, 4×4, and 5×5) for edge feature extraction, followed by adaptive embedding in the hiding network. Among the four loss functions evaluated, Loss Function 3 (LF 3), which integrates mean and variance terms, achieves a payload of 3–5 bits per pixel and improved PSNR across the Tiny-ImageNet, Linnaeus 5, Sky-text, and RGB-BMP datasets. LF 3 ensures robustness against Gaussian noise, cropping, and rotation, with low detection rates under histogram, statistical, and CNN-based steganalysis. Compared to state-of-the-art Generative Adversarial Network (GAN)-based methods, LF 3 offers a better payload–robustness balance and greater computational efficiency, advancing secure data hiding for applications such as medical imaging.
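
The abstract only says LF 3 "integrates mean and variance terms". One plausible reading is a residual loss that penalizes both the average embedding error and its spread; a hypothetical sketch (the residual definition and the weights `alpha`, `beta` are assumptions):

```python
import torch

def lf3_style_loss(stego, cover, revealed, secret, alpha=1.0, beta=0.5):
    # Hypothetical reading: penalize the mean and the variance of the
    # hiding and revealing residuals, so embedding error stays small
    # on average and evenly spread across the image.
    loss = stego.new_zeros(())
    for out, ref in ((stego, cover), (revealed, secret)):
        res = (out - ref).flatten(1)  # per-image residual vector
        loss = loss + alpha * res.abs().mean() + beta * res.var(dim=1).mean()
    return loss
```
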
A weighted difference loss approach for enhancing multi-label classification
Conventional multi-label classification methods often fail to capture the dynamic relationships and relative intensity shifts between labels, treating them as independent entities. This limitation is particularly detrimental in tasks like sentiment analysis where emotions co-occur in nuanced proportions. To address this, we introduce a novel Weighted Difference Loss (WDL) framework. WDL operates on three core principles: (1) transforming labels into a normalized distribution to model their relative proportions; (2) computing learnable, weighted differences across this distribution to explicitly capture inter-label dynamics and trends; and (3) employing a label-shuffling augmentation to ensure the model learns intrinsic, order-invariant relationships. Our framework not only achieves state-of-the-art performance on four public benchmarks, but more importantly, it substantially improves the recognition of minority classes. This demonstrates the framework’s ability to learn from sparse data by effectively leveraging the underlying label structure, offering a robust, loss-driven alternative to complex architectural modifications.
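
The abstract states the three principles but not the exact formulation. A sketch of principles (1) and (2), normalized label distributions plus learnable pairwise differences, might look as follows; the label-shuffling augmentation of principle (3) would be applied on the data side and is not shown. Class names, the base L1 term, and the weighting scheme are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedDifferenceLoss(nn.Module):
    """Hypothetical sketch of the WDL idea: compare normalized label
    distributions and their pairwise differences with learnable weights."""

    def __init__(self, num_labels):
        super().__init__()
        # One learnable weight per ordered label pair (principle 2).
        self.pair_weights = nn.Parameter(torch.ones(num_labels, num_labels))

    def forward(self, logits, targets):
        # Principle 1: treat predictions and targets as distributions
        # over labels, modeling their relative proportions.
        p = F.softmax(logits, dim=-1)
        t = targets / targets.sum(dim=-1, keepdim=True).clamp_min(1e-8)
        base = F.l1_loss(p, t)
        # Principle 2: weighted differences between all label pairs,
        # capturing inter-label dynamics beyond per-label errors.
        dp = p.unsqueeze(-1) - p.unsqueeze(-2)   # (B, L, L)
        dt = t.unsqueeze(-1) - t.unsqueeze(-2)
        diff = (self.pair_weights * (dp - dt).abs()).mean()
        return base + diff
```

Note that the pair weights train jointly with the model, so the loss module's parameters would need to be passed to the optimizer as well.
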
DP-MaizeTrack: a software for tracking the number of maize plants and leaves information from UAV image
In modern agricultural production, accurate monitoring of maize growth and leaf counting is crucial for precision management and crop breeding optimization. Current UAV-based methods for detecting maize seedlings and leaves often struggle to achieve high accuracy due to low spatial resolution, complex field environments, and variations in plant scale and orientation. To address these challenges, this study develops an integrated detection and visualization software package, DP-MaizeTrack, which incorporates the DP-YOLOv8 model based on YOLOv8. The DP-YOLOv8 model integrates several key improvements: the Multi-Scale Feature Enhancement (MSFE) module improves detection accuracy across different scales, and the Optimized Spatial Pyramid Pooling–Fast (OSPPF) module enhances feature extraction in diverse field conditions. Experimental results in single-plant detection show that DP-YOLOv8 outperforms the baseline YOLOv8, with improvements of 3.9% in Precision (95.1%), 4.1% in Recall (91.5%), and 4.0% in mAP50 (94.9%). The software also demonstrates good accuracy in the visualization results for single-plant and leaf detection tasks. Furthermore, DP-MaizeTrack not only automates the detection process but also integrates agricultural analysis tools, including region segmentation and data statistics, to support precision agricultural management and leaf-age analysis. The source code and models are available at https://github.com/clhclhc/project .
Joint loan risk prediction based on deep learning‐optimized stacking model
In recent years, China's automobile industry has undergone rapid development, creating new opportunities for the auto loan industry. Auto financing companies are actively seeking to expand their cooperation with banks, so improving the approval rate and scale of the joint loan business is of significant practical importance. In this paper, we propose a Stacking-based financial institution risk approval model and select the optimal stacking model by comparing its performance with other models. Additionally, we construct a bank approval model using deep learning techniques on a biased dataset, with feature extraction performed using convolutional neural networks (CNN) and feature-based counterfactual augmentation used for balanced sampling. Finally, we optimize the prediction model for auto finance companies by selecting the optimal loss function coefficients based on the features and results of the bank approval model. Experimental results show that the proposed approach increases the joint loan approval rate by approximately 6% on a real-world dataset. In summary, a bank approval model was built with a deep learning approach on a dataset from an automotive finance company, and the performance of the stacked model for joint loans was enhanced through label prediction, feature extraction, and loss function optimization.
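
The abstract does not specify the loss form or the selection procedure. Assuming class-weighted loss coefficients chosen by a validation grid search, a generic sketch (the grid, the `train_and_score` callable, and its metric are hypothetical):

```python
from itertools import product

def select_loss_coefficients(train_and_score, grid=(0.5, 1.0, 2.0, 4.0)):
    """Grid-search a pair of loss coefficients (w_pos, w_neg).

    train_and_score is a user-supplied callable that retrains the
    model with the given class weights and returns a validation
    metric (higher is better), e.g. F1 subject to the bank approval
    model's constraints.
    """
    best_pair, best_score = None, float("-inf")
    for w_pos, w_neg in product(grid, grid):
        score = train_and_score(w_pos=w_pos, w_neg=w_neg)
        if score > best_score:
            best_pair, best_score = (w_pos, w_neg), score
    return best_pair, best_score
```
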
Detection Method of Cow Estrus Behavior in Natural Scenes Based on Improved YOLOv5
Natural breeding scenes are characterized by large numbers of cows, complex lighting, and a complex background environment, which presents great difficulties for the detection of dairy cow estrus behavior. Existing research on cow estrus behavior detection performs well in ideal environments with small numbers of cows but suffers from low inference speed and accuracy in natural scenes. To improve both, this paper proposes a cow estrus behavior detection method based on an improved YOLOv5 with stronger detection ability for complex environments and multi-scale objects. First, the atrous spatial pyramid pooling (ASPP) module is employed to optimize the YOLOv5l network at multiple scales, improving the model's receptive field and its ability to perceive global contextual multi-scale information. Second, a cow estrus behavior detection model is constructed by combining a channel-attention mechanism with a deep-asymmetric-bottleneck module. Last, K-means clustering is performed to obtain new anchors, and complete intersection over union (CIoU) is used to introduce the relative ratio between the predicted and ground-truth mounting boxes into the box regression function, improving the scale invariance of the model. Multiple cameras were installed in a natural breeding scene containing 200 cows to capture videos of cows mounting. A total of 2668 images were obtained from 115 videos of cow mounting events for the training set, and 675 images were obtained from 29 videos for the test set. The training set was augmented with the mosaic method to increase the diversity of the dataset. The experimental results show that the average accuracy of the improved model was 94.3%, the precision was 97.0%, and the recall was 89.5%, all higher than those of mainstream models such as YOLOv5, YOLOv3, and Faster R-CNN. Ablation experiments show that the ASPP module, new anchors, C3SAB, and C3DAB designed in this study improve the accuracy of the model by 5.9%. Furthermore, the model achieved its highest accuracy when the ASPP dilated convolution was set to (1, 5, 9, 13) and the loss function was set to CIoU. The class activation map function was utilized to visualize the model's feature extraction results and to explain the model's regions of interest for cow images in natural scenes, demonstrating the effectiveness of the model. The proposed model therefore improves the accuracy of detecting cow estrus events. Additionally, its inference speed was 71 frames per second (fps), meeting the requirements for fast and accurate detection of cow estrus events in natural scenes under all-weather conditions.
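
CIoU is a standard, well-documented box regression loss (Zheng et al., 2020): 1 − IoU plus a center-distance penalty and an aspect-ratio consistency term. A self-contained PyTorch version for corner-format boxes:

```python
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    # pred, target: (..., 4) boxes as (x1, y1, x2, y2).
    px1, py1, px2, py2 = pred.unbind(-1)
    tx1, ty1, tx2, ty2 = target.unbind(-1)

    # Intersection over union.
    iw = (torch.min(px2, tx2) - torch.max(px1, tx1)).clamp_min(0)
    ih = (torch.min(py2, ty2) - torch.max(py1, ty1)).clamp_min(0)
    inter = iw * ih
    union = ((px2 - px1) * (py2 - py1)
             + (tx2 - tx1) * (ty2 - ty1) - inter + eps)
    iou = inter / union

    # Squared center distance over the squared diagonal of the
    # smallest box enclosing both.
    c2 = ((torch.max(px2, tx2) - torch.min(px1, tx1)) ** 2
          + (torch.max(py2, ty2) - torch.min(py1, ty1)) ** 2 + eps)
    rho2 = (((px1 + px2) - (tx1 + tx2)) ** 2
            + ((py1 + py2) - (ty1 + ty2)) ** 2) / 4

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        torch.atan((tx2 - tx1) / (ty2 - ty1 + eps))
        - torch.atan((px2 - px1) / (py2 - py1 + eps))) ** 2
    with torch.no_grad():
        alpha = v / (1 - iou + v + eps)

    return (1 - iou + rho2 / c2 + alpha * v).mean()
```
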
MSDT-Net: A Multi-Scale Smoothing Attention and Differential Transformer Encoding Network for Building Change Detection in Coastal Areas
Island building change detection is a critical technology for environmental monitoring, disaster early warning, and urban planning, playing a key role in dynamic resource management and the sustainable development of islands. However, the imbalanced distribution of class pixels (changed vs. unchanged) undermines the detection capability of existing methods and causes severe boundary misdetection. To address this issue, we propose the MSDT-Net model, which introduces improvements in architecture, modules, and loss functions: a dual-branch twin ConvNeXt architecture is adopted as the feature extraction backbone, and the designed Multi-Scale Smoothing Attention (MSA) module effectively enhances the continuity of change-region boundaries through a multi-scale feature fusion mechanism. The proposed Differential Transformer Encoding Module (DTEM) enables deep interaction and fusion between original semantic and change features, significantly improving the discriminative power of the features. Additionally, a Focal–Dice–IoU Boundary Joint Loss Function (FDUB-Loss) is constructed to suppress massive background interference using Focal Loss, enhance pixel-level segmentation accuracy with Dice Loss, and optimize object localization with IoU Loss. Experiments show that on a self-constructed island dataset, the model achieves an F1-score of 0.9248 and an IoU of 0.8614. Compared to mainstream methods, MSDT-Net demonstrates significant improvements in key metrics. Especially in scenarios with few changed pixels, the recall is 0.9178 and the precision is 0.9328, showing excellent detection performance and boundary integrity. MSDT-Net thus provides a highly reliable technical pathway for island development monitoring.
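
The abstract names the three components of FDUB-Loss but not their weighting. A hedged sketch of a Focal + Dice + soft-IoU combination for binary change maps (the weights `w` and the focal parameters `alpha`, `gamma` are assumptions):

```python
import torch
import torch.nn.functional as F

def fdub_style_loss(logits, target, alpha=0.25, gamma=2.0,
                    w=(1.0, 1.0, 1.0), eps=1e-6):
    # logits: raw change-map scores; target: binary (0/1) float map.
    p = torch.sigmoid(logits)

    # Focal term: down-weights the abundant unchanged background.
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    pt = torch.where(target > 0.5, p, 1 - p)
    at = torch.where(target > 0.5, torch.full_like(p, alpha),
                     torch.full_like(p, 1 - alpha))
    focal = (at * (1 - pt) ** gamma * bce).mean()

    # Dice term: overlap-based, sharpens pixel-level segmentation.
    inter = (p * target).sum()
    dice = 1 - (2 * inter + eps) / (p.sum() + target.sum() + eps)

    # Soft IoU term: tightens object localization.
    union = p.sum() + target.sum() - inter
    iou = 1 - (inter + eps) / (union + eps)

    return w[0] * focal + w[1] * dice + w[2] * iou
```
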
Multimodal Medical Image Fusion Using a Progressive Parallel Strategy Based on Deep Learning
Multimodal medical image fusion plays a critical role in enhancing diagnostic accuracy by integrating complementary information from different imaging modalities. However, existing methods often suffer from issues such as unbalanced feature fusion, structural blurring, loss of fine details, and limited global semantic modeling, particularly in low signal-to-noise modalities like PET. To address these challenges, we propose PPMF-Net, a novel progressive and parallel deep learning framework for PET–MRI image fusion. The network employs a hierarchical multi-path architecture to capture local details, global semantics, and high-frequency information in a coordinated manner. Specifically, it integrates three key modules: (1) a Dynamic Edge-Enhanced Module (DEEM) utilizing inverted residual blocks and channel attention to sharpen edge and texture features, (2) a Nonlinear Interactive Feature Extraction module (NIFE) that combines convolutional operations with element-wise multiplication to enable cross-modal feature coupling, and (3) a Transformer-Enhanced Global Modeling module (TEGM) with hybrid local–global attention to improve long-range dependency and structural consistency. A multi-objective unsupervised loss function is designed to jointly optimize structural fidelity, functional complementarity, and detail clarity. Experimental results on the Harvard MIF dataset demonstrate that PPMF-Net outperforms state-of-the-art methods across multiple metrics—achieving SF: 38.27, SD: 96.55, SCD: 1.62, and MS-SSIM: 1.14—and shows strong generalization and robustness in tasks such as SPECT–MRI and CT–MRI fusion, indicating its promising potential for clinical applications.
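
The exact multi-objective loss is not given in the abstract. A generic unsupervised fusion objective in the same spirit (structural fidelity, functional complementarity, and detail retention) could be sketched as follows; the residual choices and weights `w` are assumptions:

```python
import torch

def dx(img):
    # Horizontal finite differences, a simple detail proxy.
    return img[..., :, 1:] - img[..., :, :-1]

def dy(img):
    # Vertical finite differences.
    return img[..., 1:, :] - img[..., :-1, :]

def fusion_loss(fused, mri, pet, w=(1.0, 1.0, 0.5)):
    # Assumed terms: stay close to the anatomical MRI (structure),
    # keep PET intensity (function), and retain the stronger of the
    # two source gradients at each location (detail clarity).
    l_struct = (fused - mri).abs().mean()
    l_func = (fused - pet).abs().mean()
    l_detail = (
        (dx(fused).abs() - torch.max(dx(mri).abs(), dx(pet).abs())).abs().mean()
        + (dy(fused).abs() - torch.max(dy(mri).abs(), dy(pet).abs())).abs().mean())
    return w[0] * l_struct + w[1] * l_func + w[2] * l_detail
```
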
Adversarial graph node classification based on unsupervised learning and optimized loss functions
This paper concerns unsupervised learning in machine learning, aiming to address the problem of simultaneously resisting feature attacks and improving classification performance in unsupervised settings. To this end, it proposes adding an optimized loss function after the graph encoding and representation stage. When the samples are relatively balanced, the cross-entropy loss function is used for classification; when difficult-to-classify samples appear, an optimized Focal Loss function is used to adjust the weights of these samples, addressing the imbalance between positive and negative samples during training. The developed method achieved superior accuracy, with values of 0.721 on the Cora dataset, 0.598 on the Citeseer dataset, and 0.862 on the Polblogs dataset. Moreover, the testing accuracy achieved by the optimized model is 0.745, 0.627, and 0.892 on the three benchmark datasets, respectively. Experimental results show that the proposed method effectively improves the robustness of adversarial training models in downstream tasks and reduces potential interference with the original data. All test results are validated with k-fold cross-validation to assess the generalizability of these results.
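
The paper's criterion for switching between cross-entropy and the optimized focal term is not specified in the abstract. A hedged sketch in which a batch-level class-imbalance ratio (an assumed threshold) triggers standard focal re-weighting:

```python
import torch
import torch.nn.functional as F

def adaptive_classification_loss(logits, labels, gamma=2.0,
                                 imbalance_ratio=3.0):
    # Heuristic switch (assumed): plain cross-entropy on roughly
    # balanced batches, focal re-weighting otherwise.
    counts = torch.bincount(labels, minlength=logits.size(-1)).float()
    nonzero = counts[counts > 0]
    if nonzero.max() / nonzero.min() < imbalance_ratio:
        return F.cross_entropy(logits, labels)
    # Standard focal loss: down-weight easy samples by (1 - p_t)^gamma.
    logp = F.log_softmax(logits, dim=-1)
    logpt = logp.gather(1, labels.unsqueeze(1)).squeeze(1)
    pt = logpt.exp()
    return (-(1 - pt) ** gamma * logpt).mean()
```
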
Real-time fault detection in multirotor UAVs using lightweight deep learning and high-fidelity simulation data with single and double fault magnitudes
Robust fault detection and diagnosis (FDD) in multirotor unmanned aerial vehicles (UAVs) remains challenging due to limited actuator redundancy, nonlinear dynamics, and environmental disturbances. This work introduces two lightweight deep learning architectures: the Convolutional-LSTM Fault Detection Network (CLFDNet), which combines multi-scale one-dimensional convolutional neural networks (1D-CNN), long short-term memory (LSTM) units, and an adaptive attention mechanism for spatio-temporal fault feature extraction; and the Autoencoder LSTM Multi-loss Fusion Network (AELMFNet), a soft attention–enhanced LSTM autoencoder optimized via multi-loss fusion for fine-grained fault severity estimation. Both models are trained and evaluated on UAV-Fault Magnitude V1, a high-fidelity simulation dataset containing 114,230 labeled samples with motor degradation levels ranging from 5% to 40% in the take-off, hover, navigation, and descent phases, representing the most probable and recoverable fault scenarios in quadrotor UAVs. Including coupled faults enables models to learn correlated degradation patterns and actuator interactions while maintaining controllability under standard flight laws. CLFDNet achieves 96.81% precision in fault severity classification and 100% accuracy in motor fault localization with only 19.6K parameters, demonstrating suitability for real-time onboard applications. AELMFNet achieves the lowest reconstruction loss of 0.001 with Huber loss and an inference latency of 6 ms/step, underscoring its efficiency for embedded deployment. Comparative experiments against 15 baselines, including five classical machine learning models, five state-of-the-art fault detection methods, and five attention-based deep learning variants, validate the effectiveness of the proposed architectures. These findings confirm that lightweight deep models enable accurate and efficient diagnosis of UAV faults with minimal sensing.
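
Huber loss, which AELMFNet reportedly minimizes for reconstruction, is standard: quadratic for small residuals and linear for large ones. For reference (the threshold `delta` is an assumption):

```python
import torch
import torch.nn.functional as F

def huber_reconstruction_loss(recon, signal, delta=1.0):
    # Built-in Huber: quadratic for |residual| <= delta (precise
    # severity regression), linear beyond it (robust to sensor spikes).
    return F.huber_loss(recon, signal, delta=delta)

def huber_manual(recon, signal, delta=1.0):
    # The same loss written out explicitly.
    r = (recon - signal).abs()
    return torch.where(r <= delta,
                       0.5 * r ** 2,
                       delta * (r - 0.5 * delta)).mean()
```
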