Catalogue Search | MBRL

HGCS-Det: A Deep Learning-Based Solution for Localizing and Recognizing Household Garbage in Complex Scenarios

by Wang, Qun; Zhang, Guangqun; He, Tao in Accuracy, Algorithms, attention-feature fusion

2025

With the rise of deep learning technology, intelligent garbage detection offers a new approach to garbage classification management. However, interference from complex environments, together with the irregular features of garbage, means that garbage detection in complex scenarios still faces significant challenges. Moreover, some existing studies fall short in either precision or real-time performance, particularly when applied to complex garbage detection scenarios. Therefore, this paper proposes a model based on YOLOv8, namely HGCS-Det, for detecting garbage in complex scenarios. The HGCS-Det model is designed as follows. First, the normalization attention module is introduced to calibrate the model’s attention to targets and to suppress interference from environmental noise. Additionally, to weigh the attention-feature contributions, an Attention Feature Fusion module is employed to complement the attention weights of each channel. Subsequently, an Instance Boundary Reinforcement module is established to capture the fine-grained features of garbage by combining strong gradient information with semantic information. Finally, the Slide Loss function is applied to dynamically weight hard samples arising from complex detection environments, improving their recognition accuracy. With only a slight increase in parameters (3.02M), HGCS-Det achieves a 93.6% mean average precision (mAP) and 86 FPS on the public HGI30 dataset, an mAP 3.33% higher than that of YOLOv12, and outperforms the state-of-the-art (SOTA) methods in both efficiency and applicability. Notably, HGCS-Det maintains a lightweight architecture while enhancing detection accuracy, enabling real-time performance even in resource-constrained environments. These characteristics significantly improve its practical applicability, making the model well suited for deployment in embedded devices and real-world garbage classification systems. This method can serve as a valuable technical reference for the engineering application of garbage classification.

Journal Article

A Multi-Scale Convolution and Multi-Layer Fusion Network for Remote Sensing Forest Tree Species Recognition

by Hu, Junguo; Hou, Jinjing; Hu, Haoji in Accuracy, Algorithms, Artificial intelligence

2023

Forest tree species identification in the field of remote sensing has become an important research topic. Currently, few research methods combine global and local features, making it challenging to accurately handle the similarity between different categories. Moreover, using a single deep layer for feature extraction overlooks the unique feature information at intermediate levels. This paper proposes a remote sensing image forest tree species classification method based on the Multi-Scale Convolution and Multi-Level Fusion Network (MCMFN) architecture. In the MCMFN network, the Shallow Multi-Scale Convolution Attention Combination (SMCAC) module replaces the original 7 × 7 convolution at the first layer of ResNet-50. This module uses multi-scale convolution to capture different receptive fields, and combines it with the attention mechanism to effectively enhance the ability of shallow features and obtain richer feature information. Additionally, to make efficient use of intermediate and deep-level feature information, the Multi-layer Selection Feature Fusion (MSFF) module is employed to improve classification accuracy. Experimental results on the Aerial forest dataset demonstrate a classification accuracy of 91.03%. The comprehensive experiments indicate the feasibility and effectiveness of the proposed MCMFN network.

Journal Article

Optimizing Backbone Networks Through Hybrid–Modal Fusion: A New Strategy for Waste Classification

by Wang, Qun; Zhang, Guangqun; He, Tao in Accuracy, Artificial intelligence, Automation

2025

With rapid urbanization, effective waste classification is a critical challenge. Traditional manual methods are time-consuming, labor-intensive, costly, and error-prone, resulting in reduced accuracy. Deep learning has revolutionized this field. Convolutional neural networks such as VGG and ResNet have dramatically improved automated sorting efficiency, and Transformer architectures like the Swin Transformer have further enhanced performance and adaptability in complex sorting scenarios. However, these approaches still struggle in complex environments and with diverse waste types, often suffering from limited recognition accuracy, poor generalization, or prohibitive computational demands. To overcome these challenges, we propose an efficient hybrid-modal fusion method, the Hybrid-modal Fusion Waste Classification Network (HFWC-Net), for precise waste image classification. HFWC-Net leverages a Transformer-based hierarchical architecture that integrates CNNs and Transformers, enhancing feature capture and fusion across varied image types for superior scalability and flexibility. By incorporating advanced techniques such as the Agent Attention mechanism and the LionBatch optimization strategy, HFWC-Net not only improves classification accuracy but also significantly reduces classification time. Comparative experimental results on the public datasets Garbage Classification, TrashNet, and our self-built MixTrash dataset demonstrate that HFWC-Net achieves Top-1 accuracy rates of 98.89%, 96.88%, and 94.35%, respectively. These findings indicate that HFWC-Net attains the highest accuracy among current methods, offering significant advantages in accelerating classification efficiency and supporting automated waste management applications.

Journal Article

A Method for Medical Microscopic Images’ Sharpness Evaluation Based on NSST and Variance by Combining Time and Frequency Domains

by Wu, Xuecheng; Zhang, Guangqun; He, Tao in Accuracy, Algorithms, autofocus

2022

An algorithm for evaluating the sharpness of microscopic images, based on the non-subsampled shearlet transform (NSST) and variance, is proposed in this study to improve the noise immunity and accuracy of microscope image autofocus. First, images are decomposed with the NSST algorithm; then, the variance of each decomposed sub-band image is computed to obtain the energy of the sub-band coefficients; and finally, the evaluation value is obtained from the ratio of the energy of the high- and low-frequency sub-band coefficients. The experimental results show that the proposed algorithm delivers better noise immunity than the other methods reviewed in this study while maintaining high sensitivity.
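The decompose, variance, ratio pipeline described in this abstract can be illustrated with a minimal sketch. It does not implement NSST: as a loudly labeled stand-in, a 3 × 3 mean filter supplies a single low-frequency band and the residual supplies a single high-frequency band. The function names `box_blur` and `sharpness_score`, the high-over-low direction of the ratio, and the zero returned for a flat image are all assumptions made for illustration, not details taken from the paper.

```python
from statistics import pvariance


def box_blur(img):
    """3x3 mean filter with edge replication (a stand-in low-pass band, NOT NSST)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to image border
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            row.append(total / 9.0)
        out.append(row)
    return out


def sharpness_score(img):
    """Ratio of high-band variance to low-band variance (assumed direction)."""
    low = box_blur(img)
    # High-frequency band: what the low-pass filter removed.
    high = [[p - q for p, q in zip(img_row, low_row)]
            for img_row, low_row in zip(img, low)]
    low_energy = pvariance([v for row in low for v in row])
    high_energy = pvariance([v for row in high for v in row])
    if low_energy == 0:
        return 0.0  # flat image: ratio undefined, 0.0 is an illustrative choice
    return high_energy / low_energy


# A vertical step edge and a defocused copy (blurred twice).
sharp = [[0.0] * 4 + [255.0] * 4 for _ in range(8)]
blurred = box_blur(box_blur(sharp))
```

In an autofocus loop, a score of this kind would be computed at each focal position and the position with the highest score kept; here `sharpness_score(sharp)` exceeds `sharpness_score(blurred)`.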

Journal Article

Topic evolution based on the probabilistic topic model: a review

by Houkui ZHOU; Huimin YU; Roland HU in Computer Science, evaluation method, Evolution

2017

Accurately representing the quantity and characteristics of users' interest in certain topics is an important problem facing topic evolution researchers, particularly as it applies to modern online environments. Search engines can provide information retrieval for a specified topic from archived data, but fail to reflect changes in interest toward the topic over time in a structured way. This paper reviews notable research on topic evolution based on the probabilistic topic model from multiple aspects over the past decade. First, we introduce notations, terminology, and the basic topic model explored in the survey, then we summarize three categories of topic evolution based on the probabilistic topic model: the discrete time topic evolution model, the continuous time topic evolution model, and the online topic evolution model. Next, we describe applications of the topic evolution model, attempt to summarize methods for evaluating model generalization performance and topic evolution, and provide comparative experimental results for different models. To conclude the review, we pose some open questions and discuss possible future research directions.

Journal Article

RCSFN: A remote sensing image scene classification and recognition network based on rectangle convolutional self attention fusion

by Hou, Jingjin; Hu, Haoji; Zhou, Houkui in Accuracy, Classification, Computer Imaging

2024

Remote sensing scene classification is a critical task in the processing and analysis of remote sensing images. Traditional methods typically use standard convolutional kernels to extract feature information. Although these methods have seen improvements, they still struggle to fully capture unique local details, which limits classification accuracy. Each category within remote sensing scenes has its unique local details, such as the rectangular features of buildings in schools or industrial areas, as well as bridges and roads in parks or squares. The most important features are often these rectangular structures and their spatial positions, which standard convolutional kernels find challenging to capture effectively. To address this issue, we propose a remote sensing scene classification method based on a Rectangle Convolution Self-Attention Fusion Network (RCSFN) architecture. In the RCSFN network, the Rectangle Convolution Maximum Fusion (RCMF) module operates in parallel with the first 4 × 4 convolutional layer of VanillaNet-5. The RCMF module uses two different rectangular convolutional kernels to extract different receptive fields, enhancing the extraction of shallow local features through addition and fusion. This process, combined with the concatenation of the original input features, results in richer local detail information. Additionally, we introduce an Area Selection (AS) module that focuses on selecting feature information within local regions. The Sequential Polarisation Self-Attention (SPS) mechanism, integrated with the Mini Region Convolution (MRC) module through feature multiplication, enhances important features and improves spatial positional relationships, thereby increasing the accuracy of recognising categories with rectangular or elongated features. Experiments on the AID and NWPU-RESISC45 data sets yielded overall classification accuracies of 96.56% and 92.46%, respectively. This shows that the RCSFN network model proposed in this paper is feasible and effective for classification problems with unique local detail features.

Journal Article

YOLO-MTG: a lightweight YOLO model for multi-target garbage detection

by Zhang, Guangqun; He, Tao; Xia, Zhongyi in Accuracy, Algorithms, Classification

2024

With the wide adoption of deep learning technology in AI, intelligent garbage detection has become a hot research topic. However, existing datasets currently used for garbage detection rarely involve the multi-category, multi-target garbage that is densely accumulated in actual garbage detection scenarios. In addition, many existing garbage detection models suffer from problems such as low detection efficiency and difficulty of integration with resource-constrained devices. To address these issues, this study proposes a lightweight YOLO model for multi-target garbage detection (YOLO-MTG). This model is designed as follows. Firstly, MobileViTv3, a lightweight hybrid network, serves as the feature extraction network to encode global representations, enhancing the model's ability to discriminate dense targets. Secondly, the MobileViT block, the feature extraction unit, is optimized with a combination of EfficientFormer and dynamic convolution, aiming to enhance the model's feature extraction capability, focus on essential feature information, and reduce redundant, useless information. Finally, feature reuse techniques are deployed to reconstruct the Neck, minimizing the loss of channel information during feature transmission and maintaining the strong feature fusion ability of the model. The experimental results on the self-built multi-target garbage (MTG) dataset show that YOLO-MTG achieves 95.4% mean average precision (mAP) with only 3.4 M parameters, and it is superior to other state-of-the-art (SOTA) methods. This work contributes new insights into the field of garbage detection, aiming to advance garbage classification for practical engineering applications.

Journal Article

An Improved EfficientNetV2 Model Based on Visual Attention Mechanism: Application to Identification of Cassava Disease

by Zhang, Guangqun; He, Tao; Ye, Yuanbo in Accuracy, Agricultural production, Algorithms

2022

Owing to their high recognition rates and strong robustness, convolutional neural networks have become the mainstream method in the field of crop disease recognition. To address the problems of insufficient labeled samples, complex backgrounds in sample images, and the difficulty of extracting useful feature information, a novel algorithm based on attention mechanisms and convolutional neural networks is proposed in this study for cassava leaf recognition. Specifically, a combined data augmentation strategy is used to prevent the image datasets from having a single distribution, and then PDRNet (plant disease recognition network), which combines a channel attention mechanism and a spatial attention mechanism, is proposed. The algorithm is designed as follows. Firstly, an attention module embedded in the network layer is deployed to establish long-range dependence on each feature layer, strengthen the key feature information, and suppress interfering feature information such as background noise. Secondly, a stochastic depth learning strategy is formulated to accelerate the training and inference of the network. Finally, a transfer learning method is adopted to load pretrained weights into the proposed model, and the recognition accuracy is further improved through detailed parameter adjustments and dynamic changes in the learning rate. A large number of comparative experiments demonstrate that the proposed algorithm delivers a recognition accuracy of 99.56% on the cassava disease image dataset, reaching the state-of-the-art level among CNN-based methods in terms of accuracy.

Journal Article

Deep Learning in Forest Tree Species Classification Using Sentinel-2 on Google Earth Engine: A Case Study of Qingyuan County

by Xu, Caiyao; Wang, Qun; He, Tao in Accuracy, Algorithms, Analysis

2023

Forest tree species information plays an important role in ecology and forest management, and deep learning has been used widely for remote sensing image classification in recent years. However, forest tree species classification using remote sensing images is still a difficult task. Since there is no benchmark dataset for forest tree species, a forest tree species dataset (FTSD) based on Sentinel-2 images was built in this paper to fill the gap. The FTSD contains nine forest tree species in Qingyuan County, with 8,815 images, each with a resolution of 64 × 64 pixels. The images were produced by combining forest management inventory data with Sentinel-2 images acquired with less than 20% cloud cover from 1 April to 31 October in the years 2017, 2018, 2019, 2020, and 2021. Then, the images were preprocessed and downloaded from Google Earth Engine (GEE). Four different band combinations were compared in the paper. Moreover, Principal Component Analysis (PCA) components and the Normalized Difference Vegetation Index (NDVI) were also calculated using GEE. Deep learning algorithms including DenseNet, EfficientNet, MobileNet, ResNet, and ShuffleNet were trained and validated on the FTSD. Three-channel images whose red, green, and blue channels held PC1, PC2, and NDVI obtained the highest validation accuracy among the four band combinations. ResNet obtained the highest validation accuracy of all the algorithms after 500 training epochs on the FTSD, reaching 84.91%. The models were also trained and validated on NWPU RESISC-45, a well-known and widely used remote sensing image classification dataset. ResNet achieved a high validation accuracy of 87.90% after 100 training epochs on NWPU RESISC-45. For forest tree species classification based on remote sensing images and deep learning, the paper shows that (1) PCA and NDVI can be combined to improve classification accuracy; (2) ResNet is more suitable than other deep learning algorithms, including DenseNet, EfficientNet, MobileNet, and ShuffleNet, for remote sensing classification; and (3) ResNet variants that are too shallow or too deep do not perform better on the FTSD; that is, 50 layers perform better than 34 or 101 layers.
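For readers unfamiliar with the NDVI mentioned in this abstract, the index is conventionally defined per pixel as (NIR − Red) / (NIR + Red); for Sentinel-2 the near-infrared and red bands are usually taken to be B8 and B4. The sketch below computes it on plain nested lists. The paper computed NDVI inside Google Earth Engine, so this standalone function, the name `ndvi`, the sample reflectance values, and the choice to return 0.0 where both bands are zero are illustrative assumptions only.

```python
def ndvi(nir, red):
    """Per-pixel NDVI for two equally sized 2-D reflectance grids."""
    result = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            denom = n + r
            # Assumed convention: report 0.0 where both bands are zero.
            row.append((n - r) / denom if denom != 0 else 0.0)
        result.append(row)
    return result


# Hypothetical 2x2 reflectance patches (values invented for illustration).
nir_band = [[0.5, 0.3], [0.0, 0.2]]
red_band = [[0.1, 0.3], [0.0, 0.4]]
index = ndvi(nir_band, red_band)
```

Values lie in [−1, 1] for non-negative reflectances; dense green vegetation reflects strongly in the near infrared and so tends toward the upper end of that range, which is why the index is a natural third channel alongside the first two principal components.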

Journal Article
