Catalogue Search | MBRL

A Survey of Object Detection for UAVs Based on Deep Learning

by Tang, Guangyi , Zhao, Yonghao , Ni, Jianjun in Aerial photography , Algorithms , Altitude

2024

With the rapid development of object detection technology for unmanned aerial vehicles (UAVs), it is convenient to collect data from UAV aerial photographs. They have a wide range of applications in several fields, such as monitoring, geological exploration, precision agriculture, and disaster early warning. In recent years, many methods based on artificial intelligence have been proposed for UAV object detection, and deep learning is a key area in this field. Significant progress has been achieved in the area of deep-learning-based UAV object detection. Thus, this paper presents a review of recent research on deep-learning-based UAV object detection. This survey provides an overview of the development of UAVs and summarizes the deep-learning-based methods in object detection for UAVs. In addition, the key issues in UAV object detection are analyzed, such as small object detection, object detection under complex backgrounds, object rotation, scale change, and category imbalance problems. Then, some representative solutions based on deep learning for these issues are summarized. Finally, future research directions in the field of UAV object detection are discussed.

Journal Article

A Small-Object Detection Model Based on Improved YOLOv8s for UAV Image Scenarios

by Wang, Tingting , Zhu, Shengjie , Tang, Guangyi in Accuracy , Aerial photography , altitude

2024

Small object detection for unmanned aerial vehicle (UAV) image scenarios is a challenging task in the computer vision field. Some problems should be further studied, such as the dense small objects and background noise in high-altitude aerial photography images. To address these issues, an enhanced YOLOv8s-based model for detecting small objects is presented. The proposed model incorporates a parallel multi-scale feature extraction module (PMSE), which enhances the feature extraction capability for small objects by generating adaptive weights with different receptive fields through parallel dilated convolution and deformable convolution, and integrating the generated weight information into shallow feature maps. Then, a scale compensation feature pyramid network (SCFPN) is designed to integrate the spatial feature information derived from the shallow neural network layers with the semantic data extracted from the higher layers of the network, thereby enhancing the network’s capacity for representing features. Furthermore, the largest-object detection layer is removed from the original detection layers, and an ultra-small-object detection layer is applied, with the objective of improving the network’s detection performance for small objects. Finally, the WIOU loss function is employed to balance high- and low-quality samples in the dataset. The results of the experiments conducted on the two public datasets illustrate that the proposed model can enhance the object detection accuracy in UAV image scenarios.
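The abstract above mentions parallel dilated convolutions with different receptive fields. As a rough illustration only, the span covered by a single dilated kernel follows the standard relation k + (k - 1)(d - 1). The 3x3 kernel and the dilation rates below are hypothetical and are not taken from the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in pixels, along one axis) covered by one dilated convolution kernel."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    # A dilation of d inserts (d - 1) gaps between neighbouring kernel taps.
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Hypothetical dilation rates for three parallel branches (an assumption,
# not the configuration used in the paper).
branch_dilations = [1, 2, 3]
spans = {d: effective_kernel_size(3, d) for d in branch_dilations}
print(spans)
```

Larger dilation rates widen the area each branch sees without adding parameters, which is the usual motivation for running several such branches in parallel.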

Journal Article

A Review on Medical Textual Question Answering Systems Based on Deep Learning Approaches

by Tang, Guangyi , Mutabazi, Emmanuel , Ni, Jianjun in Computers , Deep learning , deep neural networks

2021

The advent of Question Answering Systems (QASs) has been envisaged as a promising solution and an efficient approach for retrieving significant information over the Internet. A considerable amount of research work has focused on open domain QASs based on deep learning techniques due to the availability of data sources. However, the medical domain receives less attention due to the shortage of medical datasets. Although Electronic Health Records (EHRs) are empowering the field of Medical Question-Answering (MQA) by providing medical information to answer user questions, the gap is still large in the medical domain, especially for text-based sources. Therefore, in this study, medical textual question-answering systems based on deep learning approaches were reviewed, and recent architectures of MQA systems were thoroughly explored. Furthermore, an in-depth analysis of deep learning approaches used in different MQA system tasks was provided. Finally, the critical challenges posed by MQA systems were highlighted, and recommendations for addressing them effectively in forthcoming MQA systems were given.

Journal Article

An Improved Attention-based Bidirectional LSTM Model for Cyanobacterial Bloom Prediction

by Tang, Guangyi , Xie, Yingjuan , Liu, Ruping in Artificial neural networks , Chlorophyll , Feature extraction

2022

Cyanobacterial blooms are one of the most serious water pollution problems for freshwater lakes. The treatment of blooms requires a lot of material and financial resources, so early and accurate prediction of cyanobacterial blooms is an important way to deal with outbreaks. However, it is challenging to predict cyanobacterial blooms due to the uncertainty and complexity of their growth process. To deal with this problem, an improved attention-based bidirectional long short-term memory (LSTM) model is proposed in this paper to make multistep predictions of chlorophyll-a concentration, which is a recognized indicator of algae activity. Firstly, a convolutional neural network (CNN) is used to extract data features and spatiotemporal correlations. Secondly, a bidirectional LSTM network (BiLSTM) is used to predict the concentration of chlorophyll-a based on the extracted features. Thirdly, an attention mechanism is used to calculate the weights for the characteristic factors that affect the chlorophyll-a concentration. Finally, experiments are carried out on real monitoring data from a platform in the Taihu Lake area. Compared with four other state-of-the-art deep learning methods, the proposed method has the highest prediction accuracy.

Journal Article

Collaborative Filtering Recommendation Algorithm Based on TF-IDF and User Characteristics

by Cai, Yu , Tang, Guangyi , Xie, Yingjuan in Accuracy , Algorithms , Cold

2021

The recommendation algorithm is a very important and challenging issue for a personal recommender system. The collaborative filtering recommendation algorithm is one of the most popular and effective recommendation algorithms. However, the traditional collaborative filtering recommendation algorithm does not fully consider the impact of popular items and user characteristics on the recommendation results. To solve these problems, an improved collaborative filtering algorithm is proposed, which is based on the Term Frequency-Inverse Document Frequency (TF-IDF) method and user characteristics. In the proposed algorithm, an improved TF-IDF method is first used to calculate the user similarity on the basis of rating data. Secondly, the multi-dimensional characteristic information of users is used to calculate a second user similarity by a fuzzy membership method. Then, the two user similarities are fused by an adaptive weighting algorithm. Finally, experiments are conducted on a public movie data set, and the results show that the proposed method performs better than state-of-the-art methods.
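The pipeline described in this abstract (a popularity-aware rating similarity, a profile-based similarity, and a weighted fusion of the two) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' method: the IDF-style item weight, the weighted cosine, the fusion rule `n / (n + k)`, and all names are hypothetical, and the fuzzy membership step is replaced by a plain profile-similarity number passed in by the caller.

```python
import math


def idf_weights(ratings):
    """Map each item to log(N_users / N_raters), so popular items count less.

    ratings: {user: {item: rating}}. The weighting is an assumed stand-in for
    the paper's improved TF-IDF, which the abstract does not spell out.
    """
    n_users = len(ratings)
    counts = {}
    for items in ratings.values():
        for item in items:
            counts[item] = counts.get(item, 0) + 1
    return {item: math.log(n_users / c) for item, c in counts.items()}


def weighted_cosine(a, b, w):
    """Cosine similarity of two rating dicts, with per-item weights w."""
    common = set(a) & set(b)
    num = sum(w[i] * a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(w[i] * a[i] ** 2 for i in a))
    norm_b = math.sqrt(sum(w[i] * b[i] ** 2 for i in b))
    return num / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fuse(sim_rating, sim_profile, n_common, k=5):
    """Hypothetical adaptive fusion: trust ratings more as co-rated items grow."""
    alpha = n_common / (n_common + k)
    return alpha * sim_rating + (1 - alpha) * sim_profile


# Toy data, invented for illustration.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "b": 3, "c": 2},
    "u3": {"a": 5, "c": 1},
}
w = idf_weights(ratings)
sim_12 = weighted_cosine(ratings["u1"], ratings["u2"], w)
fused_12 = fuse(sim_12, sim_profile=0.5, n_common=2)
```

With no co-rated items the fused score falls back entirely on the profile similarity, which is one simple way such a scheme could handle new users.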

Journal Article

An Improved Model for Medical Forum Question Classification Based on CNN and BiLSTM

by Tang, Guangyi , Mutabazi, Emmanuel , Ni, Jianjun in Accuracy , Analysis , bidirectional LSTM network

2023

Question Classification (QC) is the fundamental task for Question Answering Systems (QASs) implementation, and is a vital task, as it helps in identifying the question category. It plays a big role in predicting the answer to a question while building a QAS. However, classifying medical questions is still a challenging task due to the complexity of medical terms. Many researchers have proposed different techniques to solve these problems, but some of these problems remain partially solved or unsolved. With the help of deep learning technology, various text-processing problems have become much easier to solve. In this paper, an improved deep learning-based model for Medical Forum Question Classification (MFQC) is proposed to classify medical questions. In the proposed model, feature representation is performed using Word2Vec, which is a word embedding model. Additionally, the features are extracted from the word embedding layer based on Convolutional Neural Networks (CNNs). Finally, a Bidirectional Long Short Term Memory (BiLSTM) network is used to classify the extracted features. The BiLSTM model analyzes the target information of the representation and then outputs the question category via a SoftMax layer. Our model achieves state-of-the-art performance by effectively capturing semantic and syntactic features from the input questions. We evaluate the proposed CNN-BiLSTM model on two benchmark datasets and compare its performance with existing methods, demonstrating its superiority in accurately categorizing medical forum questions.

Journal Article

An Improved Visual SLAM Based on Map Point Reliability under Dynamic Environments

by Wang, Xiaotian , Tang, Guangyi , Ni, Jianjun in Algorithms , Cameras , Deep learning

2023

The visual simultaneous localization and mapping (SLAM) method under dynamic environments is a hot and challenging issue in the robotic field. The oriented FAST and Rotated BRIEF (ORB) SLAM algorithm is one of the most effective methods. However, the traditional ORB-SLAM algorithm cannot perform well in dynamic environments due to the feature points of dynamic map points at different timestamps being incorrectly matched. To deal with this problem, an improved visual SLAM method built on ORB-SLAM3 is proposed in this paper. In the proposed method, an improved new map points screening strategy and the repeated exiting map points elimination strategy are presented and combined to identify obvious dynamic map points. Then, a concept of map point reliability is introduced in the ORB-SLAM3 framework. Based on the proposed reliability calculation of the map points, a multi-period check strategy is used to identify the unobvious dynamic map points, which can further deal with the dynamic problem in visual SLAM, for those unobvious dynamic objects. Finally, various experiments are conducted on the challenging dynamic sequences of the TUM RGB-D dataset to evaluate the performance of our visual SLAM method. The experimental results demonstrate that our SLAM method can run at an average time of 17.51 ms per frame. Compared with ORB-SLAM3, the average RMSE of the absolute trajectory error (ATE) of the proposed method in nine dynamic sequences of the TUM RGB-D dataset can be reduced by 63.31%. Compared with the real-time dynamic SLAM methods, the proposed method can obtain state-of-the-art performance. The results prove that the proposed method is a real-time visual SLAM, which is effective in dynamic environments.
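For readers unfamiliar with the metric quoted above, the RMSE of the absolute trajectory error is the root mean square of the position differences between estimated and ground-truth poses. The sketch below assumes the two trajectories are already time-associated and aligned to a common frame (that alignment step is omitted), and the function names and sample values are illustrative only.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of translational error between paired, pre-aligned 3D positions."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline, improved):
    """Percentage by which `improved` lowers the error relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Invented sample trajectories, not data from the TUM RGB-D dataset.
est = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)]
rmse = ate_rmse(est, gt)
```

A figure such as the 63.31% quoted in the abstract would correspond to `relative_reduction` applied to errors averaged over the evaluated sequences.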

Journal Article

An Improved Transfer Learning Model for Cyanobacterial Bloom Concentration Prediction

by Tang, Guangyi , Liu, Ruping , Ni, Jianjun in Algae , Aquatic ecosystems , Biomass

2022

The outbreak of cyanobacterial blooms is a serious water environmental problem, and the harm it brings to aquatic ecosystems and water supply systems cannot be underestimated. It is very important to establish an accurate prediction model of cyanobacterial bloom concentration, which is a challenging issue. Machine learning techniques can improve the prediction accuracy, but a large amount of historical monitoring data is needed to train these models. For some waters with an inconvenient geographical location or frequent sensor failures, there are not enough historical data to train the model. To deal with this problem, a fused model based on a transfer learning method is proposed in this paper. In this study, data from a water environment with a large amount of historical monitoring data are taken as the source domain in order to learn the knowledge of cyanobacterial bloom growth characteristics and train the prediction model. Data from a water environment with a small amount of historical monitoring data are taken as the target domain, into which the model trained in the source domain is loaded. Then, the training set of the target domain is used in the inter-layer fine-tuning of the model to obtain the transfer learning model. Finally, the transfer learning model is fused with a convolutional neural network to obtain the prediction model. Various experiments are conducted for a 2 h prediction on the test set of the target domain. The results show that the proposed model can significantly improve the prediction accuracy of cyanobacterial blooms for water environments with a low data volume.

Journal Article

An improved cross-domain sequential recommendation model based on intra-domain and inter-domain contrastive learning

by Zhao, Yonghao , Tang, Guangyi , Ni, Jianjun in Accuracy , Algorithms , Artificial intelligence

2024

Cross-domain recommendation aims to integrate data from multiple domains and introduce information from source domains, thereby achieving good recommendations on the target domain. Recently, contrastive learning has been introduced into cross-domain recommendation and has yielded improved results. However, most cross-domain recommendation algorithms based on contrastive learning suffer from the bias problem. In addition, the correlation between the user’s single-domain and cross-domain preferences is not considered. To address these problems, a new recommendation model is proposed for cross-domain scenarios based on intra-domain and inter-domain contrastive learning, which aims to obtain unbiased user preferences in cross-domain scenarios and improve the recommendation performance of both domains. Firstly, a network enhancement module is proposed to capture users’ complete preferences by applying a graphical convolution and attentional aggregator. This module can reduce the limitations of only considering user preferences in a single domain. Secondly, a cross-domain infomax objective with noise contrast is presented to ensure that users’ single-domain and cross-domain preferences are correlated closely in sequential interactions. Thirdly, a joint training strategy is designed to improve the recommendation performance of the two domains, which can achieve unbiased cross-domain recommendation results. Finally, extensive experiments are conducted on two real-world cross-domain scenarios. The experimental results show that the proposed model achieves the best recommendation results in comparison with existing models.

Journal Article

An Improved Sequential Recommendation Algorithm based on Short-Sequence Enhancement and Temporal Self-Attention Mechanism

by Cai, Yu , Tang, Guangyi , Ni, Jianjun in Algorithms , Computational linguistics , Datasets

2022

Sequential recommendation algorithms can predict the next action of a user by modeling the user’s interaction sequence with items. However, most sequential recommendation models only consider the absolute positions of items in the sequence, ignoring the time interval information between items, and cannot effectively mine user preference changes. In addition, existing models perform poorly on sparse data sets, which leads to poor predictions for short sequences. To address the above problems, an improved sequential recommendation algorithm based on short-sequence enhancement and a temporal self-attention mechanism is proposed in this paper. In the proposed algorithm, a backward prediction model is trained first to predict the prior items in the user sequence. Then, this backward prediction model is used to generate a batch of pseudo-historical items before the initial items of the short sequence, thereby enhancing the short sequence. Finally, the absolute position information and time interval information of the user sequence are modeled, and a time-aware self-attention model is adopted to predict the user’s next action and generate a recommendation list. Various experiments are conducted on two public data sets. The experimental results show that the proposed method performs well on both dense and sparse data sets and outperforms state-of-the-art methods.
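The time interval information mentioned in this abstract is often fed to a self-attention model as a matrix of pairwise gaps between interactions. The sketch below shows one common way to build such a matrix (scale by the smallest gap, then clip); whether the paper uses this exact scheme is an assumption, and the function name, the clipping threshold, and the timestamps are hypothetical.

```python
def interval_matrix(timestamps, max_interval=4):
    """Pairwise time gaps, scaled by the smallest non-zero gap and clipped.

    timestamps: integer times of one user's interactions, in sequence order.
    Entry [i][j] is the bucketed gap between interactions i and j.
    """
    gaps = [abs(a - b) for a in timestamps for b in timestamps if a != b]
    unit = min(gaps) if gaps else 1  # avoid dividing by zero for trivial input
    return [
        [min(abs(a - b) // unit, max_interval) for b in timestamps]
        for a in timestamps
    ]


# Invented timestamps for a four-item sequence.
matrix = interval_matrix([0, 10, 30, 100])
for row in matrix:
    print(row)
```

Clipping keeps very long pauses from dominating, so the model only needs embeddings for a small, fixed number of interval buckets.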

Journal Article
