Catalogue Search | MBRL
10,836 result(s) for "object classification"
Deep Feature Fusion Based Dual Branch Network for X-ray Security Inspection Image Classification
2021
Automatic security inspection of X-ray scanned images is an inevitable trend in modern life. Aiming to address the difficulty of recognizing small prohibited items, and the potential class imbalance in multi-label object classification of X-ray scanned images, this paper proposes a dual branch network architecture based on deep feature fusion. Firstly, deep feature fusion is a method to fuse features extracted from several model layers. Specifically, it applies upsampling and dimension reduction to these features to match their sizes, then fuses them by element-wise summation. In addition, this paper introduces focal loss to handle class imbalance. To balance the importance of minority- and majority-class samples, it assigns weights to class predictions; to distinguish difficult samples from easy ones, it introduces a modulating factor. The dual branch network adopts the two components above and integrates them in the final loss calculation through a weighted sum. Experimental results illustrate that the proposed method outperforms the baseline and the state of the art by a large margin across various positive/negative ratios of the datasets. These results demonstrate the competitiveness of the proposed method in classification performance and its potential for application in real-world settings.
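The abstract names two ingredients: element-wise feature fusion and focal loss. A minimal sketch of the focal-loss idea, assuming the standard binary formulation (the paper's exact weights and multi-label setup are not given in the abstract):

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class.
    y: ground-truth label (0 or 1).
    alpha weights minority vs. majority class; the (1 - p_t)**gamma
    modulating factor down-weights easy, well-classified samples.
    """
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# An easy sample (p_t = 0.9) contributes far less than a hard one (p_t = 0.1).
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With gamma = 0 the modulating factor disappears and the loss reduces to alpha-weighted cross entropy, which is the class-weighting component on its own.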
Journal Article
A Review of Machine Learning and Deep Learning for Object Detection, Semantic Segmentation, and Human Action Recognition in Machine and Robotic Vision
by
Manakitsa, Nikoleta
,
Fragulis, George F.
,
Maraslidis, George S.
in
Algorithms
,
Artificial intelligence
,
Autonomous vehicles
2024
Machine vision, an interdisciplinary field that aims to replicate human visual perception in computers, has experienced rapid progress and significant contributions. This paper traces the origins of machine vision, from early image processing algorithms to its convergence with computer science, mathematics, and robotics, resulting in a distinct branch of artificial intelligence. The integration of machine learning techniques, particularly deep learning, has driven its growth and adoption in everyday devices. This study focuses on the objectives of computer vision systems: replicating human visual capabilities including recognition, comprehension, and interpretation. Notably, image classification, object detection, and image segmentation are crucial tasks requiring robust mathematical foundations. Despite the advancements, challenges persist, such as clarifying terminology related to artificial intelligence, machine learning, and deep learning. Precise definitions and interpretations are vital for establishing a solid research foundation. The evolution of machine vision reflects an ambitious journey to emulate human visual perception. Interdisciplinary collaboration and the integration of deep learning techniques have propelled remarkable advancements in emulating human behavior and perception. Through this research, the field of machine vision continues to shape the future of computer systems and artificial intelligence applications.
Journal Article
Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification
by
Rueckauer, Bodo
,
Lungu, Iulia-Alexandra
,
Pfeiffer, Michael
in
artificial neural network
,
Biochips
,
Classification
2017
Spiking neural networks (SNNs) can potentially offer an efficient way of doing inference because the neurons in the networks are sparsely activated and computations are event-driven. Previous work showed that simple continuous-valued deep Convolutional Neural Networks (CNNs) can be converted into accurate spiking equivalents. These converted networks, however, did not include certain common operations such as max-pooling, softmax, batch normalization and Inception modules. This paper presents spiking equivalents of these operations, thereby allowing the conversion of nearly arbitrary CNN architectures. We show conversion of popular CNN architectures, including VGG-16 and Inception-v3, into SNNs that produce the best results reported to date on MNIST, CIFAR-10 and the challenging ImageNet dataset. SNNs can trade off classification error rate against the number of available operations, whereas deep continuous-valued neural networks require a fixed number of operations to achieve their classification error rate. From the examples of LeNet for MNIST and BinaryNet for CIFAR-10, we show that with an increase in error rate of a few percentage points, the SNNs can achieve more than 2x reductions in operations compared to the original CNNs. This highlights the potential of SNNs, particularly when deployed on power-efficient neuromorphic spiking neuron chips, for use in embedded applications.
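The rate-based conversion idea can be illustrated with a single integrate-and-fire neuron: driven by a constant input, its firing rate over many timesteps approximates a ReLU activation. A toy sketch of that correspondence (not the paper's full conversion pipeline):

```python
def if_neuron_rate(a, T=1000, threshold=1.0):
    """Simulate an integrate-and-fire neuron driven by constant input `a`
    for T timesteps; returns the firing rate (spikes / T).

    With reset-by-subtraction, the rate approximates ReLU(a) for 0 <= a <= 1:
    negative inputs never reach threshold, positive inputs spike proportionally.
    """
    v, spikes = 0.0, 0
    for _ in range(T):
        v += a                    # integrate the input current
        if v >= threshold:
            v -= threshold        # reset by subtraction keeps residual charge
            spikes += 1
    return spikes / T

rate = if_neuron_rate(0.3)        # close to ReLU(0.3) = 0.3
```

The trade-off mentioned in the abstract follows directly: a shorter simulation window T means fewer operations but a coarser approximation of the original activation.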
Journal Article
Long-Time Interval Satellite Image Analysis on Forest-Cover Changes and Disturbances around Protected Area, Zeya State Nature Reserve, in the Russian Far East
by
Borisova, Irina G.
,
Seino, Tatsuyuki
,
Khatancharoen, Chulabush
in
Air pollution
,
Amur region
,
Betula
2021
Boreal forest areas in the Russian Far East contain very large intact forests. This particular area is considered one of the most productive and diverse forests in the boreal biome of the world, and it is also home to many endangered species. Zeya State Nature Reserve is located at the southern margin of the boreal forest area in the Russian Far East and has rich fauna and flora. However, the forest in the region has recently faced large-scale forest fires and clearcutting for timber, and information on these disturbances is scarce. This study aimed to explore the effects of disturbance and forest dynamics around the reserve. Our study used two-date overlaid Landsat images from the Landsat 5 Thematic Mapper (TM) and Landsat 8 Operational Land Imager (OLI) to generate forest-cover-change maps for 1988–1999, 1999–2010, and 2010–2016. In this paper, we analyze the direction of forest successional stages to demonstrate the effectiveness of this protected area in preventing human-caused deforestation, based on vegetation indices. The vegetation indices included the normalized burn ratio (NBR), the normalized difference vegetation index (NDVI), and the normalized difference water index (NDWI). The study provided information on the pattern of forest-cover change and disturbance area around the reserve. The NDWI was used to differentiate between water and non-water areas. The mean values of NBR and NDVI were calculated to determine the forest successional stages among burn, vegetation recovery, grass, mixed forest, oak forest, and birch and larch forest. The accuracy was assessed by using field measurements, field photos, and high-resolution images as references. Overall, our classification results have high accuracy for all three periods. The largest disturbed area occurred during 2010–2016. The reserve was highly protected, with no human-disturbance activity. However, a large area of fire disturbance (137 km2) was found during 1999–2010.
The findings also show a large area of disturbance located mostly outside of the reserve. Mixed disturbance increased to almost 50 km2 during 2010–2016 in the buffer zone and outside of the reserve. We recommend that future work apply our methods to other ecosystems, to compare forest dynamics and disturbance inside and outside protected areas.
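The three indices named in the abstract are standard normalized band differences; a minimal sketch using the usual band combinations (assumed here, since the abstract does not list the exact bands used):

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: high for healthy vegetation."""
    return (nir - red) / (nir + red)

def nbr(nir, swir):
    """Normalized burn ratio: drops sharply after fire."""
    return (nir - swir) / (nir + swir)

def ndwi(green, nir):
    """Normalized difference water index (McFeeters form, assumed):
    positive for open water, negative for vegetation and soil."""
    return (green - nir) / (green + nir)

# Hypothetical surface-reflectance values for two pixels.
veg_ndvi = ndvi(nir=0.5, red=0.1)         # vegetated pixel: high NDVI
water_ndwi = ndwi(green=0.3, nir=0.05)    # water pixel: positive NDWI
```

Thresholding NDWI separates water from non-water areas, after which mean NBR and NDVI can be compared per class to assign the successional stages described above.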
Journal Article
Segment-before-Detect: Vehicle Detection and Classification through Semantic Segmentation of Aerial Images
by
Le Saux, Bertrand
,
Lefèvre, Sébastien
,
Audebert, Nicolas
in
Aversion learning
,
Classification
,
Computer Science
2017
Like computer vision before it, remote sensing has been radically changed by the introduction of deep learning and, most notably, Convolutional Neural Networks. Land cover classification, object detection and scene understanding in aerial images rely more and more on deep networks to achieve new state-of-the-art results. Recent architectures such as Fully Convolutional Networks can even produce pixel-level annotations for semantic mapping. In this work, we present a deep-learning-based segment-before-detect method for segmentation and subsequent detection and classification of several varieties of wheeled vehicles in high-resolution remote sensing images. This allows us to investigate object detection and classification on a complex dataset made up of visually similar classes, and to demonstrate the relevance of such a subclass modeling approach. In particular, we want to show that deep learning is also suitable for object-oriented analysis of Earth Observation data, as effective object detection can be obtained as a byproduct of accurate semantic segmentation. First, we train a deep fully convolutional network on the ISPRS Potsdam and the NZAM/ONERA Christchurch datasets and show how the learnt semantic maps can be used to extract precise segmentations of vehicles. Then, we show that those maps are accurate enough to perform vehicle detection by simple connected component extraction. This allows us to study the distribution of vehicles in the city. Finally, we train a Convolutional Neural Network to perform vehicle classification on the VEDAI dataset, and transfer its knowledge to classify the individual vehicle instances that we detected.
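Turning a binary "vehicle" semantic map into individual detections by connected component extraction can be sketched as a 4-connected flood fill (a toy illustration of the step the abstract describes, not the paper's implementation):

```python
import numpy as np
from collections import deque

def connected_components(mask):
    """Label 4-connected components in a binary mask.

    Returns (labels, count): labels has the same shape as mask, with 0 for
    background and 1..count for each component (one per detected object).
    """
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    count = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and labels[i, j] == 0:
                count += 1                      # start a new component
                labels[i, j] = count
                queue = deque([(i, j)])
                while queue:                    # breadth-first flood fill
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = count
                            queue.append((ny, nx))
    return labels, count

# A tiny semantic map with two separate "vehicle" blobs.
mask = np.array([[1, 1, 0, 0],
                 [0, 1, 0, 1],
                 [0, 0, 0, 1]])
labels, n = connected_components(mask)
```

Each labeled component can then be cropped and passed to the subclass classifier, which is exactly the segment-before-detect ordering the title refers to.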
Journal Article
Unveiling vulnerabilities: evading YOLOv5 object detection through adversarial perturbations and steganography
by
Garg, Urvashi
,
Sharma, Gauri
in
Artificial neural networks
,
Classification
,
Computer Communication Networks
2024
In the realm of machine learning, a discernible surge in research has been observed, focusing on the development of adversarial perturbations with the intent to subvert the capabilities of Deep Neural Networks (DNNs), particularly in the context of object detection and classification. Despite the availability of cutting-edge systems such as the widely acclaimed You Only Look Once (YOLO)v5 model, renowned for its swift image and video classification and detection prowess, our research takes a distinctive course, exposing the weaknesses of this detection model and how easily it can be manipulated. This paper seeks to highlight the weaknesses of one of the most advanced neural networks when subjected to carefully crafted adversarial attacks. Our method entails intentionally inserting adversarial perturbations into photos via image-in-image steganography, a technique that is essentially imperceptible to the human eye yet capable of significantly lowering YOLOv5’s confidence levels. This approach was carefully evaluated on a Magnetic Resonance Imaging (MRI) dataset containing around 1100 brain images. A comparison between regular images and steganographically encoded images revealed a substantial decrease in precision, plummeting from 0.711 to a mere 0.0346.
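Image-in-image steganography is commonly done by hiding the high-order bits of a secret image in the low-order bits of a cover image. A minimal least-significant-bit sketch, as an illustrative assumption only (the abstract does not specify the paper's embedding scheme):

```python
import numpy as np

def embed(cover, secret, bits=2):
    """Hide the top `bits` bits of `secret` in the low `bits` of `cover`.

    Both arrays are uint8 images of the same shape. The change to the cover
    is at most 2**bits - 1 per pixel, which is visually imperceptible.
    """
    keep = (0xFF << bits) & 0xFF              # mask keeping cover's high bits
    return (cover & keep) | (secret >> (8 - bits))

def extract(stego, bits=2):
    """Recover the hidden image's high bits (low bits are lost)."""
    return (stego & ((1 << bits) - 1)) << (8 - bits)

cover = np.array([[200, 120]], dtype=np.uint8)
secret = np.array([[64, 255]], dtype=np.uint8)
stego = embed(cover, secret)
recovered = extract(stego)
```

The recovery is lossy (only the top `bits` bits survive), but the stego image differs from the cover by at most a few intensity levels per pixel, which is what makes such perturbations hard to spot by eye.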
Journal Article
Application of Deep Learning on Millimeter-Wave Radar Signals: A Review
2021
The progress brought by the deep learning technology over the last decade has inspired many research domains, such as radar signal processing, speech and audio recognition, etc., to apply it to their respective problems. Most of the prominent deep learning models exploit data representations acquired with either Lidar or camera sensors, leaving automotive radars rarely used. This is despite the vital potential of radars in adverse weather conditions, as well as their ability to simultaneously measure an object’s range and radial velocity seamlessly. As radar signals have not been exploited very much so far, there is a lack of available benchmark data. However, recently, there has been a lot of interest in applying radar data as input to various deep learning algorithms, as more datasets are being provided. To this end, this paper presents a survey of various deep learning approaches processing radar signals to accomplish some significant tasks in an autonomous driving application, such as detection and classification. We have itemized the review based on different radar signal representations, as it is one of the critical aspects while using radar data with deep learning models. Furthermore, we give an extensive review of the recent deep learning-based multi-sensor fusion models exploiting radar signals and camera images for object detection tasks. We then provide a summary of the available datasets containing radar data. Finally, we discuss the gaps and important innovations in the reviewed papers and highlight some possible future research prospects.
Journal Article
A Survey on Deep Learning Based Segmentation, Detection and Classification for 3D Point Clouds
by
Avots, Egils
,
Anbarjafari, Gholamreza
,
Karabulut, Dogus
in
3D object classification
,
3D object detection
,
3D object recognition
2023
The computer vision, graphics, and machine learning research groups have given a significant amount of focus to 3D object recognition (segmentation, detection, and classification). Deep learning approaches have lately emerged as the preferred method for 3D segmentation problems as a result of their outstanding performance in 2D computer vision. As a result, many innovative approaches have been proposed and validated on multiple benchmark datasets. This study offers an in-depth assessment of the latest developments in deep learning-based 3D object recognition. We discuss the most well-known 3D object recognition models, along with evaluations of their distinctive qualities.
Journal Article
Melanoma diagnosis using deep learning techniques on dermatoscopic images
by
Garcia-Zapirain, Maria Begonya
,
Percybrooks, Winston Spencer
,
Jojoa Acosta, Mario Fernando
in
Algorithms
,
Artificial intelligence
,
Artificial neural networks
2021
Background
Melanoma has become more widespread over the past 30 years and early detection is a major factor in reducing mortality rates associated with this type of skin cancer. Therefore, having access to an automatic, reliable system that is able to detect the presence of melanoma via a dermatoscopic image of lesions and/or skin pigmentation can be a very useful tool in the area of medical diagnosis.
Methods
Among state-of-the-art methods used for automated or computer-assisted medical diagnosis, attention should be drawn to Deep Learning based on Convolutional Neural Networks, with which segmentation, classification and detection systems for several diseases have been implemented. The method proposed in this paper involves an initial stage that automatically crops the region of interest within a dermatoscopic image using the Mask and Region-based Convolutional Neural Network technique, and a second stage based on a ResNet152 structure, which classifies lesions as either “benign” or “malignant”.
Results
Training, validation and testing of the proposed model was carried out using the database associated with the challenge set out at the 2017 International Symposium on Biomedical Imaging. On the test data set, the proposed model achieves an increase in accuracy and balanced accuracy of 3.66% and 9.96%, respectively, with respect to the best accuracy and the best sensitivity/specificity ratio reported to date for melanoma detection in this challenge. Additionally, unlike previous models, the specificity and sensitivity simultaneously achieve high scores (greater than 0.8), which indicates that the model accurately discriminates between benign and malignant lesions without being biased towards either class.
Conclusions
The results achieved with the proposed model suggest a significant improvement over the results obtained in the state of the art as far as performance of skin lesion classifiers (malignant/benign) is concerned.
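The accuracy, balanced accuracy, and sensitivity/specificity figures above follow the standard confusion-matrix definitions; a minimal sketch with hypothetical counts (not the paper's actual test-set numbers):

```python
def metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # true-positive rate (malignant found)
    specificity = tn / (tn + fp)              # true-negative rate (benign found)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, accuracy, balanced_accuracy

# On a class-imbalanced test set, plain accuracy can look strong while
# balanced accuracy exposes a weak minority-class (sensitivity) score.
sens, spec, acc, bacc = metrics(tp=10, fp=5, tn=80, fn=5)
```

This is why the abstract reports balanced accuracy alongside accuracy: with far more benign than malignant lesions, the two can diverge substantially.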
Journal Article
Deep models for multi-view 3D object recognition: a review
by
Jarraya, Salma Kammoun
,
Usman, Muhammad
,
Anwar, Saeed
in
Acknowledgment
,
Artificial Intelligence
,
Artificial neural networks
2024
This review paper focuses on the progress of deep learning-based methods for multi-view 3D object recognition. It covers the state-of-the-art techniques in this field, specifically those that utilize 3D multi-view data as the input representation. The paper provides a comprehensive analysis of the pipeline for deep learning-based multi-view 3D object recognition, including the various techniques employed at each stage. It also presents the latest developments in CNN-based and transformer-based models for multi-view 3D object recognition. The review discusses existing models in detail, including the datasets, camera configurations, view selection strategies, pre-trained CNN architectures, fusion strategies, and recognition performance. Additionally, it examines various computer vision applications that use multi-view classification. Finally, it highlights future directions, factors impacting recognition performance, and trends for the development of multi-view 3D object recognition methods.
Journal Article