Catalogue Search | MBRL
60,762 result(s) for "image recognition"
Deep Residual Learning for Image Recognition: A Survey
2022
Deep Residual Networks have recently been shown to significantly improve the performance of neural networks trained on ImageNet, with results beating all previous methods on this dataset by large margins in the image classification task. However, the meaning of these impressive numbers and their implications for future research are not fully understood yet. In this survey, we will try to explain what Deep Residual Networks are, how they achieve their excellent results, and why their successful implementation in practice represents a significant advance over existing techniques. We also discuss some open questions related to residual learning as well as possible applications of Deep Residual Networks beyond ImageNet. Finally, we discuss some issues that still need to be resolved before deep residual learning can be applied on more complex problems.
Journal Article
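The core idea surveyed above, learning a residual function F(x) and adding it back to the input through a skip connection, can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's implementation; the dense two-layer form of F and the layer sizes are assumptions made for brevity.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """y = relu(x + F(x)), where F is a small two-layer transform.

    The identity shortcut lets the block default to a (near-)identity
    mapping, which is what makes very deep stacks trainable.
    """
    f = relu(x @ w1) @ w2   # residual function F(x)
    return relu(x + f)      # identity shortcut plus residual

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
# With zero weights, F(x) = 0 and the block reduces to relu(x):
# the easy-to-learn identity mapping the survey discusses.
y = residual_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
```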
CNN-RNN based method for license plate recognition
by Lu, Tong; Tang, Dongqi; Asadzadehkaljahi, Maryam
in (B6135E) Image recognition; (C5260B) Computer vision and image processing techniques; (C5290) Neural computing techniques
2018
Achieving good recognition results for license plates is challenging due to multiple adverse factors. For instance, in Malaysia, private vehicles (e.g., cars) have number plates with a dark background, while public vehicles (taxis/cabs) have number plates with a white background. To reduce the complexity of the problem, we propose to classify these two types of images so that an appropriate method can be chosen to achieve better results. Therefore, in this work, we explore the combination of Convolutional Neural Networks (CNN) and a Recurrent Neural Network, namely BLSTM (Bi-Directional Long Short-Term Memory), for recognition. The CNN is used for feature extraction because of its high discriminative ability, while the BLSTM extracts context information based on past information. For classification, we propose Dense Cluster based Voting (DCV), which separates foreground and background for successful classification of private and public plates. Experimental results on live data provided by MIMOS, which is funded by the Malaysian Government, and on the standard UCSD dataset show that the proposed classification outperforms existing methods. In addition, the recognition results show that recognition performance improves significantly after classification compared to before classification.
Journal Article
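The CNN-plus-BLSTM pairing described in this abstract combines per-step features with context from both directions of the sequence. A minimal NumPy sketch of the bidirectional recurrence is below; plain tanh cells stand in for LSTM cells, and all shapes and weights are illustrative assumptions, not the authors' architecture.

```python
import numpy as np

def bidirectional_context(seq, w_f, w_b, u):
    """Minimal bidirectional recurrence over a feature sequence.

    Each step's output concatenates a forward pass (past context) and a
    backward pass (future context) -- the idea behind a BLSTM layer.
    """
    t_steps, _ = seq.shape
    h = w_f.shape[1]                       # hidden size
    fwd = np.zeros((t_steps, h))
    bwd = np.zeros((t_steps, h))
    h_prev = np.zeros(h)
    for t in range(t_steps):               # left-to-right pass
        h_prev = np.tanh(seq[t] @ w_f + h_prev @ u)
        fwd[t] = h_prev
    h_next = np.zeros(h)
    for t in reversed(range(t_steps)):     # right-to-left pass
        h_next = np.tanh(seq[t] @ w_b + h_next @ u)
        bwd[t] = h_next
    return np.concatenate([fwd, bwd], axis=1)   # shape (t_steps, 2*h)

rng = np.random.default_rng(1)
feats = rng.standard_normal((5, 16))       # e.g. 5 CNN feature columns
ctx = bidirectional_context(feats,
                            rng.standard_normal((16, 8)),
                            rng.standard_normal((16, 8)),
                            rng.standard_normal((8, 8)))
```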
Loop and distillation: Attention weights fusion transformer for fine‐grained representation
by Meng, Zuqiang; Sek, Yong Wee; Fayou, Sun
in computer vision; fine‐grained image recognition; Image classification
2023
Learning subtle discriminative feature representations plays a significant role in Fine‐Grained Visual Categorisation (FGVC). The vision transformer (ViT) achieves promising performance in the traditional image classification field due to its multi‐head self‐attention mechanism. Unfortunately, ViT cannot effectively capture critical feature regions for FGVC because it focuses only on the classification token and adopts a one‐time image input strategy. Besides, the advantage of attention weight fusion is not applied in ViT. To improve the capture of vital regions for FGVC, the authors propose a novel model named RDTrans, which proposes the discriminative region with top priority in a recurrent learning way. Specifically, the proposed vital regions at each scale are cropped and amplified as the next input to finally locate the most discriminative region. Furthermore, a distillation learning method is employed to provide better supervision for elevating the generalisation ability. Concurrently, RDTrans can be easily trained end‐to‐end in a weakly‐supervised learning way. Extensive experiments demonstrate that RDTrans yields state‐of‐the‐art performance on four widely used fine‐grained benchmarks: CUB‐200‐2011, Stanford Cars, Stanford Dogs, and iNat2017. We fuse attention weights grouped by head to reinforce the attention of different regions. Subsequently, we adopt three attention weight fusion blocks and channel grouping methods to accurately select the discriminative region. In addition, we utilise a distillation learning method to transfer knowledge learned by a CNN from the object to region proposal. Finally, we gradually refine the discriminative regions in a recurrent way.
Journal Article
Automatic target recognition of synthetic aperture radar (SAR) images based on optimal selection of Zernike moments features
by Amoon, Mehdi; Rezai-rad, Gholam-ali
in Applied sciences; automatic target detection; automatic target recognition applications
2014
In the present study, a new algorithm for automatic target recognition (ATR) in synthetic aperture radar (SAR) images is proposed. First, moving and stationary target acquisition and recognition image chips are segmented and then passed through a number of preprocessing stages such as histogram equalisation and position and size normalisation. Second, feature extraction based on Zernike moments (ZMs), which have linear transformation invariance properties and robustness in the presence of noise, is introduced for the first time. Third, genetic algorithm-based feature selection and a support vector machine classifier are presented to select the optimal feature subset of ZMs and decrease the computational complexity. Experimental results demonstrate the efficiency of the proposed approach in target recognition of SAR imagery. The obtained results show that only a small number of ZM features is sufficient to achieve recognition rates that rival other established methods, so ZM features can be regarded as a powerful discriminatory feature for automatic target recognition applications relevant to SAR imagery. Furthermore, the classifier performs fairly well until the signal-to-noise ratio falls beneath 5 dB for noisy images.
Journal Article
Attention Guided Food Recognition via Multi-Stage Local Feature Fusion
2024
The task of food image recognition, a nuanced subset of fine-grained image recognition, grapples with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. Addressing these complexities, our study introduces an advanced model that leverages multiple attention mechanisms and multi-stage local fusion, grounded in the ConvNeXt architecture. Our model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially mitigating the influence of background noise. It also introduces a multi-stage local fusion (MSLF) module, fostering long-distance dependencies between feature maps at varying stages. This approach facilitates the assimilation of complementary features across scales, significantly bolstering the model’s capacity for feature extraction. In addition, we constructed a dataset named Roushi60, which consists of 60 different categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets reveals that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures not only mark an improvement of 1.04%, 3.42%, and 1.36% over the foundational ConvNeXt network but also surpass the performance of most contemporary food image recognition methods. Such advancements underscore the efficacy of our proposed model in navigating the intricate landscape of food image recognition, setting a new benchmark for the field.
Journal Article
Inherent limitation of digital imagery: spatial-phase vacillations and the ambiguity function
2020
The spatial relationship of a scene image upon a pixelated focal-plane array can temporally change due to scene or camera motion, microjitter of the camera line-of-sight, etc. It was found that the degradation caused by subpixel spatial-phase vacillations (SPV) upon the performance of matched filters was unexpectedly significant. Subpixel spatial-phase vacillations can degrade a matched filter's ability to detect desired objects and discriminate other objects. SPV appear to be an inherent limitation of digital imagery when processed using matched filter methodology and can negatively impact system performance. Mitigation of this degradation was found to be possible by utilizing one of several matched filter constructions such as a multi-filter enhanced matched filter (EMF) bank and a single-filter EMF. A significant conclusion of this investigation is that, for automatic target recognition applications, improved overall task performance should be realized by the use of an EMF bank. A dramatic reduction in computational resources for image matching of multispectral imagery was accomplished by spectrally collapsing multispectral imagery to form pseudoimages, with the spectral information appearing as a texture in a grayscale image. The pattern recognition ambiguity function, which sets the fundamental performance limit of an image processing system, is introduced.
Journal Article
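A matched filter detects a known template by cross-correlating it with the observed data; the subpixel spatial-phase vacillations discussed above degrade the correlation peak when the target is not aligned with the pixel grid. A minimal 1-D sketch follows (NumPy; the signal sizes, noise level, and sinusoidal template are illustrative assumptions, not the paper's setup).

```python
import numpy as np

def matched_filter_score(signal, template):
    """Cross-correlate a zero-mean template against a signal and return
    the offset and height of the strongest response (the detection)."""
    t = template - template.mean()
    scores = np.correlate(signal, t, mode="valid")
    peak = int(np.argmax(scores))
    return peak, scores[peak]

# Plant a copy of the template at offset 20 inside weak noise.
rng = np.random.default_rng(2)
template = np.sin(np.linspace(0.0, 4.0 * np.pi, 32))
signal = 0.1 * rng.standard_normal(128)
signal[20:52] += template
loc, score = matched_filter_score(signal, template)
```

Shifting the planted copy by a fraction of a sample (the SPV effect) lowers the correlation peak even though the target is still present, which is the degradation the article quantifies.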
Circulating tumor cell detection and single‐cell analysis using an integrated workflow based on ChimeraX®‐i120 Platform: A prospective study
2021
We developed an integrated workflow for circulating tumor cell (CTC) detection and downstream single‐cell analysis based on a novel ChimeraX®‐i120 platform. The platform facilitates negative enrichment, immunofluorescent labeling, and machine learning‐based identification of CTCs. The CTC captured by the platform is also compatible for single‐cell molecular analysis. In this study, potential utility of our workflow was validated in clinical setting. Circulating tumor cell (CTC) analysis holds great potential to be a noninvasive solution for clinical cancer management. A complete workflow that combined CTC detection and single‐cell molecular analysis is required. We developed the ChimeraX®‐i120 platform to facilitate negative enrichment, immunofluorescent labeling, and machine learning‐based identification of CTCs. Analytical performances were evaluated, and a total of 477 participants were enrolled to validate the clinical feasibility of ChimeraX®‐i120 CTC detection. We analyzed copy number alteration profiles of isolated single cells. The ChimeraX®‐i120 platform had high sensitivity, accuracy, and reproducibility for CTC detection. In clinical samples, an average value of > 60% CTC‐positive rate was found for five cancer types (i.e., liver, biliary duct, breast, colorectal, and lung), while CTCs were rarely identified in blood from healthy donors. In hepatocellular carcinoma patients treated with curative resection, CTC status was significantly associated with tumor characteristics, prognosis, and treatment response (all P < 0.05). Single‐cell sequencing analysis revealed that heterogeneous genomic alteration patterns resided in different cells, patients, and cancers. Our results suggest that the use of this ChimeraX®‐i120 platform and the integrated workflow has validity as a tool for CTC detection and downstream genomic profiling in the clinical setting.
Journal Article
Cross-Cultural Comparison of Urban Green Space through Crowdsourced Big Data: A Natural Language Processing and Image Recognition Approach
2023
Understanding the relationship between environmental features and perceptions of urban green spaces (UGS) is crucial for UGS design and management. However, quantifying park perceptions on a large spatial and temporal scale is challenging, and it remains unclear which environmental features lead to different perceptions in cross-cultural comparisons. This study addressed this issue by collecting 11,782 valid social media comments and photos covering 36 UGSs from 2020 to 2022 using a Python 3.6-based crawler. Natural language processing and image recognition methods from Google were then utilized to quantify UGS perceptions. This study obtained 32 high-frequency feature words through sentiment analysis and quantified 17 environmental feature factors that emerged using object and scene recognition techniques for photos. The results show that users generally perceive Japanese UGSs as more positive than Chinese UGSs. Chinese UGS users prioritize plant green design and UGS user density, whereas Japanese UGS focuses on integrating specific cultural elements. Therefore, when designing and managing urban greenspace systems, local environmental and cultural characteristics must be considered to meet the needs of residents and visitors. This study offers a replicable and systematic approach for researchers investigating the utilization of UGS on a global scale.
Journal Article
Group Normalization
2020
Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train. However, normalizing along the batch dimension introduces problems—BN’s error increases rapidly when the batch size becomes smaller, caused by inaccurate batch statistics estimation. This limits BN’s usage for training larger models and transferring features to computer vision tasks including detection, segmentation, and video, which require small batches constrained by memory consumption. In this paper, we present Group Normalization (GN) as a simple alternative to BN. GN divides the channels into groups and computes within each group the mean and variance for normalization. GN’s computation is independent of batch sizes, and its accuracy is stable across a wide range of batch sizes. On ResNet-50 trained on ImageNet, GN has 10.6% lower error than its BN counterpart when using a batch size of 2; when using typical batch sizes, GN is comparably good with BN and outperforms other normalization variants. Moreover, GN can be naturally transferred from pre-training to fine-tuning. GN can outperform its BN-based counterparts for object detection and segmentation in COCO (https://github.com/facebookresearch/Detectron/blob/master/projects/GN), and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks. GN can be easily implemented by a few lines of code in modern libraries.
Journal Article
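The grouping described in the abstract is straightforward to reproduce: reshape the channels into G groups and normalize each group with its own per-sample mean and variance, independent of the batch. A minimal NumPy sketch is below (NCHW layout assumed; the learnable scale and shift parameters of the full method are omitted for brevity, and the paper's reference code lives in the linked Detectron repository).

```python
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    """Group Normalization for an NCHW tensor.

    Statistics are computed per sample over (channels-in-group, H, W),
    so the result does not depend on the batch size -- the property
    that distinguishes GN from Batch Normalization.
    """
    n, c, h, w = x.shape
    assert c % num_groups == 0
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    g = (g - mean) / np.sqrt(var + eps)
    return g.reshape(n, c, h, w)

rng = np.random.default_rng(3)
x = rng.standard_normal((2, 8, 4, 4))
y = group_norm(x, num_groups=4)
```

Because the statistics are per-sample, normalizing one example alone gives the same result as normalizing it inside a batch, which is the batch-size independence the abstract highlights.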