Catalogue Search | MBRL

Image super-resolution reconstruction based on feature map attention mechanism

by Xia Runlong, Xie Jingbo, Gu Ke in Algorithms, Deep learning, Feature extraction

2021

To address the problem that existing image super-resolution reconstruction methods treat the low-frequency and high-frequency components of feature maps equally, the paper proposes a reconstruction method that applies an attention mechanism to feature maps, reconstructing multi-scale super-resolution images from the original low-resolution images. The proposed model consists of a feature extraction block, information extraction blocks, and a reconstruction module. First, the feature extraction block extracts useful features from low-resolution images; multiple information extraction blocks are combined with the feature map attention mechanism, and information is passed between feature channels. Second, the interdependence between channels is used to adaptively adjust the channel features and restore more detail. Finally, the reconstruction module reconstructs high-resolution images at different scales. Experimental results show that the proposed method improves both the visual quality of the images and the results on the Set5, Set14, Urban100, and Manga109 datasets. The results also indicate that the proposed method has structural similarity to existing image reconstruction methods. Furthermore, the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) evaluation metrics improve to a certain degree, indicating that the feature map attention mechanism is effective in image super-resolution reconstruction applications.
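The abstract does not specify how its attention mechanism is built. As a rough illustration of the general idea of reweighting feature channels by their interdependence, the following is a minimal squeeze-and-excitation style sketch in plain Python. The function name `channel_attention`, the single gating layer, and the placeholder weights are all assumptions made for illustration, not the authors' architecture.

```python
import math

def channel_attention(fmaps, weights, biases):
    """Rescale each channel of `fmaps` by a gate in (0, 1).

    fmaps:   list of C feature maps, each a list of rows (H x W floats)
    weights: C x C matrix of a hypothetical gating layer
    biases:  length-C bias vector of that gating layer
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(map(sum, fm)) / (len(fm) * len(fm[0])) for fm in fmaps]
    # Excite: each gate depends on every channel's descriptor,
    # which is how inter-channel dependence enters.
    gates = []
    for w_row, b in zip(weights, biases):
        z = sum(w * s for w, s in zip(w_row, pooled)) + b
        gates.append(1.0 / (1.0 + math.exp(-z)))
    # Scale: multiply every activation in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in fm] for fm, g in zip(fmaps, gates)]
    return scaled, gates

fmaps = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 8.0]]]
weights = [[0.0, 0.0], [0.0, 0.0]]  # placeholder values, not trained
biases = [0.0, 0.0]
scaled, gates = channel_attention(fmaps, weights, biases)
```

With all-zero placeholder weights every gate is 0.5, so each channel is simply halved; in a trained network the gates would differ per channel and per input.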

Journal Article

A RANDOM MATRIX APPROACH TO NEURAL NETWORKS

by Liao, Zhenyu, Couillet, Romain, Louart, Cosme in Continuity (mathematics), Covariance matrix, Empirical analysis

2018

This article studies the Gram random matrix model $G = \frac{1}{T}\Sigma^{\mathsf{T}}\Sigma$, $\Sigma = \sigma(WX)$, classically found in the analysis of random feature maps and random neural networks, where $X = [x_1, \ldots, x_T] \in \mathbb{R}^{p \times T}$ is a (data) matrix of bounded norm, $W \in \mathbb{R}^{n \times p}$ is a matrix of independent zero-mean unit-variance entries, and $\sigma : \mathbb{R} \to \mathbb{R}$ is a Lipschitz continuous (activation) function, with $\sigma(WX)$ understood entry-wise. By means of a key concentration of measure lemma arising from nonasymptotic random matrix arguments, we prove that, as $n, p, T$ grow large at the same rate, the resolvent $Q = (G + \gamma I_T)^{-1}$, for $\gamma > 0$, behaves similarly to that met in sample covariance matrix models, involving notably the moment $\Phi = \frac{T}{n}\mathbb{E}[G]$, which provides in passing a deterministic equivalent for the empirical spectral measure of $G$. Application-wise, this result enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insights into the underlying mechanisms at play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters.
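To make the objects in this abstract concrete, here is a small NumPy sketch that builds one realization of the Gram matrix of random features and its resolvent. The dimensions, the choice of `tanh` as the Lipschitz activation, Gaussian entries for the weight matrix, and the helper names `random_feature_gram` and `resolvent` are assumptions chosen for illustration; the sketch only constructs the matrices and does not reproduce the paper's deterministic equivalent.

```python
import numpy as np

def random_feature_gram(X, n, sigma=np.tanh, rng=None):
    """Gram matrix G = (1/T) * Sigma^T Sigma with Sigma = sigma(W X)."""
    rng = np.random.default_rng() if rng is None else rng
    p, T = X.shape
    W = rng.standard_normal((n, p))  # independent zero-mean, unit-variance entries
    Sigma = sigma(W @ X)             # entry-wise activation, shape (n, T)
    return Sigma.T @ Sigma / T       # shape (T, T)

def resolvent(G, gamma):
    """Resolvent Q = (G + gamma * I_T)^(-1) for gamma > 0."""
    T = G.shape[0]
    return np.linalg.inv(G + gamma * np.eye(T))

rng = np.random.default_rng(0)
p, T, n = 20, 30, 40                           # illustrative sizes only
X = rng.standard_normal((p, T)) / np.sqrt(p)   # data columns of roughly unit norm
G = random_feature_gram(X, n, rng=rng)
Q = resolvent(G, gamma=0.1)
```

Because the Gram matrix is positive semidefinite, the eigenvalues of the resolvent lie in the interval (0, 1/γ], which is what keeps it well defined for every γ > 0.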

Journal Article

A Multi-Scale Feature Pyramid Network for Detection and Instance Segmentation of Marine Ships in SAR Images

by Meng, Chunning, Cheng, Jierong, Chang, Shengjiang in Algorithms, data collection, Datasets

2022

In the remote sensing field, synthetic aperture radar (SAR) is a type of active microwave imaging sensor working in all-weather and all-day conditions, providing high-resolution SAR images of objects such as marine ships. Detection and instance segmentation of marine ships in SAR images has become an important question in remote sensing, but current deep learning models cannot accurately quantify marine ships because of the multi-scale property of marine ships in SAR images. In this paper, we propose a multi-scale feature pyramid network (MS-FPN) to achieve the simultaneous detection and instance segmentation of marine ships in SAR images. The proposed MS-FPN model uses a pyramid structure, and it is mainly composed of two proposed modules, namely the atrous convolutional pyramid (ACP) module and the multi-scale attention mechanism (MSAM) module. The ACP module is designed to extract both the shallow and deep feature maps, and these multi-scale feature maps are crucial for the description of multi-scale marine ships, especially the small ones. The MSAM module is designed to adaptively learn and select important feature maps obtained from different scales, leading to improved detection and segmentation accuracy. Quantitative comparison of the proposed MS-FPN model with several classical and recently developed deep learning models, using the high-resolution SAR images dataset (HRSID) that contains multi-scale marine ship SAR images, demonstrated the superior performance of MS-FPN over other models.

Journal Article

Quantum Machine-Based Decision Support System for the Detection of Schizophrenia from EEG Records

in Algorithms, Children, Decision support systems

2024

Schizophrenia is a serious chronic mental disorder that significantly affects daily life. Electroencephalography (EEG), a method used to measure mental activities in the brain, is among the techniques employed in the diagnosis of schizophrenia. The symptoms of the disease typically begin in childhood and become more pronounced as one grows older. However, it can be managed with specific treatments. Computer-aided methods can be used to achieve an early diagnosis of this illness. In this study, various machine learning algorithms and the emerging technology of quantum-based machine learning algorithms were used to detect schizophrenia using EEG signals. The principal component analysis (PCA) method was applied to process the obtained data in quantum systems. The dimensionality-reduced data were transformed into qubit form using various feature maps and provided as input to the Quantum Support Vector Machine (QSVM) algorithm. Thus, the QSVM algorithm was applied using different qubit numbers and different circuits in addition to classical machine learning algorithms. All analyses were conducted in the simulator environment of the IBM Quantum Platform. In the classification of this EEG dataset, the QSVM algorithm demonstrated superior performance, with a 100% success rate when using Pauli X and Pauli Z feature maps. This study serves as proof that quantum machine learning algorithms can be effectively utilized in the field of healthcare.

Journal Article

Nested barycentric coordinate system as an explicit feature map for polyhedra approximation and learning tasks

by Gottlieb, Lee-Ad, Nivasch, Gabriel, Kontorovich, Aryeh in Algorithms, Approximation, Artificial Intelligence

2024

We introduce a new embedding technique based on a nested barycentric coordinate system. We show that our embedding can be used to transform the problems of polyhedron approximation, piecewise linear classification and convex regression into one of finding a linear classifier or regressor in a higher dimensional (but nevertheless quite sparse) representation. Our embedding maps a piecewise linear function into an everywhere-linear function, and allows us to invoke well-known algorithms for the latter problem to solve the former. We explain the applications of our embedding to the problems of approximating separating polyhedra—in fact, it can approximate any convex body and unions of convex bodies—as well as to classification by separating polyhedra, and to piecewise linear regression.

Journal Article

Study of Flame Detection based on Improved YOLOv4

by Tan, Xiaoyu, Huang, Xinyi, Zhang, Yongjun in Algorithms, Datasets, Feature extraction

2021

In some complex circumstances, fire detection depends mostly on smoke detectors, which have many limitations in precision, efficiency and safety. Making full use of object detection algorithms to detect flames in industrial settings would clearly benefit people’s safety. Among object detection algorithms, the YOLO series plays a very significant role. In this paper, we propose an improvement strategy for YOLOv4 that enhances its precision based on multi-scale feature maps. Firstly, we create flame datasets comprising almost 4000 high-resolution flame pictures. Secondly, we make some improvements to the feature extraction network so that it detects smaller objects. Finally, the whole algorithm is trained and tested on our datasets for about 400 epochs. The results show that the method achieves high-quality flame detection in a large number of situations.

Journal Article

Major advancements in kernel function approximation

in Approximation, Cognitive tasks, Computation

2021

Kernel-based methods have become popular in a wide variety of machine learning tasks. They rely on the computation of kernel functions, which implicitly transform the data from its input space to a very high-dimensional space. Efficient application of these functions has been a subject of study over the last 10 years. The main focus was on improving the scalability of kernel-based methods. In this regard, kernel function approximation using explicit feature maps has emerged as a substitute for traditional kernel-based methods. Over the years, various theoretical advancements have been made to explicit kernel maps, especially to the method of random Fourier features (RFF), which is the main focus of our work. In this work, the major developments in the theory of kernel function approximation are reviewed in a systematic manner and the practical applications are discussed. Furthermore, we identify the shortcomings of the current research, and discuss possible avenues for future work.
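As a brief illustration of what an explicit feature map for a kernel looks like, the sketch below implements the standard random Fourier features construction for the Gaussian kernel k(x, y) = exp(-γ‖x − y‖²) using only the Python standard library. The helper names, the value of γ, the number of features, and the test points are assumptions chosen for illustration and are not taken from the reviewed work.

```python
import math
import random

def sample_rff_params(dim, num_features, gamma, rng):
    """Draw frequencies and phases for k(x, y) = exp(-gamma * ||x - y||^2)."""
    std = math.sqrt(2.0 * gamma)  # frequencies follow N(0, 2 * gamma * I)
    omegas = [[rng.gauss(0.0, std) for _ in range(dim)]
              for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return omegas, phases

def rff_features(x, omegas, phases):
    """Explicit feature map z(x) such that z(x) . z(y) approximates k(x, y)."""
    scale = math.sqrt(2.0 / len(omegas))
    return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(omegas, phases)]

def gaussian_kernel(x, y, gamma):
    """Exact kernel value, used here only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

rng = random.Random(0)
gamma = 0.5
omegas, phases = sample_rff_params(dim=3, num_features=5000, gamma=gamma, rng=rng)
x, y = [0.1, -0.3, 0.5], [0.4, 0.2, -0.1]
zx, zy = rff_features(x, omegas, phases), rff_features(y, omegas, phases)
approx = sum(a * b for a, b in zip(zx, zy))  # linear inner product of features
exact = gaussian_kernel(x, y, gamma)
```

The inner product of the two feature vectors is a Monte Carlo estimate of the kernel value, so its error shrinks roughly as one over the square root of the number of features. This is the trade-off that lets a linear method stand in for a kernel method on large datasets.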

Journal Article

Multi-scale feature balance enhancement network for pedestrian detection

by Yu, Haigang, He, Ning, He, Yuzhe in Ablation, Accuracy, Algorithms

2022

Pedestrian detection uses computer vision technology to detect and locate pedestrians in an image or video sequence. Many traditional and deep learning methods have been developed, but scale differences between pedestrians hinder accurate pedestrian detection and localization. This paper proposes the multi-scale feature balance enhancement network to address the problems of low-level feature information loss and feature map level imbalance when using the feature pyramid network (FPN) for multi-scale pedestrian detection. It integrates three components: path expansion, a feature balance module, and a feature enhancement module. These components enrich the low-level feature information, balance the feature maps, and enhance feature extraction capabilities. Experimental results show that the proposed network can significantly improve the performance of multi-scale pedestrian detection. For the MS COCO data set, the proposed network is embedded into the current mainstream detectors and demonstrates improved pedestrian detection accuracy. Each component of the network is also verified separately. On the Cityscapes data set, the proposed method improves the pedestrian detection accuracy for large, medium, and small scales compared with FPN Faster R-CNN. On the WiderPerson data set, the proposed method integrated into IterDet achieves a 90.89 average precision and 42.31 mean miss rate, which reaches the optimal level. At the same time, it achieves an 87.27 average precision and 48.39 mean miss rate on the CrowdHuman data set.

Journal Article

Reliability-aware label distribution learning with attention-rectified for facial expression recognition

by Peng, Liyuan, Liu, Yanbing, Wei, Yuyun in Arousal, Attention, Computer vision

2025

Facial expression recognition (FER) poses a significant challenge in computer vision with numerous applications. However, existing FER methods need more generalization ability and better robustness when dealing with complex datasets with noisy labels. We propose a label distribution learning model, RA-ARNet, with novel reliability-aware (RA) and attention-rectified (AR) modules to handle noisy labels. Specifically, the RA module evaluates the reliability of the image’s neighboring instances in the valence-arousal space and constructs a corresponding label distribution based on the evaluation as auxiliary supervision information to enhance the model’s robustness and generalization on various FER datasets with noisy labels. The AR module can gradually improve the model’s ability to extract attention features of facial landmarks by introducing consistency detection of attention feature maps of images and landmarks in training, thereby improving the model’s FER accuracy. Competitive experimental results on public datasets validate the effectiveness of the proposed method in comparison with current state-of-the-art methods. The experimental results indicate that the classification performance of RA-ARNet reaches 91.36% on RAF-DB and 61.47% on AffectNet (8 cls) and shows potential to deal with images with occlusion.

Journal Article

Towards compressed and efficient CNN architectures via pruning

in Accuracy, Artificial neural networks, Computer vision

2024

Convolutional Neural Networks (CNNs) use convolutional kernels to extract important low-level to high-level features from data. The performance of CNNs improves as they grow deeper, thereby learning better representations of the data. However, such deep CNNs are compute- and memory-intensive, making deployment on resource-constrained devices challenging. To address this, CNNs are compressed by adopting pruning strategies that remove redundant convolutional kernels from each layer while maintaining accuracy. Existing pruning methods that are based on feature map importance only prune the convolutional layers uniformly and do not consider fully connected layers. Also, current techniques do not take into account class labels while pruning the less important feature maps and do not explore the need for retraining after pruning. This paper presents techniques to prune both convolutional and fully connected layers. It proposes a novel class-specific pruning strategy based on finding feature map importance in terms of entropy for convolutional layers and the number of incoming zeros to neurons for fully connected layers. The class-specific approach helps to have a different pruning threshold for every convolutional layer and ensures that the pruning threshold is not influenced by any particular class. A study on the need for retraining the entire network or a part of the network after pruning is also carried out. On the Intel image, CIFAR10, and CIFAR100 datasets, the proposed pruning method compresses AlexNet by 83.2%, 87.19%, and 79.7%, VGG-16 by 83.7%, 85.11%, and 84.06%, and ResNet-50 by 62.99%, 62.3%, and 58.34%, respectively.
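The abstract does not say how entropy is computed from a feature map. One plausible reading, sketched below in plain Python, scores each feature map by the Shannon entropy of a histogram of its activations and flags low-entropy maps as pruning candidates. The histogram binning, the threshold, and the function names `feature_map_entropy` and `select_prunable` are assumptions for illustration, and the sketch is not class-specific as the paper's strategy is.

```python
import math

def feature_map_entropy(fmap, bins=8):
    """Shannon entropy (in bits) of a histogram of one feature map's activations."""
    values = [v for row in fmap for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant map carries no information
    counts = [0] * bins
    for v in values:
        # Map the value to a bin; the maximum value goes in the last bin.
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def select_prunable(fmaps, threshold):
    """Indices of feature maps whose entropy falls below the threshold."""
    return [i for i, fm in enumerate(fmaps)
            if feature_map_entropy(fm) < threshold]

fmaps = [
    [[0.0, 0.0], [0.0, 0.0]],  # constant activations
    [[0.0, 1.0], [2.0, 3.0]],  # activations spread over four bins
]
prunable = select_prunable(fmaps, threshold=0.5)
```

Here the constant map scores 0 bits and the spread map scores 2 bits, so only the first is flagged. A class-specific variant would compute such scores per class before setting each layer's threshold.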

Journal Article
