7,579 result(s) for "feature distributions"
HCLR-Net: Hybrid Contrastive Learning Regularization with Locally Randomized Perturbation for Underwater Image Enhancement
Underwater image enhancement presents a significant challenge due to the complex and diverse underwater environments that result in severe degradation phenomena such as light absorption, scattering, and color distortion. More importantly, obtaining paired training data for these scenarios is a challenging task, which further hinders the generalization performance of enhancement models. To address these issues, we propose a novel approach, the Hybrid Contrastive Learning Regularization (HCLR-Net). Our method is built upon a distinctive hybrid contrastive learning regularization strategy that incorporates a unique methodology for constructing negative samples. This approach enables the network to develop a more robust sample distribution. Notably, we utilize non-paired data for both positive and negative samples, with negative samples innovatively reconstructed using local patch perturbations. This strategy overcomes the constraints of relying solely on paired data, boosting the model's potential for generalization. The HCLR-Net also incorporates an Adaptive Hybrid Attention module and a Detail Repair Branch for effective feature extraction and texture detail restoration, respectively. Comprehensive experiments demonstrate the superiority of our method: it shows substantial improvements over several state-of-the-art methods in quantitative metrics and significantly enhances the visual quality of underwater images, establishing its practical applicability. Our code is available at: https://github.com/zhoujingchun03/HCLR-Net.
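The negative-sample idea described above can be illustrated with a minimal sketch: copy an image and corrupt a few randomly chosen local patches with noise. This is a hypothetical illustration (the function name, patch size, and Gaussian-noise perturbation are assumptions), not the authors' released code.

```python
import numpy as np

def perturb_local_patches(image, patch_size=8, num_patches=4, noise_std=0.2, seed=0):
    """Build a negative sample by perturbing random local patches.

    Hypothetical sketch of patch-level perturbation: copy the image,
    pick random patch locations, and corrupt each patch with Gaussian
    noise, clipping back to the valid [0, 1] intensity range.
    """
    rng = np.random.default_rng(seed)
    negative = image.copy()
    h, w = image.shape[:2]
    for _ in range(num_patches):
        y = rng.integers(0, h - patch_size + 1)
        x = rng.integers(0, w - patch_size + 1)
        patch = negative[y:y + patch_size, x:x + patch_size]
        negative[y:y + patch_size, x:x + patch_size] = np.clip(
            patch + rng.normal(0.0, noise_std, patch.shape), 0.0, 1.0)
    return negative
```

Because the perturbation is local, most of the image stays identical to the anchor, which is what makes such samples "hard" negatives for contrastive training.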
Explicit access to detailed representations of feature distributions
The human visual system can quickly process groups of objects (ensembles) and build compressed representations of their features. What does the conscious perception of ensembles consist of? Observers' explicit access to ensemble representations has been considered very limited – any distributional aspects beyond simple summary statistics, such as the mean or variance, cannot be explicitly accessed. In contrast, we demonstrate that the visual system can represent ensemble distributions in detail, and observers have reliable explicit access to these representations. In our new paradigm (Feature Frequency Report), observers viewed 36 disks of various colors for 800 ms and then reported the frequency of a randomly chosen color using a slider. The sets had Gaussian, uniform, or bimodal color distributions with a random mean color. The distributions of responses – both aggregated and separate for each observer – followed the shape of the presented distribution. Modeling revealed that performance reflected integrated information from the whole set rather than sub-sampling. After only brief exposure to a color set, the visual system can build detailed representations of feature distributions that observers have explicit access to. This result necessitates a fundamental rethinking of how ensembles are processed. We suggest that such distribution representations are the most natural way for the visual system to represent groups of objects. Explicit feature distribution representations may contribute to people's impression of having a rich perceptual experience despite severe attentional and working memory limitations.
Strong Generalized Speech Emotion Recognition Based on Effective Data Augmentation
The absence of labeled samples limits the development of speech emotion recognition (SER). Data augmentation is an effective way to address sample sparsity, yet there is a lack of research on data augmentation algorithms in the field of SER. In this paper, the effectiveness of classical acoustic data augmentation methods in SER is analyzed, and on that basis a strongly generalized SER model built on effective data augmentation is proposed. The model uses a multi-channel feature extractor consisting of multiple sub-networks to extract emotional representations. Different kinds of augmented data that can effectively improve SER performance are fed into the sub-networks, and the emotional representations are obtained by the weighted fusion of the output feature maps of each sub-network. To make the model robust to unseen speakers, we employ adversarial training to generalize the emotion representations: a discriminator estimates the Wasserstein distance between the feature distributions of different speakers and forces the feature extractor to learn speaker-invariant emotional representations. Simulation results on the IEMOCAP corpus show that the performance of the proposed method is 2–9% ahead of related SER algorithms, which proves its effectiveness.
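The paper estimates the speaker-to-speaker distance adversarially with a learned critic. As a simplified, non-adversarial stand-in, the empirical 1-D Wasserstein-1 distance between two equal-sized feature samples can be computed per dimension and averaged (the function names and per-dimension averaging are illustrative assumptions, not the paper's critic):

```python
import numpy as np

def wasserstein_1d(u, v):
    """Empirical 1-D Wasserstein-1 distance for two equal-sized samples:
    the mean absolute difference between the sorted samples."""
    return float(np.mean(np.abs(np.sort(u) - np.sort(v))))

def feature_distribution_distance(feats_a, feats_b):
    """Average per-dimension Wasserstein-1 distance between two feature
    matrices of shape (num_samples, num_dims)."""
    return float(np.mean([wasserstein_1d(feats_a[:, d], feats_b[:, d])
                          for d in range(feats_a.shape[1])]))
```

Minimizing such a distance between per-speaker feature distributions is what pushes the extractor toward speaker-invariant representations.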
Sea-sky line detection in the infrared image based on the vertical grayscale distribution feature
When detecting the sea-sky line (SSL) in infrared images, a blurry SSL and conspicuous sea clutter seriously hinder accurate detection. To solve these problems, we propose a robust SSL detection algorithm based on the vertical grayscale distribution feature (VGDF). We divide the infrared image into sub-image blocks with a sliding window. Sub-image blocks that contain the SSL in their central area are labeled as positive samples, and those without any SSL are labeled as negative samples. To improve the separability of the samples, a VGDF map transformation method is proposed to transform the gray sub-image blocks into feature maps. The VGDF maps are used as the input of a convolutional neural network to train the SSL recognition model; this strategy improves the separability of SSL image blocks from background image blocks. We then use the trained model to obtain the edge candidates and construct the SSL probability feature map. Finally, we detect the SSL by fitting the straight line with the greatest probability on the SSL probability feature map. The proposed algorithm achieves a 99.4% accuracy rate on a dataset containing 1320 frames of infrared images. Comparison results show that our algorithm obtains higher detection accuracy than existing state-of-the-art algorithms and performs well even when the SSL is blurred or there are obvious ship wakes on the sea surface.
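The sliding-window sampling step can be sketched as follows. This is a minimal illustration with assumed block size, stride, and "central third" labeling rule; the paper's exact parameters are not given in the abstract.

```python
import numpy as np

def extract_blocks(image, block_h=16, block_w=16, stride=8):
    """Slide a window over a grayscale image, returning the stacked
    sub-image blocks and their top-left (row, col) positions."""
    h, w = image.shape
    blocks, positions = [], []
    for y in range(0, h - block_h + 1, stride):
        for x in range(0, w - block_w + 1, stride):
            blocks.append(image[y:y + block_h, x:x + block_w])
            positions.append((y, x))
    return np.stack(blocks), positions

def label_blocks(positions, block_h, ssl_row):
    """Label a block positive (1) if the known sea-sky-line row falls in
    the central third of the block, else negative (0)."""
    labels = []
    for y, _ in positions:
        center_lo = y + block_h // 3
        center_hi = y + 2 * block_h // 3
        labels.append(1 if center_lo <= ssl_row < center_hi else 0)
    return labels
```

The resulting positive/negative blocks would then be converted to VGDF maps before CNN training.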
Optimizing perception: Attended and ignored stimuli create opposing perceptual biases
Humans have remarkable abilities to construct a stable visual world from continuously changing input. There is increasing evidence that momentary visual input blends with previous input to preserve perceptual continuity. Most studies have shown that such influences can be traced to characteristics of the attended object at a given moment. Little is known about the role of ignored stimuli in creating this continuity. This is important since while some input is selected for processing, other input must be actively ignored for efficient selection of the task-relevant stimuli. We asked whether attended targets and actively ignored distractor stimuli in an odd-one-out search task would bias observers’ perception differently. Our observers searched for an oddly oriented line among distractors and were occasionally asked to report the orientation of the last visual search target they saw in an adjustment task. Our results show that at least two opposite biases from past stimuli influence current perception: A positive bias caused by serial dependence pulls perception of the target toward the previous target features, while a negative bias induced by the to-be-ignored distractor features pushes perception of the target away from the distractor distribution. Our results suggest that to-be-ignored items produce a perceptual bias that acts in parallel with other biases induced by attended items to optimize perception. Our results are the first to demonstrate how actively ignored information facilitates continuity in visual perception.
Semi-Supervised Subcategory Centroid Alignment-Based Scene Classification for High-Resolution Remote Sensing Images
It is usually hard to obtain adequate annotated data for delivering satisfactory scene classification results. Semi-supervised scene classification approaches can transfer the knowledge learned from previously annotated data to remote sensing images with scarce samples. However, due to differences between sensors, environments, seasons, and geographical locations, cross-domain remote sensing images exhibit feature distribution deviations, so semi-supervised scene classification methods may not achieve satisfactory classification accuracy. To address this problem, a novel semi-supervised subcategory centroid alignment (SSCA)-based scene classification approach is proposed. The SSCA framework is made up of two components: the rotation-robust convolutional feature extractor (RCFE) and the neighbor-based subcategory centroid alignment (NSCA). The RCFE suppresses the impact of rotation changes on remote sensing image representation, while the NSCA decreases the impact of intra-category variety across domains on cross-domain scene classification. The SSCA algorithm is validated against several competitive approaches on two datasets to demonstrate its effectiveness. The results prove that the proposed SSCA approach outperforms most competitive approaches by no less than 2% in overall accuracy.
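The core of centroid alignment is penalizing the distance between matching class centroids in the source and target feature spaces. A minimal sketch (the neighbor-based weighting and subcategory splitting of NSCA are omitted; function names are assumptions):

```python
import numpy as np

def class_centroids(features, labels):
    """Per-class mean feature vectors, keyed by class label."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def centroid_alignment_loss(src_feats, src_labels, tgt_feats, tgt_labels):
    """Mean squared distance between matching class centroids of the
    source and target domains: the basic quantity a centroid-alignment
    method drives toward zero during training."""
    src_c = class_centroids(src_feats, src_labels)
    tgt_c = class_centroids(tgt_feats, tgt_labels)
    shared = sorted(set(src_c) & set(tgt_c))
    return float(np.mean([np.sum((src_c[c] - tgt_c[c]) ** 2) for c in shared]))
```

In a semi-supervised setting the target labels would be pseudo-labels predicted by the current classifier.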
A Vehicle Recognition Algorithm Based on Deep Transfer Learning with a Multiple Feature Subspace Distribution
Vehicle detection is a key component of environmental sensing systems for Intelligent Vehicles (IVs). Traditional shallow-model and offline-learning-based vehicle detection methods cannot satisfy the real-world challenges of environmental complexity and scene dynamics. Focusing on these problems, this work proposes a vehicle detection algorithm based on a multiple feature subspace distribution deep model with online transfer learning. Based on the multiple feature subspace distribution hypothesis, a deep model is established in which multiple Restricted Boltzmann Machines (RBMs) construct the lower layers and a Deep Belief Network (DBN) composes the superstructure. For this deep model, an unsupervised feature extraction method based on sparse constraints is applied. Then, a transfer learning method with online sample generation is proposed based on the deep model. Finally, the entire classifier is retrained online with supervised learning. Experiments are conducted on the KITTI road image datasets. The performance of the proposed method is compared with many state-of-the-art methods, demonstrating that the proposed deep transfer learning-based algorithm outperforms them.
Robust Radar Emitter Recognition Based on the Three-Dimensional Distribution Feature and Transfer Learning
Due to the increasing complexity of electromagnetic signals, radar emitter signal recognition poses a significant challenge. To address it, multi-component radar emitter recognition under a complicated noise environment is studied in this paper, and a novel recognition approach based on the three-dimensional distribution feature and transfer learning is proposed. A cubic feature over the time-frequency-energy distribution is proposed to describe the intra-pulse modulation information of radar emitters. The feature is then reconstructed using transfer learning in order to obtain a feature robust to signal-to-noise ratio (SNR) variation. Finally, a relevance vector machine is used to classify radar emitter signals. Simulations demonstrate that the proposed approach achieves better accuracy and robustness than existing approaches.
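A time-frequency-energy distribution of the kind underlying the cubic feature can be approximated with a simple framed-FFT spectrogram. This is a crude stand-in (frame length, hop, and Hann windowing are assumptions; the paper's feature is a richer three-dimensional construction):

```python
import numpy as np

def time_frequency_energy(signal, frame_len=64, hop=32):
    """Crude time-frequency-energy map: magnitude-squared FFT of
    overlapping Hann-windowed frames. Rows index time frames,
    columns index frequency bins."""
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1)) ** 2
```

Intra-pulse modulation (LFM, phase coding, etc.) shows up as characteristic ridge shapes in such a map, which is what a downstream classifier can learn from.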
TACDFSL: Task Adaptive Cross Domain Few-Shot Learning
Cross Domain Few-Shot Learning (CDFSL) has attracted the attention of many scholars since it is closer to reality. The domain shift between the source domain and the target domain is a crucial problem for CDFSL. The essence of domain shift is the marginal distribution difference between the two domains, which is implicit and unknown. We therefore propose empirical marginal distribution measurements: WDMDS (Wasserstein Distance for Measuring Domain Shift) and MMDMDS (Maximum Mean Discrepancy for Measuring Domain Shift). In addition, a feature extractor is pre-trained and a classifier fine-tuned to achieve good generalization in CDFSL. Since the features obtained by the feature extractor are high-dimensional and left-biased, an adaptive feature distribution transformation is proposed to make the feature distribution of each sample approximately Gaussian. This approximately symmetric distribution improves image classification accuracy by 3% on average. In addition, the applicability of different classifiers for CDFSL is investigated; the classification model should be selected based on the empirical marginal distribution difference between the two domains. Task Adaptive Cross Domain Few-Shot Learning (TACDFSL) is proposed based on the above ideas and improves image classification accuracy by 3–9%.
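One common way to push a skewed, nonnegative feature distribution toward a symmetric, roughly Gaussian shape is a power transform followed by standardization. The sketch below is only an assumption about what such an "adaptive feature distribution transformation" might look like; the exponent `beta` and the exact form are hypothetical, not the paper's.

```python
import numpy as np

def power_transform(features, beta=0.5):
    """Map nonnegative features x -> x ** beta per dimension, then
    standardize each dimension. A power with 0 < beta < 1 compresses
    long tails, making a skewed distribution more symmetric.
    (beta is a hypothetical hyperparameter, not the paper's value.)"""
    transformed = np.power(np.maximum(features, 0.0) + 1e-6, beta)
    mean = transformed.mean(axis=0)
    std = transformed.std(axis=0) + 1e-6
    return (transformed - mean) / std
```

After such a transform, distance-based few-shot classifiers tend to behave better because each feature dimension is closer to a zero-mean, unit-variance Gaussian.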
Cross-Corpus Speech Emotion Recognition Based on Multi-Task Learning and Subdomain Adaptation
To solve the problem of feature distribution discrepancy in cross-corpus speech emotion recognition tasks, this paper proposes an emotion recognition model based on multi-task learning and subdomain adaptation that alleviates the impact of the discrepancy on recognition. Existing methods have shortcomings in speech feature representation and cross-corpus feature distribution alignment. The proposed model uses a deep denoising auto-encoder as a shared feature extraction network for multi-task learning, with a fully connected layer and a softmax layer added before each recognition task as task-specific layers. A subdomain adaptation algorithm for emotion and gender features is then added to the shared network to obtain the shared emotion features and gender features of the source domain and target domain, respectively. Multi-task learning effectively enhances the representational ability of the features, while the subdomain adaptation algorithm promotes their transferability and effectively alleviates the impact of feature distribution differences on emotional features. The average results of six cross-corpus speech emotion recognition experiments show that, compared with other models, the weighted average recall rate is increased by 1.89–10.07%, verifying the validity of the proposed model.
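Subdomain adaptation methods typically minimize a kernel discrepancy between source and target features, computed per subdomain (e.g., per emotion class). The sketch below shows the basic RBF-kernel MMD statistic such methods build on; applying it separately within each class, and the paper's exact loss, are beyond this illustration (the function name and biased estimator are assumptions).

```python
import numpy as np

def rbf_mmd2(x, y, sigma=1.0):
    """Squared Maximum Mean Discrepancy between two samples with an RBF
    kernel: k(x,x).mean() + k(y,y).mean() - 2*k(x,y).mean().
    Biased estimator, for illustration only."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return float(k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean())
```

Driving this quantity toward zero for each emotion subdomain aligns the corresponding source and target feature clusters rather than only the global distributions.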