Catalogue Search | MBRL
1,766 result(s) for "Zero-shot learning"
Dual insurance for generalized zero-shot learning
by Fang, Xiaozhao; Kang, Peipei; Li, Chuang
in Artificial Intelligence, Classification, Clustering
2025
Traditional zero-shot learning aims to use the trained model to accurately classify samples from unseen classes, while in the more difficult task of generalized zero-shot learning, the trained model needs to classify samples from both seen and unseen classes into the correct classes. Because only seen class samples are available during training, generalized zero-shot learning faces great challenges in classification. Generative models are one effective way to address this problem. However, the samples they generate are often of poor quality. In addition, the generated samples contain semantic redundancies that are not conducive to classification. To solve these problems, we propose the dual insurance model (DI-GAN) for generalized zero-shot learning, which includes a feature generation module and a semantic separation module. These guarantee the high quality of generated features and good classification performance, respectively. Specifically, the first insurance is based on a generative adversarial network whose generator is constrained by a clustering method so that the generated samples stay close to the real samples. The second insurance is based on a variational autoencoder and includes semantic separation, an instance network, and a classification network. Semantic separation is designed to extract the semantically related parts that are beneficial to classification, while the instance network, acting on the semantically related parts, is used to ensure classification performance. Extensive experiments on four benchmark datasets show the competitiveness of the proposed DI-GAN.
Journal Article
Zero-shot learning via visual-semantic aligned autoencoder
2023
Zero-shot learning recognizes unseen samples via a model learned from seen class samples and semantic features. Because the training set lacks information about unseen class samples, some researchers have proposed generating unseen class samples with generative models. However, the generative model is first trained on the training set samples and only then used to generate the unseen class samples, so the features of the generated unseen class samples tend to be biased toward the seen classes and may deviate substantially from the real unseen class samples. To tackle this problem, we use an autoencoder to generate the unseen class samples and combine the semantic features of the unseen classes with the proposed new sample features to construct the loss function. The proposed method is validated on three datasets and shows good results.
Journal Article
Classifier and Exemplar Synthesis for Zero-Shot Learning
by Wei-Lun, Chao; Gong Boqing; Soravit, Changpinyo
in Benchmarks, Classifiers, Empirical analysis
2020
Zero-shot learning (ZSL) enables solving a task without the need to see its examples. In this paper, we propose two ZSL frameworks that learn to synthesize parameters for novel unseen classes. First, we propose to cast the problem of ZSL as learning manifold embeddings from graphs composed of object classes, leading to a flexible approach that synthesizes “classifiers” for the unseen classes. Then, we define an auxiliary task of synthesizing “exemplars” for the unseen classes to be used as an automatic denoising mechanism for any existing ZSL approaches or as an effective ZSL model by itself. On five visual recognition benchmark datasets, we demonstrate the superior performances of our proposed frameworks in various scenarios of both conventional and generalized ZSL. Finally, we provide valuable insights through a series of empirical analyses, among which are a comparison of semantic representations on the full ImageNet benchmark as well as a comparison of metrics used in generalized ZSL. Our code and data are publicly available at https://github.com/pujols/Zero-shot-learning-journal.
Journal Article
Semantic Contrastive Embedding for Generalized Zero-Shot Learning
2022
Generalized zero-shot learning (GZSL) aims to recognize objects from both seen and unseen classes when only the labeled examples from seen classes are provided. Recent feature generation methods learn a generative model that can synthesize the missing visual features of unseen classes to mitigate the data-imbalance problem in GZSL. However, the original visual feature space is suboptimal for GZSL recognition since it lacks semantic information, which is vital for recognizing the unseen classes. To tackle this issue, we propose to integrate the feature generation model with an embedding model. Our GZSL framework maps both the real and the synthetic samples produced by the generation model into an embedding space, where we perform the final GZSL classification. Specifically, we propose a semantic contrastive embedding (SCE) for our GZSL framework. Our SCE consists of attribute-level contrastive embedding and class-level contrastive embedding. They aim to obtain the transferable and discriminative information, respectively, in the embedding space. We evaluate our GZSL method with semantic contrastive embedding, named SCE-GZSL, on four benchmark datasets. The results show that our SCE-GZSL method can achieve the state-of-the-art or the second-best on these datasets.
Journal Article
Semantics-Guided Intra-Category Knowledge Transfer for Generalized Zero-Shot Learning
by Wang, Yu-Chiang Frank; Lee, Yuan-Hao; Lin, Chia-Ching
in Datasets, Deep learning, Hallucinations
2023
Zero-shot learning (ZSL) requires one to associate visual and semantic information observed from data of seen classes, so that test data of unseen classes can be recognized based on the described semantic representation. Aiming at synthesizing visual data from the given semantic inputs, hallucination-based ZSL approaches might suffer from mode collapse and biased problems due to the lack of ability in modeling the desirable visual features for unseen categories. In this paper, we present a generative model of Cross-Modal Consistency GAN (CMC-GAN), which performs semantics-guided intra-category knowledge transfer across image categories, so that data hallucination for unseen classes can be achieved with proper semantics and sufficient visual diversity. In our experiments, we perform standard and generalized ZSL on four benchmark datasets, confirming the effectiveness of our approach over that of state-of-the-art ZSL methods.
Journal Article
Generalized zero-shot emotion recognition from body gestures
2022
In human-human interaction, body language is one of the most important emotional expressions. However, each emotion category contains abundant emotional body gestures, and the basic emotions used in most research struggle to describe complex and diverse emotional states. It is costly to collect sufficient samples of all emotional expressions, and new emotions or new body gestures that are not included in the training set may appear during testing. To address the above problems, we design a novel mechanism that treats each emotion category as a collection of multiple body gesture categories to make better use of gesture information for emotion recognition. A Generalized Zero-Shot Learning (GZSL) framework is introduced to recognize both seen and unseen body gesture categories with the help of semantic information, and emotion predictions are further provided based on the relationship between gestures and emotions. This framework consists of two branches. The first branch is a Hierarchical Prototype Network (HPN) which learns the prototypes of body gestures and uses them to calculate the emotion attentive prototypes. This branch aims to obtain predictions on samples of the seen gesture categories. The second branch is a Semantic Auto-Encoder (SAE) which utilizes semantic representations to predict samples of unseen gesture categories. Thresholds are further trained to determine which branch result will be used during testing, and the emotion labels are finally obtained from these results. Comprehensive experiments are conducted on an emotion recognition dataset which contains skeleton data of multiple body gestures, and the performance of our framework is superior to both the traditional emotion classifier and state-of-the-art zero-shot learning methods.
Journal Article
Zero-Shot Image Classification Based on a Learnable Deep Metric
2021
Supervised models based on deep learning have achieved great success in image classification after training with a large number of labeled samples. In practice, however, many categories have only a few labeled training samples, and some have none at all. Zero-shot learning greatly reduces the dependence of image classification models on labeled training samples. Nevertheless, learning the similarity of visual features and semantic features with a predefined fixed metric (e.g., Euclidean distance) has limitations, and the mapping process suffers from a semantic gap. To address these problems, a new zero-shot image classification method based on an end-to-end learnable deep metric is proposed in this paper. First, common space embedding is adopted to map the visual features and semantic features into a common space. Second, an end-to-end learnable deep metric, namely a relation network, is used to learn the similarity of visual features and semantic features. Finally, the unseen images are classified according to the similarity score. Extensive experiments are carried out on four datasets, and the results indicate the effectiveness of the proposed method.
Journal Article
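The three steps in the abstract above (map into a common space, score similarity, pick the best-scoring class) can be illustrated with a minimal sketch. This is not the paper's method: it swaps the learnable relation network for a fixed cosine similarity, and the names `project`, `zero_shot_predict`, the projection matrix `W`, and the class vectors are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def project(x, W):
    """Linearly map a visual feature x into the common space; W is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def zero_shot_predict(visual_feature, W, class_semantics):
    """Return the class whose semantic vector is most similar to the projected feature.

    A fixed cosine metric stands in here for a learned metric (assumption).
    """
    z = project(visual_feature, W)
    return max(class_semantics, key=lambda name: cosine(z, class_semantics[name]))

# Hypothetical toy setup: identity projection and two unseen classes.
W = [[1.0, 0.0], [0.0, 1.0]]
class_semantics = {"zebra": [1.0, 0.0], "whale": [0.0, 1.0]}
prediction = zero_shot_predict([0.9, 0.1], W, class_semantics)
```

In a learnable-metric approach, the `cosine` call would be replaced by a trained scoring function, which is the limitation of fixed metrics that the abstract points to.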
Dual-level contrastive learning network for generalized zero-shot learning
by Wu, Jigang; Liu, Jigang; Guan, Jiaqi
in Artificial Intelligence, Classification, Computer Graphics
2022
Generalized zero-shot learning (GZSL) aims to utilize semantic information to recognize the seen and unseen samples, where unseen classes are unavailable during training. Though recent advances have been made by incorporating contrastive learning into GZSL, existing approaches still suffer from two limitations: (1) without considering fine-grained cluster structures, these models cannot guarantee the discriminability and semantic awareness of synthetic features; (2) classifiers tend to overfit the seen classes, as they only concentrate on the seen domain. To address these challenges, we propose a Dual-level Contrastive Learning Network (DCLN), in which intra-domain and cross-domain contrastive learning are seamlessly integrated into a unified learning model. Specifically, the former performs center-prototype contrasting to fully explore the discriminative structure knowledge, while the latter is proposed to effectively alleviate the overfitting problem by utilizing the semantic relationships between the seen and unseen domain. Finally, the experimental results on four benchmark datasets demonstrate the superiority of our DCLN over the state-of-the-art methods.
Journal Article
Vision transformer-based generalized zero-shot learning with data criticizing
2025
Generalized Zero-Shot Learning (GZSL) aims to enable accurate testing and recognition of unseen classes by utilizing training data from seen classes and leveraging attribute knowledge. However, GZSL faces a challenge wherein the model, trained solely on seen class data, tends to be biased towards recognizing visual features of seen classes, resulting in poorer recognition performance for unseen classes. To address this issue, we propose an approach called Vision Transformer-Based Generalized Zero-Shot Learning with Data Criticizing (ViT-DaCr). In order to obtain improved visual features, we thoroughly examine features extracted by Vision Transformer (ViT) with a new design. Additionally, we recognize that not all training data align with our model during the training process, leading the model to exhibit a bias towards recognizing visual features of seen classes and directly impacting visual feature recognition. Therefore, we propose a data critic mechanism that utilizes Adjusted Boxplot to filter out such data automatically during the training process. Extensive experiments demonstrate the advanced performance of our model on three challenging and popular datasets.
Journal Article
Learning semantic consistency for audio-visual zero-shot learning
by Chen, Yuling; Ruan, Xiaoli; Zhang, Wei
in Artificial Intelligence, Audio data, Computer Science
2025
Audio-visual zero-shot learning requires an understanding of the relationship between audio and visual information to determine unseen classes. Despite many efforts and significant progress in the field, many existing methods tend to focus on learning strong representations, neglecting the semantic consistency between audio and video as well as the inherent hierarchical structure of the data. To address these issues, we propose Learning Semantic Consistency for Audio-Visual Zero-shot Learning. Specifically, we employ an attention mechanism to enhance cross-modal information interactions, aiming to capture the semantic consistency between audio and visual data. Meanwhile, we introduce a hyperbolic space to model the hierarchical structure of the data itself. Moreover, the proposed approach includes a novel loss function that considers the relationships between input modalities, reducing the distance between features of different modalities. To evaluate the proposed method, we test it on three benchmark datasets
,
, and
. Extensive experimental results show that the proposed method achieves state-of-the-art performance on all three datasets. For example, on the
dataset, the harmonic mean is improved by 5.7%. Code and data available at https://github.com/ybyangjing/LSC-AVZSL.
Journal Article
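The abstract above reports a harmonic mean without defining it. In generalized zero-shot learning this usually means the harmonic mean of the accuracies on seen and unseen classes; assuming that is the intended metric, a minimal sketch follows. The function name and the sample accuracies are hypothetical, not values from the paper.

```python
def harmonic_mean(seen_acc, unseen_acc):
    """Harmonic mean of seen-class and unseen-class accuracy.

    Returns 0.0 when both accuracies are zero to avoid dividing by zero.
    """
    total = seen_acc + unseen_acc
    if total == 0:
        return 0.0
    return 2 * seen_acc * unseen_acc / total

# Hypothetical accuracies: strong on seen classes, weaker on unseen classes.
h = harmonic_mean(0.6, 0.3)
```

The harmonic mean stays low unless both accuracies are high, so a model that only recognizes seen classes scores near zero however accurate it is on them.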