248 results for "domain generalization"
Multi-Domain Feature Alignment for Face Anti-Spoofing
Face anti-spoofing is critical for enhancing the robustness of face recognition systems against presentation attacks. Existing methods predominantly rely on binary classification, and methods based on domain generalization have recently yielded promising results. However, distribution discrepancies between domains create domain-related differences in the feature space that considerably hinder generalization to unfamiliar domains. In this work, we propose a multi-domain feature alignment framework (MADG) that addresses the poor generalization arising when multiple source domains are scattered across the feature space. Specifically, an adversarial learning process is designed to narrow the differences between domains, aligning the features of the multiple source domains. To further improve the framework, we incorporate a multi-directional triplet loss that increases the separation between fake and real faces in the feature space. To evaluate the performance of our method, we conducted extensive experiments on several public datasets. The results demonstrate that our proposed approach outperforms current state-of-the-art methods, validating its effectiveness for face anti-spoofing.
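
A minimal sketch of a triplet-style separation term like the one this abstract mentions, written in PyTorch. The margin value and the toy random embeddings are illustrative assumptions, not the authors' MADG implementation.

```python
# Minimal sketch of a triplet-style loss separating live and spoof embeddings.
import torch
import torch.nn.functional as F

def real_fake_triplet_loss(anchor, positive, negative, margin=0.5):
    """Pull same-class (e.g. live) embeddings together and push the other
    class at least `margin` further away."""
    d_pos = F.pairwise_distance(anchor, positive)   # anchor vs. same class
    d_neg = F.pairwise_distance(anchor, negative)   # anchor vs. other class
    return F.relu(d_pos - d_neg + margin).mean()

# Toy usage with random 128-d embeddings for a batch of 8 faces.
a, p, n = (torch.randn(8, 128) for _ in range(3))
print(real_fake_triplet_loss(a, p, n).item())
```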
Domain adversarial neural networks for domain generalization: when it works and how to improve
Domain adaptation is a theoretically well-researched problem, and this theory has been put to good use in practice. In particular, we note the bound on target error given by Ben-David et al. (Mach Learn 79(1–2):151–175, 2010) and the well-known domain-aligning algorithm based on this work, Domain Adversarial Neural Networks (DANN), presented by Ganin and Lempitsky (in International Conference on Machine Learning, pp 1180–1189). Recently, multiple variants of DANN have been proposed for the related problem of domain generalization, but without much discussion of the original motivating bound. In this paper, we investigate the validity of DANN in domain generalization from this perspective. We examine conditions under which applying DANN makes sense, and further consider DANN as a dynamic process during training. Our investigation suggests that applying DANN to domain generalization may not be as straightforward as it seems. To address this, we design an algorithmic extension to DANN for the domain generalization case. Our experiments validate both the theory and the algorithm.
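
For readers unfamiliar with DANN, the sketch below shows the gradient-reversal mechanism the abstract builds on: a domain classifier is trained on features whose gradients are negated on the way back into the feature extractor. The feature dimension, hidden size, and number of domains are assumptions; this is generic DANN, not the paper's proposed extension.

```python
# Gradient-reversal layer plus a small domain classifier head (DANN-style).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients flowing back into the feature extractor.
        return -ctx.lam * grad_output, None

class DomainHead(nn.Module):
    def __init__(self, dim=128, n_domains=3, lam=1.0):
        super().__init__()
        self.lam = lam
        self.clf = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, n_domains))

    def forward(self, feats):
        return self.clf(GradReverse.apply(feats, self.lam))

# Training the head to predict the source domain pushes the (upstream)
# feature extractor toward domain-invariant features via the reversed gradient.
feats = torch.randn(16, 128, requires_grad=True)
logits = DomainHead()(feats)
```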
Domain generalization through meta-learning: a survey
Deep neural networks (DNNs) have revolutionized artificial intelligence but often lose performance when faced with out-of-distribution data, a common scenario due to the inevitable domain shifts in real-world applications. This limitation stems from the common assumption that training and testing data share the same distribution, an assumption frequently violated in practice. Despite their effectiveness given large amounts of data and computational power, DNNs struggle with distributional shifts and limited labeled data, leading to overfitting and poor generalization across tasks and domains. Meta-learning presents a promising approach by employing algorithms that acquire transferable knowledge across tasks for fast adaptation, eliminating the need to learn each task from scratch. This survey delves into meta-learning with a focus on its contribution to domain generalization. We first clarify the concept of meta-learning for domain generalization and introduce a novel taxonomy based on the feature extraction strategy and the classifier learning methodology, offering a granular view of existing methodologies. Additionally, we present a decision graph to help readers navigate the taxonomy based on data availability and domain shifts, enabling them to select and develop a model tailored to their specific problem. Through an exhaustive review of existing methods and underlying theories, we map out the fundamentals of the field. Our survey provides practical insights and an informed discussion of promising research directions.
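
Many of the methods such surveys cover share an episodic training recipe: each episode holds out one source domain to simulate a domain shift at training time. The toy split below is only a hedged illustration of that idea; the domain names and data layout are assumptions.

```python
# Toy episodic split: hold out one source domain as "meta-test" per episode.
import random

def sample_episode(domains: dict):
    """domains: {name: list_of_batches}. Returns (meta_train, meta_test) splits."""
    held_out = random.choice(list(domains))
    meta_train = {d: b for d, b in domains.items() if d != held_out}
    meta_test = {held_out: domains[held_out]}
    return meta_train, meta_test

data = {"photo": ["b1", "b2"], "sketch": ["b3"], "cartoon": ["b4", "b5"]}
print(sample_episode(data))
```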
ProgPrompt: program generation for situated robot task planning using large language models
Task planning can require defining myriad domain knowledge about the world in which a robot needs to act. To ameliorate that effort, large language models (LLMs) can be used to score potential next actions during task planning, and even to generate action sequences directly, given an instruction in natural language and no additional domain information. However, such methods either require enumerating all possible next steps for scoring, or generate free-form text that may contain actions not possible for a given robot in its current context. We present a programmatic LLM prompt structure that enables plan generation functional across situated environments, robot capabilities, and tasks. Our key insight is to prompt the LLM with program-like specifications of the available actions and objects in an environment, as well as with example programs that can be executed. We make concrete recommendations about prompt structure and generation constraints through ablation experiments, demonstrate state-of-the-art success rates on VirtualHome household tasks, and deploy our method on a physical robot arm for tabletop tasks. Website and code at progprompt.github.io.
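
The sketch below illustrates, with assumed action and object names, what a program-like planning prompt of the kind described above might look like; it is not the authors' actual ProgPrompt template.

```python
# Build a program-like planning prompt: available actions and objects are
# presented as code, followed by an example plan and the new task stub.
ACTIONS = ["grab(obj)", "putin(obj, container)", "open(container)", "close(container)"]
OBJECTS = ["apple", "fridge", "plate"]

def build_prompt(task: str) -> str:
    header = "\n".join(f"def {a}: ..." for a in ACTIONS)
    objects = f"objects = {OBJECTS}"
    example = (
        "def put_apple_in_fridge():\n"
        "    open('fridge')\n"
        "    grab('apple')\n"
        "    putin('apple', 'fridge')\n"
        "    close('fridge')\n"
    )
    return f"{header}\n{objects}\n\n# Example\n{example}\n# Task: {task}\ndef {task}():"

print(build_prompt("throw_away_plate"))
```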
Cross-Domain Gated Learning for Domain Generalization
Domain generalization aims to improve the generalization capacity of a model by leveraging useful information from multi-domain data. However, learning an effective feature representation from such multi-domain data is challenging due to the domain shift problem. In this paper, we propose an information gating strategy, termed cross-domain gating (CDG), to address this problem. Specifically, we distill domain-invariant features by adaptively muting the domain-related activations in the feature maps. This feature distillation process prevents the network from overfitting to domain-related details, and thereby improves the generalization ability of the learned feature representation. Extensive experiments are conducted on three public datasets. The experimental results show that the proposed CDG training strategy effectively encourages the network to exploit the intrinsic features of objects from multi-domain data, and achieves new state-of-the-art domain generalization performance on these benchmarks.
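
A minimal sketch of the general idea of gating feature-map channels: compute a learned sigmoid gate from pooled channel statistics and rescale the activations. This is an illustrative module with assumed layer sizes, not the paper's CDG strategy.

```python
# Channel gating: a learned sigmoid gate rescales (and can mute) channels.
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, x):                       # x: (N, C, H, W)
        g = self.fc(x.mean(dim=(2, 3)))         # per-channel gate from pooled stats
        return x * g[:, :, None, None]          # suppress gated-off channels

feat = torch.randn(4, 64, 16, 16)
print(ChannelGate(64)(feat).shape)              # torch.Size([4, 64, 16, 16])
```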
Domain generalization for semantic segmentation: a survey
Deep neural networks (DNNs) have made explicit contributions to autonomous driving and related tasks such as semantic segmentation, motion tracking, object detection, sensor fusion, and planning. However, in challenging situations, DNNs do not generalize because of the inherent domain shift arising from training under the i.i.d. assumption. The goal of semantic segmentation is to partition a given image into multiple meaningful categories for visual understanding. For semantic segmentation in particular, pixel-wise annotation is extremely costly and not always feasible. Domain generalization for semantic segmentation aims to learn pixel-level semantic labels from multiple source domains and to generalize to predict pixel-level semantic labels on unseen target domains. In this survey, we present, for the first time, a comprehensive review of domain generalization (DG) for semantic segmentation, summarizing recent works and establishing the importance of generalizing segmentation models to new environments. Although domain adaptation has received more attention in segmentation tasks than domain generalization, it is still worth unveiling the new trends that segmentation methods adopt from domain generalization. We cover most of the recent and dominant DG methods in the context of semantic segmentation and also discuss related applications. We conclude this survey by highlighting future directions in this area.
Towards Generalized UAV Object Detection: A Novel Perspective from Frequency Domain Disentanglement
When deploying unmanned aerial vehicle (UAV) object detection networks to complex, real-world scenes, generalization ability is often reduced due to domain shift. While most existing domain-generalized object detection methods disentangle domain-invariant features spatially, our exploratory experiments revealed a key insight for UAV object detection (UAV-OD): because UAV-OD targets much smaller objects than generic object detection, the contributions of different frequency components exhibit more pronounced disparities in generalization. Therefore, frequency domain disentanglement stands out as a more direct and effective approach for UAV-OD. This paper proposes a novel frequency domain disentanglement method to improve UAV-OD generalization. Specifically, our framework leverages two learnable filters that extract domain-invariant and domain-specific spectra. Additionally, we design two contrastive losses, an image-level loss and an instance-level loss, to guide training. These losses enable the filters to focus on extracting domain-invariant and domain-specific spectra, achieving better disentanglement. Extensive experiments across multiple datasets, including UAVDT and Visdrone2019-DET, using Faster R-CNN and YOLOv5, show that our approach consistently and significantly outperforms baseline and state-of-the-art domain generalization methods. Our code is available at https://github.com/wangkunyu241/UAV-Frequency.
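
The sketch below shows one simple way a learnable spectral filter can be realized: take a 2-D FFT of the input, multiply by a learned mask, and invert. The mask parameterization is an assumption and the paper's contrastive losses are not shown, so this is only a hedged illustration of the disentanglement idea.

```python
# Learnable frequency mask applied in the 2-D Fourier domain.
import torch
import torch.nn as nn

class SpectralFilter(nn.Module):
    def __init__(self, h: int, w: int):
        super().__init__()
        self.mask = nn.Parameter(torch.ones(h, w))   # learnable frequency mask

    def forward(self, x):                            # x: (N, C, H, W)
        spec = torch.fft.fft2(x)
        filtered = spec * torch.sigmoid(self.mask)   # keep or suppress frequencies
        return torch.fft.ifft2(filtered).real

img = torch.randn(2, 3, 64, 64)
print(SpectralFilter(64, 64)(img).shape)             # torch.Size([2, 3, 64, 64])
```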
Bridging the Source-to-Target Gap for Cross-Domain Person Re-identification with Intermediate Domains
Cross-domain person re-identification (re-ID), such as unsupervised domain adaptive re-ID (UDA re-ID), aims to transfer identity-discriminative knowledge from the source to the target domain. Existing methods commonly assume the source and target domains are isolated from each other, i.e., no intermediate status is modeled between them. Directly transferring knowledge between two isolated domains can be very difficult, especially when the domain gap is large. This paper, from a novel perspective, assumes these two domains are not completely isolated, but can be connected through a series of intermediate domains. Instead of directly aligning the source and target domains against each other, we propose to align them against their intermediate domains so as to facilitate a smooth knowledge transfer. To discover and utilize these intermediate domains, this paper proposes an Intermediate Domain Module (IDM) and a Mirrors Generation Module (MGM). IDM has two functions: (1) it generates multiple intermediate domains by mixing the hidden-layer features from the source and target domains, and (2) it dynamically reduces the domain gap between the source/target domain features and the intermediate domain features. While IDM achieves a good domain alignment effect, it introduces a side effect: the mix-up operation may mix the identities into a new identity and lose the original identities. Accordingly, MGM is introduced to compensate for the loss of the original identities by mapping the features into the IDM-generated intermediate domains without changing their identities. This allows the model to focus on minimizing domain variations and further promotes alignment between the source/target domains and the intermediate domains, reinforcing IDM into IDM++. We extensively evaluate our method under both the UDA and domain generalization (DG) scenarios and observe that IDM++ yields consistent (and usually significant) performance improvements for cross-domain re-ID, achieving a new state of the art. For example, on the challenging MSMT17 benchmark, IDM++ surpasses the prior state of the art by a large margin (e.g., up to 9.9% and 7.8% rank-1 accuracy) for the UDA and DG scenarios, respectively. Code is available at https://github.com/SikaStar/IDM.
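
A toy sketch of the feature-mixing idea behind intermediate-domain generation: blend source and target hidden features with a random coefficient. The Beta(0.5, 0.5) mixing distribution and feature size are assumptions; IDM's dynamic gap reduction and the MGM component are not represented.

```python
# Generate "intermediate domain" features by mixing source and target features.
import torch

def mix_intermediate(src_feats, tgt_feats, alpha=0.5):
    lam = torch.distributions.Beta(alpha, alpha).sample((src_feats.size(0), 1))
    return lam * src_feats + (1.0 - lam) * tgt_feats   # features "between" domains

src, tgt = torch.randn(8, 256), torch.randn(8, 256)
print(mix_intermediate(src, tgt).shape)                 # torch.Size([8, 256])
```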
DLOW: Domain Flow and Applications
In this work, we present a domain flow generation (DLOW) model to bridge two different domains by generating a continuous sequence of intermediate domains flowing from one domain to the other. The benefits of our DLOW model are twofold. First, it is able to transfer source images into a domain flow, which consists of images with smoothly changing distributions from the source to the target domain. The domain flow bridges the gap between the source and target domains, thus easing the domain adaptation task. Second, when multiple target domains are provided for training, our DLOW model is also able to generate new styles of images that are unseen in the training data. The new images are shown to mimic different artists and produce a natural blend of multiple art styles. Furthermore, for semantic segmentation under adverse weather conditions, we take advantage of our DLOW model to generate images with gradually changing fog density, which can readily be used to boost segmentation performance when combined with a curriculum learning strategy. We demonstrate the effectiveness of our model on benchmark datasets for different applications, including cross-domain semantic segmentation, style generalization, and foggy scene understanding. Our implementation is available at https://github.com/ETHRuiGong/DLOW.
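
As a rough, hedged illustration of a continuous "domain flow" indexed by a domainness value z in [0, 1], the toy code below simply interpolates per-channel statistics between a source and a target image. The actual DLOW model is a trained image-translation network; this sketch only conveys the continuous-interpolation idea.

```python
# Toy "flow" between two images by interpolating per-channel mean/std.
import torch

def blend_statistics(src, tgt, z: float):
    """src, tgt: (C, H, W). Shift src's channel statistics a fraction z toward tgt's."""
    s_mu, s_std = src.mean(dim=(1, 2), keepdim=True), src.std(dim=(1, 2), keepdim=True)
    t_mu, t_std = tgt.mean(dim=(1, 2), keepdim=True), tgt.std(dim=(1, 2), keepdim=True)
    mu = (1 - z) * s_mu + z * t_mu
    std = (1 - z) * s_std + z * t_std
    return (src - s_mu) / (s_std + 1e-6) * std + mu

src_img, tgt_img = torch.rand(3, 64, 64), torch.rand(3, 64, 64)
flow = [blend_statistics(src_img, tgt_img, z) for z in (0.0, 0.25, 0.5, 0.75, 1.0)]
```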
Comparing Handcrafted Features and Deep Neural Representations for Domain Generalization in Human Activity Recognition
Human Activity Recognition (HAR) has been studied extensively, yet current approaches are not capable of generalizing across different domains (i.e., subjects, devices, or datasets) with acceptable performance. This lack of generalization hinders the applicability of these models in real-world environments. As deep neural networks are becoming increasingly popular in recent work, there is a need for an explicit comparison between handcrafted and deep representations in Out-of-Distribution (OOD) settings. This paper compares both approaches in multiple domains using homogenized public datasets. First, we compare several metrics to validate three different OOD settings. In our main experiments, we then verify that even though deep learning initially outperforms models with handcrafted features, the situation is reversed as the distance from the training distribution increases. These findings support the hypothesis that handcrafted features may generalize better across specific domains.
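
For concreteness, the sketch below computes the kind of handcrafted per-window statistics such comparisons typically use for tri-axial accelerometer data; the window length and feature set are illustrative assumptions, not the paper's exact feature pipeline.

```python
# Handcrafted features for one window of tri-axial accelerometer data.
import numpy as np

def window_features(window: np.ndarray) -> np.ndarray:
    """window: (T, 3) accelerometer samples -> fixed-length feature vector."""
    mean = window.mean(axis=0)
    std = window.std(axis=0)
    energy = (window ** 2).mean(axis=0)
    mag = np.linalg.norm(window, axis=1)          # signal magnitude per sample
    return np.concatenate([mean, std, energy, [mag.mean(), mag.std()]])

win = np.random.randn(128, 3)                      # one 128-sample window
print(window_features(win).shape)                  # (11,)
```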