47 result(s) for "danger recognition"
The liver’s dilemma: sensing real danger in a sea of PAMPs: the (arterial) sinusoidal segment theory
The liver is susceptible to viral and bacterial infections, tumors, and sterile tissue damage, but immunological danger recognition in the liver is highly unconventional. When analyzing innate and adaptive immunity in the organ, the valid concepts that guide danger recognition and immune response in the periphery should be put aside. In the liver, the vascular anatomy is a game changer, as about 80% of the blood that percolates through the organ arrives from the hepatic portal vein, draining blood rich in molecules from the intestinal flora. This 24/7 exposure to high amounts of pathogen-associated molecular patterns (PAMPs) results in hepatic immunological tolerance. In the liver, dendritic cells, Kupffer cells (KCs), liver sinusoidal endothelial cells (LSECs), and even hepatocytes express PD-L1, a T lymphocyte downregulatory molecule. Most cells express Fas-L, IL-10, and TGF-β, have low levels of co-stimulatory molecules, and lack or have low levels of MHC-I and/or MHC-II expression. Moreover, other negative regulators such as CTLA-4, IDO-1, and prostaglandin E2 (PGE2) are regularly expressed. How, then, can real danger be discerned and recognized in this sea of PAMPs? This is an open question. Here, we hypothesize that conventional immunological danger recognition can occur in the liver, but only in specific and minor arterial sinusoidal segments. In the portal triad, where the hepatic artery ramifies into the stroma and carries arterial blood with no gut-derived PAMPs, there is no evolutionary or environmental pressure for immunosuppressive pathways, and conventional immunological danger recognition could occur. Therefore, in arterial sinusoidal segments with no sea of PAMPs, the liver could recognize real danger and support innate and adaptive immunity.
Realization of Efficient Exploration by Self-Generating Evaluation Considering Curiosity and Fear Indices Based on Prediction Error
In this paper, we propose a reinforcement-learning method in which an agent generates intrinsic rewards by evaluating its environment based on sensor data. The evaluation is performed from three perspectives, one of which concerns the predictability of sensor inputs. This perspective focuses on the magnitude of the prediction error: a large error suggests environmental uncertainty and leads to a low evaluation. However, prediction errors can arise from two distinct causes that previous methods fail to distinguish. The first is the intrinsic unpredictability of a situation, where high prediction error persists even after sufficient learning. Such states are likely to be dangerous and should be assigned low evaluations to discourage the agent from entering them. The second is insufficient learning of the predictor itself, where high prediction error simply reflects the need for further exploration. These states should be assigned high evaluations to encourage learning. To address these cases, this study introduces a method that evaluates environmental states by jointly considering the degree of danger and the necessity of further predictor learning, together with the learning progress for each sensor input. Two complementary evaluation indices are proposed: curiosity, which promotes exploration of insufficiently learned states, and fear, which discourages entry into unpredictable or hazardous states. The effectiveness of the proposed method was demonstrated through simulation experiments in a two-dimensional discrete environment.
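The distinction between the two causes of prediction error can be illustrated with a minimal sketch. This is not the paper's formulation: the class name `CuriosityFearEvaluator`, the use of visit counts as a proxy for learning progress, and the parameters `alpha` and `learned_after` are all assumptions made for illustration.

```python
from collections import defaultdict


class CuriosityFearEvaluator:
    """Toy per-state evaluator (hypothetical; update rules are assumptions)."""

    def __init__(self, alpha=0.5, learned_after=5):
        self.alpha = alpha                  # smoothing factor for the running error
        self.learned_after = learned_after  # visits after which a state counts as learned
        self.avg_error = defaultdict(float)
        self.visits = defaultdict(int)

    def update(self, state, prediction_error):
        """Record one observed prediction error for a state."""
        self.visits[state] += 1
        if self.visits[state] == 1:
            self.avg_error[state] = prediction_error
        else:
            a = self.alpha
            self.avg_error[state] = (1 - a) * self.avg_error[state] + a * prediction_error

    def learning_progress(self, state):
        # 0.0 for an unvisited state, 1.0 once the state is treated as learned.
        return min(1.0, self.visits[state] / self.learned_after)

    def curiosity(self, state):
        # High while the predictor has seen too little of this state.
        return 1.0 - self.learning_progress(state)

    def fear(self, state):
        # High only when the error stays large after enough learning.
        return self.learning_progress(state) * self.avg_error[state]

    def evaluate(self, state):
        return self.curiosity(state) - self.fear(state)


evaluator = CuriosityFearEvaluator()
for _ in range(5):
    evaluator.update("risky", 1.0)  # error persists after learning
    evaluator.update("safe", 0.0)   # error vanishes after learning
```

Under these assumptions an unvisited state receives the highest evaluation, a well-learned predictable state a neutral one, and a well-learned but still unpredictable state the lowest.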
Self-generation of reward by logarithmic transformation of multiple sensor evaluations
Although the design of the reward function in reinforcement learning is important, it is difficult to design a system that can adapt to a variety of environments and tasks. Therefore, we propose a method to autonomously generate rewards from sensor values, enabling task- and environment-independent reward design. Under this approach, environmental hazards are recognized by evaluating sensor values. The evaluation used for learning is obtained by integrating all the sensor evaluations that indicate danger. Although prior studies have employed weighted averages to integrate sensor evaluations, this approach does not reflect the increased danger that arises when a larger number of sensor evaluations indicate danger. Instead, we propose integrating the sensor evaluations using a logarithmic transformation. Through a path learning experiment, the proposed method was evaluated by comparing its rewards to those gained from manual reward setting and prior approaches.
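The contrast between averaging and logarithmic integration can be sketched as follows. The abstract does not give the exact formula, so this is only an illustration under stated assumptions: each danger-indicating evaluation lies in (0, 1], lower values mean more danger, and the integrated value is a plain sum of logarithms. The function names and the `eps` floor are hypothetical.

```python
import math


def weighted_average(evals, weights=None):
    """Weighted mean of the danger-indicating sensor evaluations."""
    if weights is None:
        weights = [1.0] * len(evals)
    return sum(w * e for w, e in zip(weights, evals)) / sum(weights)


def log_integration(evals, eps=1e-6):
    """Sum of log-evaluations (assumed form).

    Every additional evaluation below 1 adds a negative term, so more
    danger-indicating sensors always lower the integrated value.
    """
    return sum(math.log(max(e, eps)) for e in evals)


one_sensor = [0.2]        # one sensor reports danger
two_sensors = [0.2, 0.2]  # two sensors report the same level of danger

avg_one, avg_two = weighted_average(one_sensor), weighted_average(two_sensors)
log_one, log_two = log_integration(one_sensor), log_integration(two_sensors)
```

In this toy case the average is the same for one or two danger-indicating sensors, whereas the log-sum drops further when the second sensor also reports danger.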
CLIP-guided Prototype Modulating for Few-shot Action Recognition
Learning from large-scale contrastive language-image pre-training like CLIP has shown remarkable success in a wide range of downstream tasks recently, but it is still under-explored on the challenging few-shot action recognition (FSAR) task. In this work, we aim to transfer the powerful multimodal knowledge of CLIP to alleviate the inaccurate prototype estimation issue due to data scarcity, which is a critical problem in low-shot regimes. To this end, we present a CLIP-guided prototype modulating framework called CLIP-FSAR, which consists of two key components: a video-text contrastive objective and a prototype modulation. Specifically, the former bridges the task discrepancy between CLIP and the few-shot video task by contrasting videos and corresponding class text descriptions. The latter leverages the transferable textual concepts from CLIP to adaptively refine visual prototypes with a temporal Transformer. By this means, CLIP-FSAR can take full advantage of the rich semantic priors in CLIP to obtain reliable prototypes and achieve accurate few-shot classification. Extensive experiments on five commonly used benchmarks demonstrate the effectiveness of our proposed method, and CLIP-FSAR significantly outperforms existing state-of-the-art methods under various settings. The source code and models are publicly available at https://github.com/alibaba-mmai-research/CLIP-FSAR.
The impact of face masks on emotion recognition performance and perception of threat
Facial emotion recognition is crucial for social interaction. However, in times of a global pandemic, where wearing a face mask covering mouth and nose is widely encouraged to prevent the spread of disease, successful emotion recognition may be challenging. In the current study, we investigated whether emotion recognition, assessed by a validated emotion recognition task, is impaired for faces wearing a mask compared to uncovered faces, in a sample of 790 participants between 18 and 89 years (condition mask vs. original). In two more samples of 395 and 388 participants between 18 and 70 years, we assessed emotion recognition performance for faces that are occluded by something other than a mask, i.e., a bubble as well as only showing the upper part of the faces (condition half vs. bubble). Additionally, perception of threat for faces with and without occlusion was assessed. We found impaired emotion recognition for faces wearing a mask compared to faces without mask, for all emotions tested (anger, fear, happiness, sadness, disgust, neutral). Further, we observed that perception of threat was altered for faces wearing a mask. Upon comparison of the different types of occlusion, we found that, for most emotions and especially for disgust, there seems to be an effect that can be ascribed to the face mask specifically, both for emotion recognition performance and perception of threat. Methodological constraints as well as the importance of wearing a mask despite temporarily compromised social interaction are discussed.
Context Autoencoder for Self-supervised Representation Learning
We present a novel masked image modeling (MIM) approach, context autoencoder (CAE), for self-supervised representation pretraining. We pretrain an encoder by making predictions in the encoded representation space. The pretraining comprises two tasks: masked representation prediction (predicting the representations of the masked patches) and masked patch reconstruction (reconstructing the masked patches). The network is an encoder–regressor–decoder architecture: the encoder takes the visible patches as input; the regressor predicts the representations of the masked patches, which are expected to be aligned with the representations computed from the encoder, using the representations of visible patches and the positions of visible and masked patches; the decoder reconstructs the masked patches from the predicted encoded representations. The CAE design encourages the separation of learning the encoder (representation) from completing the pretraining tasks (masked representation prediction and masked patch reconstruction), and making predictions in the encoded representation space empirically benefits representation learning. We demonstrate the effectiveness of our CAE through superior transfer performance in downstream tasks: semantic segmentation, object detection and instance segmentation, and classification. The code will be available at https://github.com/Atten4Vis/CAE.
Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
Transferring knowledge from pre-trained deep models for downstream tasks, particularly with limited labeled samples, is a fundamental problem in computer vision research. Recent advances in large-scale, task-agnostic vision-language pre-trained models, which are learned with billions of samples, have shed new light on this problem. In this study, we investigate how to efficiently transfer aligned visual and textual knowledge for downstream visual recognition tasks. We first revisit the role of the linear classifier in the vanilla transfer learning framework, and then propose a new paradigm where the parameters of the classifier are initialized with semantic targets from the textual encoder and remain fixed during optimization. To provide a comparison, we also initialize the classifier with knowledge from various resources. In the empirical study, we demonstrate that our paradigm improves the performance and training speed of transfer learning tasks. With only minor modifications, our approach proves effective across 17 visual datasets that span three different data domains: image, video, and 3D point cloud.
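The idea of a classifier whose parameters come from a textual encoder and stay fixed can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the class `FrozenTextClassifier`, the two-dimensional embeddings, and the cosine-similarity scoring are assumptions, and real text embeddings would come from a pre-trained vision-language model.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


class FrozenTextClassifier:
    """Linear classifier whose weights are text embeddings and are never updated."""

    def __init__(self, text_embeddings):
        # One row per class; the rows stand in for text-encoder outputs (hypothetical values).
        self.weights = [l2_normalize(e) for e in text_embeddings]

    def logits(self, visual_feature):
        # Cosine similarity between the visual feature and each class embedding.
        f = l2_normalize(visual_feature)
        return [sum(w_i * f_i for w_i, f_i in zip(w, f)) for w in self.weights]

    def predict(self, visual_feature):
        scores = self.logits(visual_feature)
        return max(range(len(scores)), key=scores.__getitem__)


# Two classes with made-up embeddings; only the visual encoder would be trained.
classifier = FrozenTextClassifier([[1.0, 0.0], [0.0, 1.0]])
```

Because the classifier weights are fixed, optimization in such a setup would only adjust the visual features so that they align with the semantic targets.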
Exploring Vision-Language Models for Imbalanced Learning
Vision-language models (VLMs) that use contrastive language-image pre-training have shown promising zero-shot classification performance. However, their performance on imbalanced datasets, where the distribution of classes in the training data is skewed, is relatively poor, particularly when predicting minority classes. For instance, CLIP achieved only 5% accuracy on the iNaturalist18 dataset. We propose adding a lightweight decoder to VLMs to avoid the out-of-memory problem caused by a large number of classes and to capture nuanced features for tail classes. Then, we explore improvements of VLMs using prompt tuning, fine-tuning, and incorporating imbalanced algorithms such as Focal Loss, Balanced SoftMax and Distribution Alignment. Experiments demonstrate that the performance of VLMs can be further boosted when used with a decoder and imbalanced methods. Specifically, our improved VLMs significantly outperform zero-shot classification by an average accuracy of 6.58%, 69.82%, and 6.17% on ImageNet-LT, iNaturalist18, and Places-LT, respectively. We further analyze the influence of pre-training data size, backbones, and training cost. Our study highlights the significance of imbalanced learning algorithms in the face of VLMs pre-trained on huge data. We release our code at https://github.com/Imbalance-VLM/Imbalance-VLM.
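Two of the imbalanced-learning losses named above can be written compactly for a single example. The sketch below follows the commonly published definitions of Focal Loss and Balanced Softmax rather than the released code; the function names, the default `gamma`, and the class counts are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0):
    """Cross-entropy scaled by (1 - p_t)^gamma, which down-weights easy examples."""
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def balanced_softmax_loss(logits, target, class_counts):
    """Cross-entropy on logits shifted by the log of each class's training count."""
    adjusted = [z + math.log(n) for z, n in zip(logits, class_counts)]
    return cross_entropy(adjusted, target)


logits = [2.0, 0.0]
head_heavy_counts = [1000, 10]  # hypothetical counts: class 1 is the tail class
```

With `gamma=0` the focal loss reduces to plain cross-entropy, and with equal class counts the balanced softmax loss does too, since a constant shift of the logits leaves the softmax unchanged.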
Danger-Sensing/Pattern Recognition Receptors and Neuroinflammation in Alzheimer’s Disease
Fibrillar aggregates and soluble oligomers of both Amyloid-β peptides (Aβs) and hyperphosphorylated Tau proteins (p-Tau-es), as well as chronic neuroinflammation, are the main drivers causing progressive neuronal loss and dementia in Alzheimer’s disease (AD). However, the underlying pathogenetic mechanisms are still much disputed. Several endogenous neurotoxic ligands, including Aβs and/or p-Tau-es, activate innate immunity-related danger-sensing/pattern recognition receptors (PRRs), thereby advancing AD’s neuroinflammation and progression. The major PRR families involved include scavenger, Toll-like, NOD-like, AIM2-like, RIG-like, and CLEC-2 receptors, plus the calcium-sensing receptor (CaSR). This quite intricate picture stresses the need to identify the pathogenetically topmost Aβ-activated PRR, whose signaling would trigger AD’s three main drivers and their intra-brain spread. In theory, the candidate might belong to any PRR family. However, results of preclinical studies using in vitro nontumorigenic human cortical neurons and astrocytes and in vivo AD-model animals have started converging on the CaSR as the pathogenetically topmost PRR candidate. In fact, the CaSR binds both Ca2+ and Aβs and promotes the spread of both Ca2+ dyshomeostasis and AD’s three main drivers, causing progressive neuronal death. Since CaSR’s negative allosteric modulators block all these effects, CaSR’s candidacy for topmost pathogenetic PRR has assumed a growing therapeutic potential worth clinical testing.
Language-Aware Soft Prompting: Text-to-Text Optimization for Few- and Zero-Shot Adaptation of V&L Models
Soft prompt learning has emerged as a promising direction for adapting V&L models to a downstream task using a few training examples. However, current methods significantly overfit the training data, suffering from large accuracy degradation when tested on unseen classes from the same domain. In addition, all prior methods operate exclusively under the assumption that both vision and language data are present. To this end, we make the following 5 contributions: (1) To alleviate base class overfitting, we propose a novel Language-Aware Soft Prompting (LASP) learning method by means of a text-to-text cross-entropy loss that maximizes the probability of the learned prompts to be correctly classified with respect to pre-defined hand-crafted textual prompts. (2) To increase the representation capacity of the prompts, we also propose grouped LASP where each group of prompts is optimized with respect to a separate subset of textual prompts. (3) Moreover, we identify a visual-language misalignment introduced by prompt learning and LASP, and more importantly, propose a re-calibration mechanism to address it. (4) Importantly, we show that LASP is inherently amenable to including, during training, virtual classes, i.e. class names for which no visual samples are available, further increasing the robustness of the learned prompts. (5) Expanding for the first time the setting to language-only adaptation, we present a novel zero-shot variant of LASP where no visual samples at all are available for the downstream task. Through evaluations on 11 datasets, we show that our approach (a) significantly outperforms all prior works on soft prompting, and (b) matches and surpasses, for the first time, the accuracy on novel classes obtained by hand-crafted prompts and CLIP for 8 out of 11 test datasets. Finally, (c) we show that our zero-shot variant improves upon CLIP without requiring any extra data. Code will be made available.