818 result(s) for "Visuelle Aufmerksamkeit." (German: "visual attention")
An Overview of the Attention Mechanisms in Computer Vision
Deep convolutional neural networks (CNNs) play an important role in the field of computer vision and image processing. To further improve the performance of CNNs, scholars have conducted a series of new explorations, such as the improvement of activation functions, the construction of new loss functions, the regularization of parameters, and the development of new network structures. However, every breakthrough in CNNs comes from the innovation of network structure, whose design can be inspired by exploring the cognitive processes of the human brain. As one of the important features of the human visual system, the visual attention mechanism is essential in image generation, scene classification, and target detection and tracking when applied in the field of computer vision. Focusing on the models of attention mechanisms commonly used in computer vision, this overview summarizes their categorizations, principles, and outlook.
Top-Down Neural Attention by Excitation Backprop
We aim to model the top-down attention of a convolutional neural network (CNN) classifier for generating task-specific attention maps. Inspired by a top-down human visual attention model, we propose a new backpropagation scheme, called Excitation Backprop, to pass along top-down signals downwards in the network hierarchy via a probabilistic Winner-Take-All process. Furthermore, we introduce the concept of contrastive attention to make the top-down attention maps more discriminative. We show a theoretic connection between the proposed contrastive attention formulation and the Class Activation Map computation. Efficient implementation of Excitation Backprop for common neural network layers is also presented. In experiments, we visualize the evidence of a model’s classification decision by computing the proposed top-down attention maps. For quantitative evaluation, we report the accuracy of our method in weakly supervised localization tasks on the MS COCO, PASCAL VOC07 and ImageNet datasets. The usefulness of our method is further validated in the text-to-region association task. On the Flickr30k Entities dataset, we achieve promising performance in phrase localization by leveraging the top-down attention of a CNN model that has been trained on weakly labeled web images. Finally, we demonstrate applications of our method in model interpretation and data annotation assistance for facial expression analysis and medical imaging tasks.
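The abstract above describes passing top-down signals down the network through a probabilistic Winner-Take-All process but gives no formula. The following is only a minimal sketch of one such pass through a single linear layer, under assumptions that are not stated in this listing: activations are non-negative, only positive (excitatory) weights pass the signal, and each output neuron splits its probability among its inputs in proportion to activation times weight. The function name `excitation_backprop_layer` and all values are hypothetical, not the authors' implementation.

```python
def excitation_backprop_layer(activations, weights, top_probs):
    """Redistribute output-neuron probabilities onto input neurons.

    activations: non-negative input activations, length n_in (assumed)
    weights:     weights[i][j] connects input j to output i
    top_probs:   winning probabilities of the output neurons
    """
    n_in = len(activations)
    bottom_probs = [0.0] * n_in
    for i, p_out in enumerate(top_probs):
        # Only excitatory (positive-weight) connections pass the signal.
        excitation = [activations[j] * max(weights[i][j], 0.0) for j in range(n_in)]
        total = sum(excitation)
        if total == 0:
            continue  # this output has no excitatory input to pass probability to
        for j in range(n_in):
            # Each input receives a share proportional to its excitation.
            bottom_probs[j] += p_out * excitation[j] / total
    return bottom_probs


# Hypothetical toy layer: 3 inputs, 2 outputs, all signal placed on output 0.
activations = [1.0, 2.0, 0.5]
weights = [[0.5, -1.0, 1.0],
           [0.0, 1.0, 2.0]]
top_probs = [1.0, 0.0]
input_probs = excitation_backprop_layer(activations, weights, top_probs)
```

In this toy case the second input is connected to output 0 by a negative weight, so it receives no probability, and the remaining probability is shared between the first and third inputs.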
In search of the focus of attention in working memory: 13 years of the retro-cue effect
The concept of attention has a prominent place in cognitive psychology. Attention can be directed not only to perceptual information, but also to information in working memory (WM). Evidence for an internal focus of attention has come from the retro-cue effect: performance in tests of visual WM is improved when attention is guided to the test-relevant contents of WM ahead of testing them. The retro-cue paradigm has served as a test bed to empirically investigate the functions and limits of the focus of attention in WM. In this article, we review the growing body of (behavioral) studies on the retro-cue effect. We evaluate the degrees of experimental support for six hypotheses about what causes the retro-cue effect: (1) attention protects representations from decay, (2) attention prioritizes the selected WM contents for comparison with a probe display, (3) attended representations are strengthened in WM, (4) not-attended representations are removed from WM, (5) a retro-cue to the retrieval target provides a head start for its retrieval before decision making, and (6) attention protects the selected representation from perceptual interference. The extant evidence provides support for the last four of these hypotheses.
Salient object detection: A survey
Detecting and segmenting salient objects from natural scenes, often referred to as salient object detection, has attracted great interest in computer vision. While many models have been proposed and several applications have emerged, a deep understanding of achievements and issues remains lacking. We aim to provide a comprehensive review of recent progress in salient object detection and situate this field among other closely related areas such as generic scene segmentation, object proposal generation, and saliency for fixation prediction. Covering 228 publications, we survey i) roots, key concepts, and tasks, ii) core techniques and main modeling trends, and iii) datasets and evaluation metrics for salient object detection. We also discuss open problems such as evaluation metrics and dataset bias in model performance, and suggest future research directions.
Imagine, and you will find – Lack of attentional guidance through visual imagery in aphantasics
Aphantasia is the condition of reduced or absent voluntary imagery. So far, behavioural differences between aphantasics and non-aphantasics have hardly been studied as the base rate of those affected is quite low. The aim of the study was to examine if attentional guidance in aphantasics is impaired by their lack of visual imagery. In two visual search tasks, an already established one by Moriya (Attention, Perception, & Psychophysics, 80(5), 1127-1142, 2018) and a newly developed one, we examined whether aphantasics are primed less by their visual imagery than non-aphantasics. The sample in Study 1 consisted of 531 and the sample in Study 2 consisted of 325 age-matched pairs of aphantasics and non-aphantasics. Moriya’s task was not capable of showing the expected effect, whereas the newly developed task was. These results could mainly be attributed to different task characteristics. Therefore, a lack of attentional guidance through visual imagery in aphantasics can be assumed and interpreted as new evidence in the imagery debate, showing that mental images actually influence information processing and are not merely epiphenomena of propositional processing.
Contextual facilitation: Separable roles of contextual guidance and context suppression in visual search
Visual search is facilitated when targets are repeatedly encountered at a fixed position relative to an invariant distractor layout, compared to random distractor arrangements. However, standard investigations of this contextual-facilitation effect employ fixed distractor layouts that predict a constant target location, which does not always reflect real-world situations where the target location may vary relative to an invariant distractor arrangement. To explore the mechanisms involved in contextual learning, we employed a training-test procedure, introducing not only the standard full-repeated displays with fixed target-distractor locations but also distractor-repeated displays in which the distractor arrangement remained unchanged but the target locations varied. During the training phase, participants encountered three types of display: full-repeated, distractor-repeated, and random arrangements. The results revealed full-repeated displays to engender larger performance gains than distractor-repeated displays, relative to the random-display baseline. In the test phase, the gains were substantially reduced when full-repeated displays changed into distractor-repeated displays, while the transition from distractor-repeated to full-repeated displays failed to yield additional gains. We take this pattern to indicate that contextual learning can improve performance with both predictive and non-predictive (repeated) contexts, employing distinct mechanisms: contextual guidance and context suppression, respectively. We consider how these mechanisms might be implemented (neuro-)computationally.
A meta-analysis of contingent-capture effects
The present meta-analyses investigated the widely used contingent-capture protocol. Contingent-capture theory postulates that only top-down matching stimuli capture attention. Evidence comes from the contingent-capture protocol, in which participants search for a predefined target stimulus preceded by a spatial cue. The cue is typically uninformative of the target’s position but is presented either at the target position (valid condition) or away from the target (invalid condition). The common finding is that seemingly only top-down matching cues capture attention, as shown by a selective cueing effect (faster responses in valid than invalid conditions) only for cues with a feature similar to the searched-for target, but not for cues without a target-similar feature. The origin of this “contingent-capture effect” is, however, debated. One alternative explanation is that intertrial priming (the priming of attention capture by the cue in a given trial by attending to a feature-similar target in the preceding trial) mediates the contingent-capture effect. Alternatively, the rapid-disengagement account argues that all salient stimuli capture attention initially, but that the disengagement from non-matching cues is rapid. The present meta-analyses shed light on this debate by (a) identifying moderators of the size of reported contingent-capture effects (64 experiments) and (b) analyzing pure (blocked) versus mixed presentation of different targets as well as summarizing results of published intertrial priming studies (12 experiments) in the contingent-capture protocol. We found target-singleton versus non-singleton status and pure versus mixed presentation of different targets to be reliable moderators. Furthermore, results indicated the presence of publication bias. Otherwise, the contingent-capture theory was supported, but we discuss additional factors that must be taken into account for a full account of the results.
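The cueing effect mentioned in the abstract above (faster responses in valid than in invalid conditions) reduces to a difference of mean reaction times per cue type. The sketch below illustrates that arithmetic only. The trial records, the field names (`cue`, `valid`, `rt`), and every reaction time are invented for illustration and do not come from the meta-analyses.

```python
from statistics import mean


def cueing_effect(trials, cue_type):
    """Mean invalid RT minus mean valid RT (in ms) for one cue type.

    A positive value means responses were faster when the cue
    appeared at the target position (valid condition).
    """
    valid = [t["rt"] for t in trials if t["cue"] == cue_type and t["valid"]]
    invalid = [t["rt"] for t in trials if t["cue"] == cue_type and not t["valid"]]
    return mean(invalid) - mean(valid)


# Made-up reaction times in milliseconds (illustrative only).
trials = [
    {"cue": "matching", "valid": True, "rt": 480},
    {"cue": "matching", "valid": True, "rt": 500},
    {"cue": "matching", "valid": False, "rt": 540},
    {"cue": "matching", "valid": False, "rt": 560},
    {"cue": "non_matching", "valid": True, "rt": 520},
    {"cue": "non_matching", "valid": True, "rt": 530},
    {"cue": "non_matching", "valid": False, "rt": 525},
    {"cue": "non_matching", "valid": False, "rt": 535},
]

matching_effect = cueing_effect(trials, "matching")          # 550 - 490 = 60 ms
non_matching_effect = cueing_effect(trials, "non_matching")  # 530 - 525 = 5 ms
```

With these invented numbers, the pattern the abstract calls the contingent-capture effect would show up as a large cueing effect for target-matching cues and a near-zero one for non-matching cues.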
Individual differences in visual salience vary along semantic dimensions
What determines where we look? Theories of attentional guidance hold that image features and task demands govern fixation behavior, while differences between observers are interpreted as a “noise-ceiling” that strictly limits predictability of fixations. However, recent twin studies suggest a genetic basis of gaze-trace similarity for a given stimulus. This leads to the question of how individuals differ in their gaze behavior and what may explain these differences. Here, we investigated the fixations of >100 human adults freely viewing a large set of complex scenes containing thousands of semantically annotated objects. We found systematic individual differences in fixation frequencies along six semantic stimulus dimensions. These differences were large (>twofold) and highly stable across images and time. Surprisingly, they also held for first fixations directed toward each image, commonly interpreted as “bottom-up” visual salience. Their perceptual relevance was documented by a correlation between individual face salience and face recognition skills. The set of reliable individual salience dimensions and their covariance pattern replicated across samples from three different countries, suggesting they reflect fundamental biological mechanisms of attention. Our findings show stable individual differences in salience along a set of fundamental semantic dimensions and that these differences have meaningful perceptual implications. Visual salience reflects features of the observer as well as the image.
Studying visual attention using the multiple object tracking paradigm: A tutorial review
Human observers are capable of tracking multiple objects among identical distractors based only on their spatiotemporal information. Since the first report of this ability in the seminal work of Pylyshyn and Storm (1988, Spatial Vision, 3, 179–197), multiple object tracking has attracted many researchers. A reason for this is that it is commonly argued that the attentional processes studied with the multiple object paradigm apparently match the attentional processing during real-world tasks such as driving or team sports. We argue that multiple object tracking provides a good means to study the broader topic of continuous and dynamic visual attention. Indeed, several (partially contradicting) theories of attentive tracking have been proposed within the almost 30 years since its first report, and a large body of research has been conducted to test these theories. With regard to the richness and diversity of this literature, the aim of this tutorial review is to provide researchers who are new to the field of multiple object tracking with an overview of the multiple object tracking paradigm, its basic manipulations, as well as links to other paradigms investigating visual attention and working memory. Further, we aim at reviewing current theories of tracking as well as their empirical evidence. Finally, we review the state of the art in the most prominent research fields of multiple object tracking and how this research has helped to understand visual attention in dynamic settings.