Search Results

108 results for "Fleming, Roland"
An image-computable model of human visual shape similarity
Shape is a defining feature of objects, and human observers can effortlessly compare shapes to determine how similar they are. Yet, to date, no image-computable model can predict how visually similar or different shapes appear. Such a model would be an invaluable tool for neuroscientists and could provide insights into computations underlying human shape perception. To address this need, we developed a model (‘ShapeComp’), based on over 100 shape features (e.g., area, compactness, Fourier descriptors). When trained to capture the variance in a database of >25,000 animal silhouettes, ShapeComp accurately predicts human shape similarity judgments between pairs of shapes without fitting any parameters to human data. To test the model, we created carefully selected arrays of complex novel shapes using a Generative Adversarial Network trained on the animal silhouettes, which we presented to observers in a wide range of tasks. Our findings show that incorporating multiple ShapeComp dimensions facilitates the prediction of human shape similarity across a small number of shapes, and also captures much of the variance in the multiple arrangements of many shapes. ShapeComp outperforms both conventional pixel-based metrics and state-of-the-art convolutional neural networks, and can also be used to generate perceptually uniform stimulus sets, making it a powerful tool for investigating shape and object representations in the human brain.
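For orientation, here is a rough Python sketch of the kind of feature-based similarity the abstract describes: a handful of classic shape descriptors (area, perimeter, compactness, low-order Fourier descriptors) are computed from a silhouette contour, and two shapes are compared by a normalized distance in that feature space. The function names, feature choice, and normalization are my own assumptions; the published ShapeComp model uses more than 100 features calibrated on a large silhouette database, which this sketch does not attempt to reproduce.

import numpy as np

def shape_features(contour, n_fourier=8):
    # contour: (N, 2) array of x, y points tracing a closed silhouette outline.
    x, y = contour[:, 0], contour[:, 1]
    # Area via the shoelace formula.
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    # Perimeter: sum of edge lengths, closing the contour.
    closed = np.vstack([contour, contour[:1]])
    perimeter = np.sum(np.linalg.norm(np.diff(closed, axis=0), axis=1))
    # Compactness: 1 for a circle, larger for elongated or irregular shapes.
    compactness = perimeter ** 2 / (4 * np.pi * area)
    # Low-order Fourier descriptors of the complex contour,
    # normalized to be translation- and scale-invariant.
    z = (x - x.mean()) + 1j * (y - y.mean())
    fd = np.abs(np.fft.fft(z))
    fd = fd[1:n_fourier + 1] / (fd[1] + 1e-12)
    return np.concatenate([[area, perimeter, compactness], fd])

def shape_similarity(contour_a, contour_b):
    # Smaller values mean more similar shapes (distance in feature space).
    fa, fb = shape_features(contour_a), shape_features(contour_b)
    scale = np.maximum(np.abs(fa) + np.abs(fb), 1e-12)  # crude per-feature normalization
    return np.linalg.norm((fa - fb) / scale)

In the paper, many such descriptors are combined through dimensionality reduction before distances are taken; this sketch skips that step for brevity.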
Predicting precision grip grasp locations on three-dimensional objects
We rarely experience difficulty picking up objects, yet of all potential contact points on the surface, only a small proportion yield effective grasps. Here, we present extensive behavioral data alongside a normative model that correctly predicts human precision grasping of unfamiliar 3D objects. We tracked participants' forefinger and thumb as they picked up objects composed of 10 wood and brass cubes, configured to tease apart the effects of shape, weight, orientation, and mass distribution. Grasps were highly systematic and consistent across repetitions and participants. We employed these data to construct a model that combines five cost functions related to force closure, torque, natural grasp axis, grasp aperture, and visibility. Even without free parameters, the model predicts individual grasps almost as well as different individuals predict one another's, but fitting the weights reveals the relative importance of the different constraints. The model also accurately predicts human grasps on novel 3D-printed objects with more naturalistic geometries and is robust to perturbations in its key parameters. Together, the findings provide a unified account of how we successfully grasp objects of different 3D shape, orientation, mass, and mass distribution.
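A toy version of the cost-combination idea in this abstract can be written in a few lines: candidate thumb/finger contact pairs are scored by a weighted sum of simplified cost terms and the lowest-cost pair is selected. All cost definitions, constants, and weights below are illustrative stand-ins (the visibility term is omitted entirely); the paper defines each constraint from the object's actual 3D geometry rather than the proxies used here.

import numpy as np

PREFERRED_APERTURE = 0.06                  # assumed comfortable grip size, in metres
NATURAL_AXIS = np.array([1.0, 0.0, 0.0])   # assumed natural grasp axis in hand coordinates

def grasp_cost(thumb, finger, normal_t, normal_f, com, w=(1.0, 1.0, 1.0, 1.0)):
    axis = finger - thumb
    aperture = np.linalg.norm(axis)
    axis_dir = axis / aperture
    # Torque proxy: how far the grasp-axis midpoint lies from the centre of mass.
    torque = np.linalg.norm(0.5 * (thumb + finger) - com)
    # Force-closure proxy: outward surface normals should oppose each other
    # along the grasp axis (zero when opposition is perfect).
    closure = (1 + np.dot(normal_t, axis_dir)) + (1 - np.dot(normal_f, axis_dir))
    # Deviation from the preferred hand orientation.
    axis_dev = 1 - abs(np.dot(axis_dir, NATURAL_AXIS))
    # Penalise grips that are wider or narrower than the preferred aperture.
    aperture_cost = abs(aperture - PREFERRED_APERTURE)
    return w[0] * closure + w[1] * torque + w[2] * axis_dev + w[3] * aperture_cost

def best_grasp(candidates, com):
    # candidates: iterable of (thumb, finger, normal_t, normal_f) tuples of 3-vectors.
    return min(candidates, key=lambda c: grasp_cost(*c, com=com))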
The Veiled Virgin illustrates visual segmentation of shape by cause
Three-dimensional (3D) shape perception is one of the most important functions of vision. It is crucial for many tasks, from object recognition to tool use, and yet how the brain represents shape remains poorly understood. Most theories focus on purely geometrical computations (e.g., estimating depths, curvatures, symmetries). Here, however, we find that shape perception also involves sophisticated inferences that parse shapes into features with distinct causal origins. Inspired by marble sculptures such as Strazza’s The Veiled Virgin (1850), which vividly depict figures swathed in cloth, we created composite shapes by wrapping unfamiliar forms in textile, so that the observable surface relief was the result of complex interactions between the underlying object and overlying fabric. Making sense of such structures requires segmenting the shape based on the causes of its features, to distinguish whether lumps and ridges are due to the shrouded object or to the ripples and folds of the overlying cloth. Three-dimensional scans of the objects with and without the textile provided ground-truth measures of the true physical surface reliefs, against which observers’ judgments could be compared. In a virtual painting task, participants indicated which surface ridges appeared to be caused by the hidden object and which were due to the drapery. In another experiment, participants indicated the perceived depth profile of both surface layers. Their responses reveal that they can robustly distinguish features belonging to the textile from those due to the underlying object. Together, these findings reveal the operation of visual shape-segmentation processes that parse shapes based on their causal origin.
Unsupervised learning predicts human perception and misperception of gloss
Reflectance, lighting and geometry combine in complex ways to create images. How do we disentangle these to perceive individual properties, such as surface glossiness? We suggest that brains disentangle properties by learning to model statistical structure in proximal images. To test this hypothesis, we trained unsupervised generative neural networks on renderings of glossy surfaces and compared their representations with human gloss judgements. The networks spontaneously cluster images according to distal properties such as reflectance and illumination, despite receiving no explicit information about these properties. Intriguingly, the resulting representations also predict the specific patterns of ‘successes’ and ‘errors’ in human perception. Linearly decoding specular reflectance from the model’s internal code predicts human gloss perception better than ground truth, supervised networks or control models, and it predicts, on an image-by-image basis, illusions of gloss perception caused by interactions between material, shape and lighting. Unsupervised learning may underlie many perceptual dimensions in vision and beyond. Storrs et al. train unsupervised generative neural networks on glossy surfaces and show how gloss perception in humans may emerge in an unsupervised fashion from learning to model statistical structure.
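The pipeline described in this abstract (an unsupervised generative model first, then a linear readout of reflectance from its internal code) can be sketched with much simpler stand-ins. Below, PCA plays the role of the unsupervised representation and ridge regression the role of the linear decoder; the data are random placeholders, and nothing here corresponds to the authors' generative networks or rendered stimuli.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_images, n_pixels = 500, 32 * 32
images = rng.normal(size=(n_images, n_pixels))      # placeholder for rendered glossy surfaces
reflectance = rng.uniform(0.0, 1.0, size=n_images)  # placeholder ground-truth specular reflectance

# 1. Unsupervised stage: learn a low-dimensional code without any labels.
codes = PCA(n_components=10).fit_transform(images)

# 2. Linear readout: decode reflectance from the learned code.
c_train, c_test, r_train, r_test = train_test_split(codes, reflectance, random_state=0)
readout = Ridge(alpha=1.0).fit(c_train, r_train)
predicted_gloss = readout.predict(c_test)
# In the study, predictions like these were compared with human gloss judgements
# image by image, including the systematic errors ("illusions") observers make.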
Identifying shape transformations from photographs of real objects
An important task of human visual cognition is to make inferences about properties of objects. One such property is an object's causal history: what happened to the object in its past (e.g., "this paper has been folded"). There is relatively little research on whether and how we make such inferences. We took photographs of objects made from six different materials ('wax', 'aluminum foil', 'gold foil', 'chicken wire', 'putty', 'cardboard') transformed by one of four shape-altering transformations ('folded', 'bent', 'crumpled', 'twisted'). By varying the execution of the transformation and the viewpoint, we obtained 30 images of each material/transformation combination (720 images). We asked different groups of participants to: (1) name transformations and materials, (2) rate images with respect to the extent to which they belonged to each transformation or material class, and (3) classify images into the four transformation classes. Our results show that participants can infer transformations from object shape, with accuracy modulated by object material. This inference of causal history from observed object shape shows that we can distinguish between intrinsic (material) and extrinsic (transformation) properties of the object. The separation of observed shape features by their causal origin ('shape scission') presumably involves both perceptual and cognitive abilities.
Inferring shape transformations in a drawing task
Many objects and materials in our environment are subject to transformations that alter their shape. For example, branches bend in the wind, ice melts, and paper crumples. Still, we recognize objects and materials across these changes, suggesting we can distinguish an object’s original features from those caused by the transformations (“shape scission”). Yet, if we truly understand transformations, we should not only be able to identify their signatures but also actively apply the transformations to new objects (i.e., through imagination or mental simulation). Here, we investigated this ability using a drawing task. On a tablet computer, participants viewed a sample contour and its transformed version, and were asked to apply the same transformation to a test contour by drawing what the transformed test shape should look like. Thus, they had to (i) infer the transformation from the shape differences, (ii) envisage its application to the test shape, and (iii) draw the result. Our findings show that drawings were more similar to the ground truth transformed test shape than to the original test shape—demonstrating the inference and reproduction of transformations from observation. However, this was only observed for relatively simple shapes. The ability was also modulated by transformation type and magnitude but not by the similarity between sample and test shapes. Together, our findings suggest that we can distinguish between representations of original object shapes and their transformations, and can use visual imagery to mentally apply nonrigid transformations to observed objects, showing how we not only perceive but also ‘understand’ shape.
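The key comparison in this abstract (is a drawing closer to the transformed ground truth than to the original test shape?) can be illustrated with a simple contour metric. The symmetric mean nearest-point distance below, applied after centring and scaling, is a placeholder of my own; the study's actual similarity analysis may differ.

import numpy as np

def _normalize(contour):
    # Centre the contour and scale it to unit norm so position and size do not matter.
    c = contour - contour.mean(axis=0)
    return c / np.linalg.norm(c)

def contour_distance(a, b):
    # Symmetric mean nearest-point distance between two (N, 2) contours.
    a, b = _normalize(a), _normalize(b)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

def closer_to_transformed(drawing, transformed_gt, original_test):
    # True if the drawn contour is nearer to the transformed ground truth.
    return contour_distance(drawing, transformed_gt) < contour_distance(drawing, original_test)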
Visual perception of liquids: Insights from deep neural networks
Visually inferring material properties is crucial for many tasks, yet poses significant computational challenges for biological vision. Liquids and gels are particularly challenging due to their extreme variability and complex behaviour. We reasoned that measuring and modelling viscosity perception is a useful case study for identifying general principles of complex visual inferences. In recent years, artificial Deep Neural Networks (DNNs) have yielded breakthroughs in challenging real-world vision tasks. However, to model human vision, the emphasis lies not on best possible performance, but on mimicking the specific pattern of successes and errors humans make. We trained a DNN to estimate the viscosity of liquids using 100,000 simulations depicting liquids with sixteen different viscosities interacting in ten different scenes (stirring, pouring, splashing, etc.). We find that a shallow feedforward network trained for only 30 epochs predicts mean observer performance better than most individual observers. This is the first successful image-computable model of human viscosity perception. Further training improved accuracy, but predicted human perception less well. We analysed the network's features using representational similarity analysis (RSA) and a range of image descriptors (e.g., optic flow, colour saturation, GIST). This revealed clusters of units sensitive to specific classes of feature. We also find a distinct population of units that are poorly explained by hand-engineered features, but which are particularly important both for physical viscosity estimation and for the specific pattern of human responses. The final layers represent many distinct stimulus characteristics, not just the viscosity that the network was trained on. Retraining the fully connected layer with a reduced number of units achieves practically identical performance, but results in representations focused on viscosity, suggesting that network capacity is a crucial parameter determining whether artificial or biological neural networks use distributed vs. localized representations.
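The representational similarity analysis (RSA) mentioned in this abstract follows a generic recipe that can be stated compactly: build a representational dissimilarity matrix (RDM) over the same stimuli for two feature sets and correlate them. The sketch below uses random placeholder features and an assumed correlation-distance RDM; it is not the authors' analysis code.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(features):
    # features: (n_stimuli, n_dims) -> condensed vector of pairwise correlation distances.
    return pdist(features, metric="correlation")

rng = np.random.default_rng(1)
layer_activations = rng.normal(size=(100, 256))  # placeholder DNN layer responses to 100 stimuli
image_descriptor = rng.normal(size=(100, 32))    # placeholder hand-engineered features (e.g. GIST)

# Spearman correlation between the two RDMs: how similarly the two feature
# spaces organise the same set of stimuli.
rho, _ = spearmanr(rdm(layer_activations), rdm(image_descriptor))
print(f"RSA correlation between layer and descriptor RDMs: {rho:.3f}")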
One-shot generalization in humans revealed through a drawing task
Humans have the amazing ability to learn new visual concepts from just a single exemplar. How we achieve this remains mysterious. State-of-the-art theories suggest observers rely on internal ‘generative models’, which not only describe observed objects, but can also synthesize novel variations. However, compelling evidence for generative models in human one-shot learning remains sparse. In most studies, participants merely compare candidate objects created by the experimenters, rather than generating their own ideas. Here, we overcame this key limitation by presenting participants with 2D ‘Exemplar’ shapes and asking them to draw their own ‘Variations’ belonging to the same class. The drawings reveal that participants inferred—and synthesized—genuine novel categories that were far more varied than mere copies. Yet, there was striking agreement between participants about which shape features were most distinctive, and these tended to be preserved in the drawn Variations. Indeed, swapping distinctive parts caused objects to swap apparent category. Our findings suggest that internal generative models are key to how humans generalize from single exemplars. When observers see a novel object for the first time, they identify its most distinctive features and infer a generative model of its shape, allowing them to mentally synthesize plausible variants.
Where do the hypotheses come from? Data-driven learning in science and the brain
Everyone agrees that testing hypotheses is important, but Bowers et al. provide scant details about where hypotheses about perception and brain function should come from. We suggest that the answer lies in considering how information about the outside world could be acquired – that is, learned – over the course of evolution and development. Deep neural networks (DNNs) provide one tool to address this question.
Effects of material properties and object orientation on precision grip kinematics
Successfully picking up and handling objects requires taking into account their physical properties (e.g., material) and position relative to the body. Such features are often inferred by sight, but it remains unclear to what extent observers vary their actions depending on the perceived properties. To investigate this, we asked participants to grasp, lift and carry cylinders to a goal location with a precision grip. The cylinders were made of four different materials (Styrofoam, wood, brass and an additional brass cylinder covered with Vaseline) and were presented at six different orientations with respect to the participant (0°, 30°, 60°, 90°, 120°, 150°). Analysis of their grasping kinematics revealed differences in timing and spatial modulation at all stages of the movement that depended on both material and orientation. Object orientation affected the spatial configuration of index finger and thumb during the grasp, but also the timing of handling and transport duration. Material affected the choice of local grasp points and the duration of the movement from the first visual input until release of the object. We find that conditions that make grasping more difficult (orientation with the base pointing toward the participant, high weight and low surface friction) lead to longer durations of individual movement segments and a more careful placement of the fingers on the object.
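As a final illustration, the kinematic quantities discussed in this abstract (grip aperture and movement durations) can be computed from tracked marker positions along the following lines. The variable names, sampling assumptions, and the velocity threshold are my own, not the paper's analysis pipeline.

import numpy as np

def grip_aperture(thumb_xyz, finger_xyz):
    # (T, 3) position arrays for thumb and forefinger -> aperture per frame.
    return np.linalg.norm(finger_xyz - thumb_xyz, axis=1)

def movement_duration(positions, dt, speed_threshold=0.02):
    # Duration between the first and last frame whose speed exceeds the
    # threshold (in the same spatial units per second as the positions).
    speed = np.linalg.norm(np.diff(positions, axis=0), axis=1) / dt
    moving = np.flatnonzero(speed > speed_threshold)
    if moving.size == 0:
        return 0.0
    return (moving[-1] - moving[0] + 1) * dt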