30 result(s) for "Seybold, Bryan"
Long-term modification of cortical synapses improves sensory perception
By pairing acoustic stimuli and electrical stimulation of the nucleus basalis neuromodulatory system in rats, the authors show an induction of long-lasting synaptic modifications of the auditory cortex that conserved excitation across the auditory receptive fields. This type of modification also improved auditory sensory detection and behavioral performance in tone perception. Synapses and receptive fields of the cerebral cortex are plastic. However, changes to specific inputs must be coordinated within neural networks to ensure that excitability and feature selectivity are appropriately configured for perception of the sensory environment. We induced long-lasting enhancements and decrements to excitatory synaptic strength in rat primary auditory cortex by pairing acoustic stimuli with activation of the nucleus basalis neuromodulatory system. Here we report that these synaptic modifications were approximately balanced across individual receptive fields, conserving mean excitation while reducing overall response variability. Decreased response variability should increase detection and recognition of near-threshold or previously imperceptible stimuli. We confirmed both of these hypotheses in behaving animals. Thus, modification of cortical inputs leads to wide-scale synaptic changes, which are related to improved sensory perception and enhanced behavioral performance.
3D mouse pose from single-view video and a new dataset
We present a method to infer the 3D pose of mice, including the limbs and feet, from monocular videos. Many human clinical conditions and their corresponding animal models result in abnormal motion, and accurately measuring 3D motion at scale offers insights into health. The 3D poses improve classification of health-related attributes over 2D representations. The inferred poses are accurate enough to estimate stride length even when the feet are mostly occluded. This method could be applied as part of a continuous monitoring system to non-invasively measure animal health, as demonstrated by its use in successfully classifying animals based on age and genotype. We introduce the Mouse Pose Analysis Dataset, the first large scale video dataset of lab mice in their home cage with ground truth keypoint and behavior labels. The dataset also contains high resolution mouse CT scans, which we use to build the shape models for 3D pose reconstruction.
Chronic reduction in inhibition reduces receptive field size in mouse auditory cortex
Inhibitory interneurons regulate the responses of cortical circuits. In auditory cortical areas, inhibition from these neurons narrows spectral tuning and shapes response dynamics. Acute disruptions of inhibition expand spectral receptive fields. However, the effects of long-term perturbations of inhibitory circuitry on auditory cortical responses are unknown. We ablated ∼30% of dendrite-targeting cortical inhibitory interneurons after the critical period by studying mice with a conditional deletion of Dlx1. Following the loss of interneurons, baseline firing rates rose and tone-evoked responses became less sparse in auditory cortex. However, contrary to acute blockades of inhibition, the sizes of spectral receptive fields were reduced, demonstrating both higher thresholds and narrower bandwidths. Furthermore, long-latency responses at the edge of the receptive field were absent. On the basis of changes in response dynamics, the mechanism for the reduction in receptive field size appears to be a compensatory loss of cortico-cortically (CC) driven responses. Our findings suggest chronic conditions that feature changes in inhibitory circuitry are not likely to be well modeled by acute network manipulations, and compensation may be a critical component of chronic neuronal conditions.
Similar Auditory Cortical Suppression by Distinct Mechanisms: Homeostasis, Inhibition, and Background Noise
The auditory cortex is critical for the understanding of speech. This task is accomplished through the nonlinear network interactions between many neurons. Cortical neurons are grouped into distinct types depending on whether they release excitatory or inhibitory neurotransmitters and on their molecular, biophysical, and morphological properties. Because cortical network interactions are nonlinear, perturbing these networks can produce counterintuitive results. To understand how auditory cortex accomplishes complex tasks like speech comprehension, we need to understand how nonlinearities shape network processing. This dissertation provides examples in rodent auditory cortex of manipulations that produce straightforward effects in single cells or small portions of the parameter range, but, in some cases, opposite effects in cortical networks. In chapter 1, I chronically reduced the level of inhibition in the cortex using Dlx1 knockout mice, which should expand frequency tuning in auditory cortex, but observed reduced frequency tuning. Homeostatic changes over time nonlinearly changed expansion into reduction. In chapter 2, I acutely activated two populations of interneurons that express either somatostatin or parvalbumin, which produce different forms of linear suppression in vitro, but observed the same suppression in vivo. The nonlinear elements of the recurrent cortical network obscured the type of linear suppression. In chapter 3, I added background noise, which suppresses tone-evoked firing rates quasi-linearly at low intensities, but observed that noise-related suppression increased nonlinearly with noise intensity. The nonlinear mechanisms that preserve stimulus information in the presence of noise are less robust at high noise intensities. In each chapter, nonlinear effects led to unexpected results, highlighting the need to interpret results in the context of nonlinear networks to understand cortical processing.
CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers
We extend multimodal transformers to include 3D camera motion as a conditioning signal for the task of video generation. Generative video models are becoming increasingly powerful, thus focusing research efforts on methods of controlling the output of such models. We propose to add virtual 3D camera controls to generative video methods by conditioning generated video on an encoding of three-dimensional camera movement over the course of the generated video. Our results demonstrate (1) that we can successfully control the camera during video generation, starting from a single frame and a camera signal, and (2) the accuracy of the generated 3D camera paths, as measured using traditional computer vision methods.
What's in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics
While there have been significant gains in the field of automated video description, the generalization performance of automated description models to novel domains remains a major barrier to using these systems in the real world. Most visual description methods are known to capture and exploit patterns in the training data leading to evaluation metric increases, but what are those patterns? In this work, we examine several popular visual description datasets, and capture, analyze, and understand the dataset-specific linguistic patterns that models exploit but that do not generalize to new domains. At the token level, sample level, and dataset level, we find that caption diversity is a major driving factor behind the generation of generic and uninformative captions. We further show that state-of-the-art models even outperform held-out ground truth captions on modern metrics, and that this effect is an artifact of linguistic diversity in datasets. Understanding this linguistic diversity is key to building strong captioning models; we recommend several methods and approaches for maintaining diversity in the collection of new data, and for dealing with the consequences of limited diversity when using current models and metrics.
Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features
Detecting actions in untrimmed videos should not be limited to a small, closed set of classes. We present a simple, yet effective strategy for open-vocabulary temporal action detection utilizing pretrained image-text co-embeddings. Despite being trained on static images rather than videos, we show that image-text co-embeddings enable open-vocabulary performance competitive with fully-supervised models. We show that the performance can be further improved by ensembling the image-text features with features encoding local motion, like optical flow based features, or other modalities, like audio. In addition, we propose a more reasonable open-vocabulary evaluation setting for the ActivityNet data set, where the category splits are based on similarity rather than random assignment.
Video Foundation Models for Animal Behavior Analysis
Computational approaches leveraging computer vision and machine learning have transformed the quantification of animal behavior from video. However, existing methods often rely on task-specific features or models, which struggle to generalize across diverse datasets and tasks. Recent advances in machine learning, particularly the emergence of vision foundation models, i.e., large-scale models pre-trained on massive, diverse visual repositories, offer a way to tackle these challenges. Here, we investigate the potential of frozen video foundation models across a range of behavior analysis tasks, including classification, retrieval, and localization. We use a single, frozen model to extract general-purpose representations from video data, and perform extensive evaluations on diverse open-sourced animal behavior datasets. Our results demonstrate that features with minimal adaptation from foundation models achieve competitive performance compared to existing methods specifically designed for each dataset, across species, behaviors, and experimental contexts. This highlights the potential of frozen video foundation models as a powerful and accessible backbone for automated behavior analysis, with the ability to accelerate research across diverse fields from neuroscience, to ethology, and to ecology.
Learning Audio-Video Modalities from Image Captions
A major challenge in text-video and text-audio retrieval is the lack of large-scale training data. This is unlike image-captioning, where datasets are on the order of millions of samples. To close this gap we propose a new video mining pipeline which involves transferring captions from image captioning datasets to video clips with no additional manual effort. Using this pipeline, we create a new large-scale, weakly labelled audio-video captioning dataset consisting of millions of paired clips and captions. We show that training a multimodal transformer-based model on this data achieves competitive performance on video retrieval and video captioning, matching or even outperforming HowTo100M pretraining with 20x fewer clips. We also show that our mined clips are suitable for text-audio pretraining, and achieve state-of-the-art results for the task of audio retrieval.
Optical Mouse: 3D Mouse Pose From Single-View Video
We present a method to infer the 3D pose of mice, including the limbs and feet, from monocular videos. Many human clinical conditions and their corresponding animal models result in abnormal motion, and accurately measuring 3D motion at scale offers insights into health. The 3D poses improve classification of health-related attributes over 2D representations. The inferred poses are accurate enough to estimate stride length even when the feet are mostly occluded. This method could be applied as part of a continuous monitoring system to non-invasively measure animal health.