Search Results Heading

MBRLSearchResults

mbrl.module.common.modules.added.book.to.shelf
Title added to your shelf!
View what I already have on My Shelf.
Oops! Something went wrong.
Oops! Something went wrong.
While trying to add the title to your shelf something went wrong :( Kindly try again later!
Are you sure you want to remove the book from the shelf?
Oops! Something went wrong.
Oops! Something went wrong.
While trying to remove the title from your shelf something went wrong :( Kindly try again later!
    Done
    Filters
    Reset
  • Language
      Language
      Clear All
      Language
  • Subject
      Subject
      Clear All
      Subject
  • Item Type
      Item Type
      Clear All
      Item Type
  • Discipline
      Discipline
      Clear All
      Discipline
  • Year
      Year
      Clear All
      From:
      -
      To:
  • More Filters
178 result(s) for "LeCun, Yann"
Sort by:
To Compress or Not to Compress—Self-Supervised Learning and Information Theory: A Review
Deep neural networks excel in supervised learning tasks but are constrained by the need for extensive labeled data. Self-supervised learning emerges as a promising alternative, allowing models to learn without explicit labels. Information theory has shaped deep neural networks, particularly the information bottleneck principle. This principle optimizes the trade-off between compression and preserving relevant information, providing a foundation for efficient network design in supervised contexts. However, its precise role and adaptation in self-supervised learning remain unclear. In this work, we scrutinize various self-supervised learning approaches from an information-theoretic perspective, introducing a unified framework that encapsulates the self-supervised information-theoretic learning problem. This framework includes multiple encoders and decoders, suggesting that all existing work on self-supervised learning can be seen as specific instances. We aim to unify these approaches to understand their underlying principles better and address the main challenge: many works present different frameworks with differing theories that may seem contradictory. By weaving existing research into a cohesive narrative, we delve into contemporary self-supervised methodologies, spotlight potential research areas, and highlight inherent challenges. Moreover, we discuss how to estimate information-theoretic quantities and their associated empirical problems. Overall, this paper provides a comprehensive review of the intersection of information theory, self-supervised learning, and deep neural networks, aiming for a better understanding through our proposed unified approach.
A hierarchical loss and its problems when classifying non-hierarchically
Failing to distinguish between a sheepdog and a skyscraper should be worse and penalized more than failing to distinguish between a sheepdog and a poodle; after all, sheepdogs and poodles are both breeds of dogs. However, existing metrics of failure (so-called \"loss\" or \"win\") used in textual or visual classification/recognition via neural networks seldom leverage a-priori information, such as a sheepdog being more similar to a poodle than to a skyscraper. We define a metric that, inter alia, can penalize failure to distinguish between a sheepdog and a skyscraper more than failure to distinguish between a sheepdog and a poodle. Unlike previously employed possibilities, this metric is based on an ultrametric tree associated with any given tree organization into a semantically meaningful hierarchy of a classifier's classes. An ultrametric tree is a tree with a so-called ultrametric distance metric such that all leaves are at the same distance from the root. Unfortunately, extensive numerical experiments indicate that the standard practice of training neural networks via stochastic gradient descent with random starting points often drives down the hierarchical loss nearly as much when minimizing the standard cross-entropy loss as when trying to minimize the hierarchical loss directly. Thus, this hierarchical loss is unreliable as an objective for plain, randomly started stochastic gradient descent to minimize; the main value of the hierarchical loss may be merely as a meaningful metric of success of a classifier.
Catalyzing next-generation Artificial Intelligence through NeuroAI
Neuroscience has long been an essential driver of progress in artificial intelligence (AI). We propose that to accelerate progress in AI, we must invest in fundamental research in NeuroAI. A core component of this is the embodied Turing test, which challenges AI animal models to interact with the sensorimotor world at skill levels akin to their living counterparts. The embodied Turing test shifts the focus from those capabilities like game playing and language that are especially well-developed or uniquely human to those capabilities – inherited from over 500 million years of evolution – that are shared with all animals. Building models that can pass the embodied Turing test will provide a roadmap for the next generation of AI. One of the ambitions of computational neuroscience is that we will continue to make improvements in the field of artificial intelligence that will be informed by advances in our understanding of how the brains of various species evolved to process information. To that end, here the authors propose an expanded version of the Turing test that involves embodied sensorimotor interactions with the world as a new framework for accelerating progress in artificial intelligence.
Feature learning and deep architectures: new directions for music informatics
As we look to advance the state of the art in content-based music informatics, there is a general sense that progress is decelerating throughout the field. On closer inspection, performance trajectories across several applications reveal that this is indeed the case, raising some difficult questions for the discipline: why are we slowing down, and what can we do about it? Here, we strive to address both of these concerns. First, we critically review the standard approach to music signal analysis and identify three specific deficiencies to current methods: hand-crafted feature design is sub-optimal and unsustainable, the power of shallow architectures is fundamentally limited, and short-time analysis cannot encode musically meaningful structure. Acknowledging breakthroughs in other perceptual AI domains, we offer that deep learning holds the potential to overcome each of these obstacles. Through conceptual arguments for feature learning and deeper processing architectures, we demonstrate how deep processing models are more powerful extensions of current methods, and why now is the time for this paradigm shift. Finally, we conclude with a discussion of current challenges and the potential impact to further motivate an exploration of this promising research area.
Deep learning
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
Universal halting times in optimization and machine learning
We present empirical evidence that the halting times for a class of optimization algorithms are universal. The algorithms we consider come from quadratic optimization, spin glasses and machine learning. A universality theorem is given in the case of the quadratic gradient descent flow. More precisely, given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time of the algorithm follow a distribution that, after centering and scaling, appears invariant under changes in the distribution on the landscape — universality is present.
Unscented Kalman Filter for Brain-Machine Interfaces
Brain machine interfaces (BMIs) are devices that convert neural signals into commands to directly control artificial actuators, such as limb prostheses. Previous real-time methods applied to decoding behavioral commands from the activity of populations of neurons have generally relied upon linear models of neural tuning and were limited in the way they used the abundant statistical information contained in the movement profiles of motor tasks. Here, we propose an n-th order unscented Kalman filter which implements two key features: (1) use of a non-linear (quadratic) model of neural tuning which describes neural activity significantly better than commonly-used linear tuning models, and (2) augmentation of the movement state variables with a history of n-1 recent states, which improves prediction of the desired command even before incorporating neural activity information and allows the tuning model to capture relationships between neural activity and movement at multiple time offsets simultaneously. This new filter was tested in BMI experiments in which rhesus monkeys used their cortical activity, recorded through chronically implanted multielectrode arrays, to directly control computer cursors. The 10th order unscented Kalman filter outperformed the standard Kalman filter and the Wiener filter in both off-line reconstruction of movement trajectories and real-time, closed-loop BMI operation.