Search Results
70 results for "Avron, Haim"
Tensor-tensor algebra for optimal representation and compression of multiway data
With the advent of machine learning and its overarching pervasiveness, it is imperative to devise ways to represent large datasets efficiently while distilling intrinsic features necessary for subsequent analysis. The primary workhorse used in data dimensionality reduction and feature extraction has been the matrix singular value decomposition (SVD), which presupposes that data have been arranged in matrix format. A primary goal in this study is to show that high-dimensional datasets are more compressible when treated as tensors (i.e., multiway arrays) and compressed via tensor-SVDs under the tensor-tensor product construct and its generalizations. We begin by proving Eckart–Young optimality results for families of tensor-SVDs under two different truncation strategies. Since such optimality properties can be proven in both matrix and tensor-based algebras, a fundamental question arises: Does the tensor construct subsume the matrix construct in terms of representation efficiency? The answer is positive, as proven by showing that a tensor-tensor representation of an equal-dimensional spanning space can be superior to its matrix counterpart. We then use these optimality results to investigate how the compressed representation provided by the truncated tensor-SVD is related, both theoretically and empirically, to its two closest tensor-based analogs, the truncated higher-order SVD and the truncated tensor-train SVD.
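For readers unfamiliar with the tensor-tensor product framework this abstract refers to, the sketch below illustrates the standard t-product construction (FFT along the third mode, facewise matrix products) and the truncated tensor-SVD built on it. This is a minimal NumPy illustration of the general technique, not code from the paper; all function names are ours.

```python
import numpy as np

def t_product(A, B):
    """t-product of tensors A (n1 x n2 x n3) and B (n2 x m x n3):
    FFT along the third mode, facewise matrix multiply, inverse FFT."""
    Ah = np.fft.fft(A, axis=2)
    Bh = np.fft.fft(B, axis=2)
    Ch = np.einsum('ijk,jlk->ilk', Ah, Bh)   # per-slice matrix products
    return np.real(np.fft.ifft(Ch, axis=2))

def t_transpose(A):
    """Tensor transpose: transpose each frontal slice, reverse slices 2..n3."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def t_svd(A, k):
    """Rank-k truncated tensor-SVD under the t-product: SVD each frontal
    slice in the Fourier domain, keep the k leading singular tuples, and
    transform back, so A ~= t_product(t_product(U, S), t_transpose(V))."""
    n1, n2, n3 = A.shape
    Ah = np.fft.fft(A, axis=2)
    Uh = np.zeros((n1, k, n3), dtype=complex)
    Sh = np.zeros((k, k, n3), dtype=complex)
    Vh = np.zeros((n2, k, n3), dtype=complex)
    for i in range(n3):
        u, s, vt = np.linalg.svd(Ah[:, :, i], full_matrices=False)
        Uh[:, :, i] = u[:, :k]
        Sh[:, :, i] = np.diag(s[:k])
        Vh[:, :, i] = vt[:k, :].conj().T
    # for real A, the inverse FFTs are real up to floating-point noise
    return (np.real(np.fft.ifft(Uh, axis=2)),
            np.real(np.fft.ifft(Sh, axis=2)),
            np.real(np.fft.ifft(Vh, axis=2)))
```

Slice-wise truncation in the transform domain is one of the truncation strategies for which Eckart–Young-type optimality is established in the t-product setting.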
Faster Subset Selection for Matrices and Applications
We study the following problem of subset selection for matrices: given a matrix $\mathbf{X} \in \mathbb{R}^{n \times m}$ ($m > n$) and a sampling parameter $k$ ($n \le k \le m$), select a subset of $k$ columns from $\mathbf{X}$ such that the pseudoinverse of the sampled matrix has as small a norm as possible. In this work, we focus on the Frobenius and the spectral matrix norms. We describe several novel (deterministic and randomized) approximation algorithms for this problem with approximation bounds that are optimal up to constant factors. Additionally, we show that the combinatorial problem of finding a low-stretch spanning tree in an undirected graph corresponds to subset selection, and discuss various implications of this reduction.
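As a concrete illustration of the randomized side of this problem (a common baseline, not the paper's algorithm): sample columns with probabilities proportional to their leverage scores, then measure the Frobenius norm of the pseudoinverse of the sampled submatrix.

```python
import numpy as np

def sample_columns_by_leverage(X, k, seed=0):
    """Illustrative randomized column subset selection: sample k columns of
    X (n x m, m > n) with probability proportional to leverage scores, and
    report ||pinv(X_S)||_F for the sampled submatrix X_S."""
    rng = np.random.default_rng(seed)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)  # Vt is n x m
    scores = (Vt ** 2).sum(axis=0)                    # column leverage scores (sum to n)
    cols = rng.choice(X.shape[1], size=k, replace=False, p=scores / scores.sum())
    X_S = X[:, cols]
    return cols, np.linalg.norm(np.linalg.pinv(X_S), 'fro')

# example: a 20 x 100 Gaussian matrix, sampling k = 40 columns
X = np.random.default_rng(1).standard_normal((20, 100))
cols, cost = sample_columns_by_leverage(X, 40)
```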
Dimensionality reduction of longitudinal ’omics data using modern tensor factorizations
Longitudinal ’omics analytical methods are extensively used in the evolving field of precision medicine, by enabling ‘big data’ recording and high-resolution interpretation of complex datasets, driven by individual variations in response to perturbations such as disease pathogenesis, medical treatment, or changes in lifestyle. However, inherent technical limitations in biomedical studies often result in the generation of feature-rich and sample-limited datasets. Analyzing such data with conventional modalities often proves challenging, since the repeated, high-dimensional measurements overload the outlook with inconsequential variations that must be filtered out in order to find the true, biologically relevant signal. Tensor methods for the analysis and meaningful representation of multiway data may prove useful to the biological research community given their ability to tackle this challenge. In this study, we present tcam, a new unsupervised tensor factorization method for the analysis of multiway data. Building on cutting-edge developments in the field of tensor-tensor algebra, we characterize the unique mathematical properties of our method, namely: (1) preservation of geometric and statistical traits of the data, which enables uncovering information beyond the inter-individual variation that often dominates the focus, especially in human studies; and (2) a natural and straightforward out-of-sample extension, making tcam amenable to integration in machine learning workflows. A series of re-analyses of real-world human experimental datasets showcases these theoretical properties, while providing empirical confirmation of tcam’s utility in the analysis of longitudinal ’omics data.
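The "natural out-of-sample extension" claim can be illustrated with the t-SVD sketch given earlier in this list (a generic tensor-algebra illustration, not the tcam package's API): since A ≈ U * S * Vᵀ under the t-product, factor scores for the training samples are A * V, and a held-out sample is embedded by applying the same map.

```python
import numpy as np
# reuses t_product and t_svd from the t-SVD sketch above

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 50, 8))      # samples x features x time
U, S, V = t_svd(A, k=3)
scores_train = t_product(A, V)            # per-sample factor scores: A * V = U * S
X_new = rng.standard_normal((1, 50, 8))   # a held-out sample on the same feature/time grid
scores_new = t_product(X_new, V)          # out-of-sample embedding via the same linear map
```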
Blendenpik: Supercharging LAPACK's Least-Squares Solver
Several innovative random-sampling and random-mixing techniques for solving problems in linear algebra have been proposed in the last decade, but they have not yet made a significant impact on numerical linear algebra. We show that by using a high-quality implementation of one of these techniques, we obtain a solver that performs extremely well in the traditional yardsticks of numerical linear algebra: it is significantly faster than high-performance implementations of existing state-of-the-art algorithms, and it is numerically backward stable. More specifically, we describe a least-squares solver for dense highly overdetermined systems that achieves residuals similar to those of direct QR factorization-based solvers (LAPACK), outperforms LAPACK by large factors, and scales significantly better than any QR-based solver.
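The random-mixing-plus-sampling scheme the abstract describes can be sketched with SciPy building blocks (a simplified, illustrative reconstruction, not the authors' implementation; the sampling factor gamma and the DCT mixing choice are our assumptions): mix the rows with a randomized orthogonal transform, sample a small subset, QR-factor the sample, and run LSQR on the right-preconditioned system.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import qr, solve_triangular
from scipy.sparse.linalg import lsqr, LinearOperator

def blendenpik_style_solve(A, b, gamma=4.0, seed=0):
    """Sketch of a Blendenpik-style least-squares solver for tall dense
    A (m x n, m >> n): random row signs + DCT spread the leverage scores,
    so uniformly sampled mixed rows capture the column-space geometry and
    their R factor is a good preconditioner for LSQR."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    M = dct(A * rng.choice([-1.0, 1.0], size=(m, 1)), axis=0, norm='ortho')
    rows = rng.choice(m, size=min(m, int(gamma * n)), replace=False)
    _, R = qr(M[rows], mode='economic')
    # iterate on A @ R^{-1}, which is well conditioned with high probability
    ARinv = LinearOperator(
        (m, n),
        matvec=lambda x: A @ solve_triangular(R, x),
        rmatvec=lambda y: solve_triangular(R, A.T @ y, trans='T'),
    )
    y = lsqr(ARinv, b, atol=1e-14, btol=1e-14)[0]
    return solve_triangular(R, y)          # undo the preconditioning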
High-Performance Kernel Machines With Implicit Distributed Optimization and Randomization
We propose a framework for massive-scale training of kernel-based statistical models, based on combining distributed convex optimization with randomization techniques. Our approach is based on a block-splitting variant of the alternating direction method of multipliers (ADMM), carefully reconfigured to handle very large random feature matrices under memory constraints, while exploiting hybrid parallelism typically found in modern clusters of multicore machines. Our high-performance implementation supports a variety of statistical learning tasks by enabling several loss functions, regularization schemes, kernels, and layers of randomized approximations for both dense and sparse datasets, in an extensible framework. We evaluate our implementation on large-scale model construction tasks and provide a comparison against existing sequential and parallel libraries. Supplementary materials for this article are available online.
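The "layers of randomized approximations" mentioned here include random feature maps. Below is a single-machine sketch of that ingredient only (random Fourier features for a Gaussian kernel, followed by ridge regression in the feature space); the distributed block-splitting ADMM machinery is not reproduced, and all names are illustrative.

```python
import numpy as np

def rff_fit(X, y, D=500, sigma=1.0, lam=1e-3, seed=0):
    """Random Fourier features for the Gaussian kernel + ridge regression:
    a minimal single-machine sketch of the randomization layer."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], D)) / sigma        # spectral samples
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)               # random phases
    Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)                # explicit feature map, n x D
    w = np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y) # ridge normal equations
    return W, b, w

def rff_predict(X, W, b, w):
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b) @ w
```

Because the kernel is replaced by an explicit D-dimensional map, training reduces to a linear problem whose data can be partitioned across machines, which is what makes the distributed-optimization formulation in the paper possible.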