100 results for "Tsybakov, Alexandre B."
ORACLE INEQUALITIES FOR NETWORK MODELS AND SPARSE GRAPHON ESTIMATION
Inhomogeneous random graph models encompass many network models such as stochastic block models and latent position models. We consider the problem of statistical estimation of the matrix of connection probabilities based on observations of the adjacency matrix of the network. Taking the stochastic block model as an approximation, we construct estimators of the network connection probabilities: the ordinary block constant least squares estimator and its restricted version. We show that they satisfy oracle inequalities with respect to the block constant oracle. As a consequence, we derive optimal rates of estimation of the probability matrix. Our results cover the important setting of sparse networks. Another consequence is a set of upper bounds on the minimax risks for graphon estimation in the L₂ norm when the probability matrix is sampled according to a graphon model. These bounds include an additional term accounting for the "agnostic" error induced by the variability of the latent unobserved variables of the graphon model. In this setting, the optimal rates are determined not only by the bias and variance components, as in usual nonparametric problems, but also by a third component, the agnostic error. The results shed light on the differences between estimation under the empirical loss (probability matrix estimation) and under the integrated loss (graphon estimation).
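A minimal numpy sketch of the block-constant least squares step described above: given the adjacency matrix and a candidate assignment of nodes to blocks, the fitted connection probability for each pair of blocks is the average of the corresponding entries. The combinatorial part of the estimator (searching over block assignments) is not shown, and the function and variable names are illustrative.

```python
import numpy as np

def block_constant_lse(A, z, k):
    """Block-constant least squares fit of the connection probability matrix.

    A : (n, n) observed adjacency matrix
    z : length-n array of block labels in {0, ..., k-1} (assumed given here)
    k : number of blocks
    Returns the (n, n) matrix of fitted connection probabilities.
    """
    Q = np.zeros((k, k))                      # block-level averages
    for a in range(k):
        for b in range(k):
            block = A[np.ix_(np.where(z == a)[0], np.where(z == b)[0])]
            if block.size:
                Q[a, b] = block.mean()
    return Q[z][:, z]                         # expand back to node level

# Toy usage: a two-block stochastic block model.
rng = np.random.default_rng(0)
z = np.repeat([0, 1], 25)
P = np.where(z[:, None] == z[None, :], 0.6, 0.1)
A = rng.binomial(1, P)
Theta_hat = block_constant_lse(A, z, 2)
```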
NUCLEAR-NORM PENALIZATION AND OPTIMAL RATES FOR NOISY LOW-RANK MATRIX COMPLETION
This paper deals with the trace regression model where n entries or linear combinations of entries of an unknown m₁ × m₂ matrix A₀ corrupted by noise are observed. We propose a new nuclear-norm penalized estimator of A₀ and establish a general sharp oracle inequality for this estimator for arbitrary values of n, m₁, m₂ under the condition of isometry in expectation. Then this method is applied to the matrix completion problem. In this case, the estimator admits a simple explicit form, and we prove that it satisfies oracle inequalities with faster rates of convergence than in previous works. These inequalities are valid, in particular, in the high-dimensional setting m₁m₂ ≫ n. We show that the obtained rates are optimal up to logarithmic factors in a minimax sense and also derive, for any fixed matrix A₀, a nonminimax lower bound on the rate of convergence of our estimator, which coincides with the upper bound up to a constant factor. Finally, we show that our procedure provides exact recovery of the rank of A₀ with probability close to 1. We also discuss the statistical learning setting where there is no underlying model determined by A₀, and the aim is to find the best trace regression model approximating the data. As a by-product, we show that, under the restricted eigenvalue condition, the usual vector Lasso estimator satisfies a sharp oracle inequality (i.e., an oracle inequality with leading constant 1).
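The abstract notes that in the matrix completion case the penalized estimator admits a simple explicit form. The numpy sketch below shows the generic shape of such an estimator under uniform sampling: rescale the zero-filled observed matrix by the estimated sampling rate and soft-threshold its singular values. The rescaling and the threshold level `lam` are illustrative choices, not the paper's exact constants.

```python
import numpy as np

def svd_soft_threshold(Y, mask, lam):
    """Soft-threshold the singular values of a rescaled, zero-filled data matrix.

    Y    : (m1, m2) noisy observations (values outside the mask are ignored)
    mask : (m1, m2) boolean array, True where an entry was observed
    lam  : threshold level for the singular values
    """
    p_hat = mask.mean()                         # estimated sampling rate
    X = np.where(mask, Y, 0.0) / p_hat          # unbiased surrogate for the full matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt  # shrink and reconstruct
```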
ESTIMATION OF HIGH-DIMENSIONAL LOW-RANK MATRICES
Suppose that we observe entries or, more generally, linear combinations of entries of an unknown m × T matrix A corrupted by noise. We are particularly interested in the high-dimensional setting where the number mT of unknown entries can be much larger than the sample size N. Motivated by several applications, we consider estimation of matrix A under the assumption that it has small rank. This can be viewed as a dimension reduction or sparsity assumption. In order to shrink toward a low-rank representation, we investigate penalized least squares estimators with a Schatten-p quasinorm penalty term, p ≤ 1. We study these estimators under two possible assumptions: a modified version of the restricted isometry condition and a uniform bound on the ratio "empirical norm induced by the sampling operator / Frobenius norm." The main results are stated as nonasymptotic upper bounds on the prediction risk and on the Schatten-q risk of the estimators, where q ∈ [p, 2]. The rates that we obtain for the prediction risk are of the form rm/N (for m = T), up to logarithmic factors, where r is the rank of A. The particular examples of multi-task learning and matrix completion are worked out in detail. The proofs are based on tools from the theory of empirical processes. As a by-product, we derive bounds for the kth entropy numbers of the quasi-convex Schatten class embeddings $S_p^M \hookrightarrow S_2^M$, p < 1, which are of independent interest.
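For concreteness, the Schatten-p penalty above is the ℓ_p (quasi)norm of the singular values. The sketch below only evaluates this penalty and the corresponding penalized least squares criterion for a candidate matrix in a matrix-completion-style sampling scheme; it does not attempt the minimization itself, which is non-convex for p < 1. Function names and the penalty level are illustrative.

```python
import numpy as np

def schatten_p(A, p):
    """Schatten-p (quasi)norm: the l_p (quasi)norm of the singular values of A.
    p = 1 gives the nuclear norm; for p < 1 it is a non-convex quasinorm."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.sum(s ** p) ** (1.0 / p)

def penalized_objective(A, Y, mask, lam, p):
    """Penalized least squares criterion for a candidate matrix A:
    squared error over the observed entries plus lam * ||A||_{S_p}^p."""
    fit = np.sum((A - Y)[mask] ** 2)
    return fit + lam * np.sum(np.linalg.svd(A, compute_uv=False) ** p)
```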
ORACLE INEQUALITIES AND OPTIMAL INFERENCE UNDER GROUP SPARSITY
We consider the problem of estimating a sparse linear regression vector β* under a Gaussian noise model, for the purpose of both prediction and model selection. We assume that prior knowledge is available on the sparsity pattern, namely the set of variables is partitioned into prescribed groups, only a few of which are relevant in the estimation process. This group sparsity assumption suggests considering the Group Lasso method as a means to estimate β*. We establish oracle inequalities for the prediction and ℓ₂ estimation errors of this estimator. These bounds hold under a restricted eigenvalue condition on the design matrix. Under a stronger condition, we derive bounds for the estimation error in mixed (2, p)-norms with 1 ≤ p ≤ ∞. When p = ∞, this result implies that a thresholded version of the Group Lasso estimator selects the sparsity pattern of β* with high probability. Next, we prove that the rate of convergence of our upper bounds is optimal in a minimax sense, up to a logarithmic factor, for all estimators over a class of group sparse vectors. Furthermore, we establish lower bounds for the prediction and ℓ₂ estimation errors of the usual Lasso estimator. Using this result, we demonstrate that the Group Lasso can achieve an improvement in the prediction and estimation errors as compared to the Lasso. An important application of our results is provided by the problem of estimating multiple regression equations simultaneously, or multi-task learning. In this case, we obtain refinements of the results in [In Proc. of the 22nd Annual Conference on Learning Theory (COLT) (2009)], which allow us to establish a quantitative advantage of the Group Lasso over the usual Lasso in the multi-task setting. Finally, within the same setting, we show how our results can be extended to more general noise distributions, requiring only that the fourth moment be finite. To obtain this extension, we establish a new maximal moment inequality, which may be of independent interest.
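A minimal proximal gradient sketch of a Group Lasso estimator of the kind discussed above, assuming the groups partition the coordinates and using block soft-thresholding as the proximal step. The step size and the single, group-independent penalty level `lam` are illustrative; the tuning analyzed in the paper scales with the group sizes and the number of groups.

```python
import numpy as np

def group_soft_threshold(v, t):
    """Proximal operator of t * ||v||_2: shrink the whole block toward zero."""
    nrm = np.linalg.norm(v)
    return np.zeros_like(v) if nrm <= t else (1.0 - t / nrm) * v

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient descent for (1/(2n)) * ||y - X beta||_2^2
    + lam * sum_g ||beta_g||_2, with `groups` a list of index arrays."""
    n, p = X.shape
    beta = np.zeros(p)
    step = n / np.linalg.norm(X, 2) ** 2      # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = beta - step * (X.T @ (X @ beta - y)) / n
        for g in groups:
            beta[g] = group_soft_threshold(z[g], step * lam)
    return beta
```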
SLOPE MEETS LASSO
We show that two polynomial-time methods, a Lasso estimator with adaptively chosen tuning parameter and a Slope estimator, adaptively achieve the minimax prediction and ℓ₂ estimation rate (s/n) log(p/s) in high-dimensional linear regression on the class of s-sparse vectors in ℝᵖ. This is done under the Restricted Eigenvalue (RE) condition for the Lasso and under a slightly more constraining assumption on the design for the Slope. The main results have the form of sharp oracle inequalities accounting for the model misspecification error. The minimax optimal bounds are also obtained for the ℓ_q estimation errors with 1 ≤ q ≤ 2 when the model is well specified. The results are nonasymptotic and hold both in probability and in expectation. The assumptions that we impose on the design are satisfied with high probability for a large class of random matrices with independent and possibly anisotropically distributed rows. We give a comparative analysis of the conditions under which oracle bounds for the Lasso and Slope estimators can be obtained. In particular, we show that several known conditions, such as the RE condition and the sparse eigenvalue condition, are equivalent if the ℓ₂-norms of the regressors are uniformly bounded.
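To make the two objects concrete, the sketch below evaluates the Slope (sorted ℓ₁) penalty for a given non-increasing weight sequence and fits a Lasso with a tuning parameter of the order σ·sqrt(log(p/s)/n) appearing in the rate above. The weight formula, σ, and s are illustrative placeholders; in practice s is unknown, and the adaptive tuning analyzed in the paper is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import Lasso

def slope_penalty(beta, weights):
    """Sorted-l1 (Slope) penalty: sum_j weights[j] * |beta|_(j),
    where |beta|_(1) >= |beta|_(2) >= ... and weights are non-increasing."""
    return np.sum(np.sort(np.abs(beta))[::-1] * weights)

# Illustrative data and a theory-style Lasso tuning of order sigma * sqrt(log(p/s)/n).
rng = np.random.default_rng(0)
n, p, s, sigma = 200, 1000, 10, 1.0
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 1.0
y = X @ beta_true + sigma * rng.standard_normal(n)

lam = sigma * np.sqrt(2.0 * np.log(p / s) / n)
beta_lasso = Lasso(alpha=lam).fit(X, y).coef_

weights = sigma * np.sqrt(2.0 * np.log(2.0 * p / np.arange(1, p + 1)) / n)
print(slope_penalty(beta_lasso, weights))
```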
Robust matrix completion
This paper considers the problem of estimating a low-rank matrix when most of its entries are not observed and some of the observed entries are corrupted. The observations are noisy realizations of a sum of a low-rank matrix, which we wish to estimate, and a second matrix having a complementary sparse structure, such as elementwise sparsity or columnwise sparsity. We analyze a class of estimators obtained as solutions of a constrained convex optimization problem combining the nuclear norm penalty and a convex relaxation penalty for the sparse constraint. Our assumptions allow for the simultaneous presence of random and deterministic patterns in the sampling scheme. We establish rates of convergence for the low-rank component from partial and corrupted observations in the presence of noise, and we show that these rates are minimax optimal up to logarithmic factors.
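A hedged cvxpy sketch of the kind of convex program analyzed above, written in penalized rather than constrained form: a nuclear-norm penalty on the low-rank component, an elementwise ℓ₁ penalty for the sparse corruptions, and a squared loss restricted to the observed entries. The penalty levels and the penalized formulation are illustrative simplifications.

```python
import numpy as np
import cvxpy as cp

def robust_completion(Y, mask, lam_nuc, lam_l1):
    """Decompose partially observed Y into low-rank L plus elementwise-sparse S."""
    m1, m2 = Y.shape
    L = cp.Variable((m1, m2))
    S = cp.Variable((m1, m2))
    resid = cp.multiply(mask.astype(float), L + S - Y)   # loss on observed entries only
    objective = cp.Minimize(cp.sum_squares(resid)
                            + lam_nuc * cp.norm(L, "nuc")
                            + lam_l1 * cp.sum(cp.abs(S)))
    cp.Problem(objective).solve()
    return L.value, S.value
```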
MINIMAX ESTIMATION OF LINEAR AND QUADRATIC FUNCTIONALS ON SPARSITY CLASSES
For the Gaussian sequence model, we obtain nonasymptotic minimax rates of estimation of the linear, quadratic and ℓ₂-norm functionals on classes of sparse vectors and construct optimal estimators that attain these rates. The main object of interest is the class B₀(s) of s-sparse vectors θ = (θ₁, ..., θ_d), for which we also provide completely adaptive estimators (independent of s and of the noise variance σ) having logarithmically slower rates than the minimax ones. Furthermore, we obtain the minimax rates on the ℓ_q-balls B_q(r) = {θ ∈ ℝᵈ : ∥θ∥_q ≤ r}, where 0 < q ≤ 2 and $\|\theta\|_q = (\sum_{i=1}^{d} |\theta_i|^q)^{1/q}$. This analysis shows that there are, in general, three zones in the rates of convergence that we call the sparse zone, the dense zone and the degenerate zone, while a fourth zone appears for estimation of the quadratic functional. We show that, as opposed to estimation of θ, the correct logarithmic terms in the optimal rates for the sparse zone scale as log(d/s²) and not as log(d/s). For the class B₀(s), the rates of estimation of the linear functional and of the ℓ₂-norm have a simple elbow at s = √d (the boundary between the sparse and the dense zones) and exhibit similar performances, whereas the estimation of the quadratic functional Q(θ) reveals more complex effects: the minimax risk on B₀(s) is infinite and the sparseness assumption needs to be combined with a bound on the ℓ₂-norm. Finally, we apply our results on estimation of the ℓ₂-norm to the problem of testing against sparse alternatives. In particular, we obtain a nonasymptotic analog of the Ingster–Donoho–Jin theory revealing some effects that were not captured by the previous asymptotic analysis.
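As a concrete illustration of the log(d/s²) calibration mentioned above, the sketch below implements simple thresholding estimators of the linear functional Σθᵢ and a debiased plug-in for the quadratic functional Σθᵢ² in the Gaussian sequence model. The threshold constant and the debiasing are illustrative and are not the paper's exact minimax estimators.

```python
import numpy as np

def linear_functional_estimate(y, s, sigma):
    """Estimate L(theta) = sum_i theta_i from y_i = theta_i + sigma * xi_i,
    keeping only coordinates above a threshold scaled by log(1 + d / s^2)."""
    d = y.size
    thr = sigma * np.sqrt(2.0 * np.log(1.0 + d / s**2))
    return np.sum(y[np.abs(y) > thr])

def quadratic_functional_estimate(y, s, sigma):
    """Plug-in estimate of Q(theta) = sum_i theta_i^2 over the kept coordinates,
    debiased by sigma^2 per kept coordinate."""
    d = y.size
    thr = sigma * np.sqrt(2.0 * np.log(1.0 + d / s**2))
    keep = np.abs(y) > thr
    return np.sum(y[keep] ** 2) - sigma**2 * keep.sum()
```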
Linear and conic programming estimators in high dimensional errors-in-variables models
We consider the linear regression model with observation error in the design. In this setting, we allow the number of covariates to be much larger than the sample size. Several new estimation methods have been recently introduced for this model. Indeed, the standard Lasso estimator or Dantzig selector turns out to be unreliable when only noisy regressors are available, which is quite common in practice. In this work, we propose and analyse a new estimator for the errors-in-variables model. Under suitable sparsity assumptions, we show that this estimator attains the minimax efficiency bound. Importantly, this estimator can be written as a second-order cone programming minimization problem, which can be solved numerically in polynomial time. Finally, we show that the procedure introduced by Rosenbaum and Tsybakov, which is almost optimal in a minimax sense, can be efficiently computed by a single linear programming problem despite non-convexities.
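The setting above can be summarized by the data-generating sketch below: a sparse linear model in a design X that is never observed directly, only through a noisy copy Z. Dimensions and noise levels are illustrative, and the paper's conic-programming estimator, which corrects for the design noise, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, s = 80, 500, 4
X = rng.standard_normal((n, p))                    # true design, not observed
beta_true = np.zeros(p)
beta_true[:s] = 1.0
y = X @ beta_true + 0.5 * rng.standard_normal(n)   # response
Z = X + 0.3 * rng.standard_normal((n, p))          # observed, error-corrupted design

# Running a standard Lasso or Dantzig selector directly on (Z, y) is the situation
# the abstract flags as unreliable; the proposed estimators account for the extra
# noise introduced by the measurement error W = Z - X.
```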
VARIABLE SELECTION WITH HAMMING LOSS
We derive nonasymptotic bounds for the minimax risk of variable selection under expected Hamming loss in the Gaussian mean model in ℝd for classes of at most s-sparse vectors separated from 0 by a constant a > 0. In some cases, we get exact expressions for the nonasymptotic minimax risk as a function of d, s, a and find explicitly the minimax selectors. These results are extended to dependent or non-Gaussian observations and to the problem of crowdsourcing. Analogous conclusions are obtained for the probability of wrong recovery of the sparsity pattern. As corollaries, we derive necessary and sufficient conditions for such asymptotic properties as almost full recovery and exact recovery. Moreover, we propose data-driven selectors that provide almost full and exact recovery adaptively to the parameters of the classes.
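A minimal sketch of a coordinate-wise selector of the type studied above: in the Gaussian mean model, select the coordinates whose observations exceed a threshold that balances the two Hamming-error terms for s-sparse vectors with nonzero entries of magnitude at least a. The specific threshold formula below is an illustrative calibration, not necessarily the paper's exact minimax selector.

```python
import numpy as np

def threshold_selector(y, a, s, sigma=1.0):
    """Select coordinates of an s-sparse mean vector from y_i = theta_i + sigma * xi_i,
    assuming the nonzero entries have magnitude at least a > 0.
    Returns a 0/1 vector estimating the support."""
    d = y.size
    t = a / 2.0 + (sigma**2 / a) * np.log((d - s) / s)   # illustrative calibration
    return (np.abs(y) > t).astype(int)
```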
Simultaneous Analysis of Lasso and Dantzig Selector
We show that, under a sparsity scenario, the Lasso estimator and the Dantzig selector exhibit similar behavior. For both methods, we derive, in parallel, oracle inequalities for the prediction risk in the general nonparametric regression model, as well as bounds on the $\ell_p$ estimation loss for 1 ≤ p ≤ 2 in the linear model when the number of variables can be much larger than the sample size.
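A side-by-side sketch of the two estimators compared above, both tuned at the same level of order σ·sqrt(log(p)/n): the Lasso as ℓ₁-penalized least squares (via scikit-learn) and the Dantzig selector as ℓ₁ minimization under a sup-norm constraint on the correlation between residuals and regressors (via cvxpy). The data dimensions and the constant in the tuning are illustrative.

```python
import numpy as np
import cvxpy as cp
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, s, sigma = 100, 300, 5, 1.0
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 2.0
y = X @ beta_true + sigma * rng.standard_normal(n)
lam = sigma * np.sqrt(2.0 * np.log(p) / n)

# Lasso: l1-penalized least squares.
beta_lasso = Lasso(alpha=lam).fit(X, y).coef_

# Dantzig selector: minimize ||beta||_1 subject to a sup-norm bound on X^T residuals.
beta = cp.Variable(p)
constraints = [cp.norm(X.T @ (y - X @ beta) / n, "inf") <= lam]
cp.Problem(cp.Minimize(cp.norm(beta, 1)), constraints).solve()
beta_dantzig = beta.value
```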