35 results for "Audibert, Jean-Yves"
Regret in Online Combinatorial Optimization
We address online linear optimization problems when the possible actions of the decision maker are represented by binary vectors. The regret of the decision maker is the difference between her realized loss and the minimal loss she would have achieved by picking, in hindsight, the best possible action. Our goal is to understand the magnitude of the best possible (minimax) regret. We study the problem under three different assumptions for the feedback the decision maker receives: full information, and the partial information models of the so-called "semi-bandit" and "bandit" problems. In the full information case we show that the standard exponentially weighted average forecaster is a provably suboptimal strategy. For the semi-bandit model, by combining the Mirror Descent algorithm and the INF (Implicitly Normalized Forecaster) strategy, we are able to prove the first optimal bounds. Finally, in the bandit case we discuss existing results in light of a new lower bound, and suggest a conjecture on the optimal regret in that case.
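For orientation, here is a minimal sketch of the exponentially weighted average forecaster in the full-information setting, run over an illustrative binary action set (the one-hot vectors in dimension d); the action set, uniform losses, and learning rate eta are assumptions for illustration, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 5, 1000                                # dimension and number of rounds
actions = np.eye(d)                           # assumed binary action set: one-hot vectors
eta = np.sqrt(8 * np.log(len(actions)) / n)   # standard tuning for losses in [0, 1]

cum_loss = np.zeros(len(actions))             # cumulative loss of each fixed action
realized = 0.0
for t in range(n):
    # Gibbs weights over actions (shifted by the min for numerical stability).
    w = np.exp(-eta * (cum_loss - cum_loss.min()))
    p = w / w.sum()
    k = rng.choice(len(actions), p=p)         # draw an action from the weights

    loss_vec = rng.uniform(0, 1, size=d)      # environment's loss vector this round
    realized += actions[k] @ loss_vec         # loss actually suffered
    cum_loss += actions @ loss_vec            # full information: every action's loss is observed

regret = realized - cum_loss.min()            # realized loss minus best fixed action in hindsight
print(f"regret after {n} rounds: {regret:.2f}")
```

The regret printed here is measured against the best fixed action in hindsight, matching the definition in the abstract; the paper's point is that this standard forecaster is suboptimal for general combinatorial action sets.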
Robust Linear Least Squares Regression
We consider the problem of robustly predicting as well as the best linear combination of d given functions in least squares regression, and variants of this problem including constraints on the parameters of the linear combination. For the ridge estimator and the ordinary least squares estimator, and their variants, we provide new risk bounds of order d/n without logarithmic factor unlike some standard results, where n is the size of the training data. We also provide a new estimator with better deviations in the presence of heavy-tailed noise. It is based on truncating differences of losses in a min-max framework and satisfies a d/n risk bound both in expectation and in deviations. The key common surprising factor of these results is the absence of exponential moment condition on the output distribution while achieving exponential deviations. All risk bounds are obtained through a PAC-Bayesian analysis on truncated differences of losses. Experimental results strongly back up our truncated min-max estimator.
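The following minimal sketch shows the two baseline estimators the abstract compares, ordinary least squares and ridge, under an assumed heavy-tailed data model; the paper's truncated min-max estimator is not reproduced here, and the design, noise, and regularization level lam are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 10
X = rng.normal(size=(n, d))                   # values of the d given functions at the data points
theta = rng.normal(size=d)                    # unknown best linear combination
y = X @ theta + rng.standard_t(df=3, size=n)  # heavy-tailed noise, as in the abstract's setting

# Ordinary least squares: argmin_b ||y - X b||^2.
ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Ridge estimator: argmin_b ||y - X b||^2 + lam ||b||^2.
lam = 1.0
ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Empirical proxy for the excess risk of each estimator.
print("OLS:  ", np.mean((X @ (ols - theta)) ** 2))
print("ridge:", np.mean((X @ (ridge - theta)) ** 2))
```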
Fast Learning Rates in Statistical Inference through Aggregation
We develop minimax optimal risk bounds for the general learning task consisting in predicting as well as the best function in a reference set $\mathcal{G}$, up to the smallest possible additive term, called the convergence rate. When the reference set is finite and when $n$ denotes the size of the training data, we provide minimax convergence rates of the form $C(\frac{\log|\mathcal{G}|}{n})^{v}$ with a tight evaluation of the positive constant $C$ and with exact $0 < v \leq 1$, the latter value depending on the convexity of the loss function and on the level of noise in the output distribution. The risk upper bounds are based on a sequential randomized algorithm, which at each step concentrates on functions having both low risk and low variance with respect to the previous step's prediction function. Our analysis puts forward the links between the probabilistic and worst-case viewpoints, and allows us to obtain risk bounds unachievable with the standard statistical learning approach. One of the key ideas of this work is to use probabilistic inequalities with respect to appropriate (Gibbs) distributions on the prediction function space, instead of using them with respect to the distribution generating the data. The risk lower bounds are based on refinements of the Assouad lemma that take into account, in particular, the properties of the loss function. Our key example to illustrate the upper and lower bounds is the $L_q$-regression setting, for which an exhaustive analysis of the convergence rates is given as $q$ ranges over $[1, +\infty)$.
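As a rough illustration of the Gibbs-weighting idea described above, here is a sketch of sequential aggregation over a small finite reference set under squared loss; the reference functions, inverse temperature beta, and data stream are assumptions, and this generic exponential-weights rule only approximates, rather than reproduces, the paper's specific sequential algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(-1, 1, size=n)
y = np.sin(3 * x) + 0.1 * rng.normal(size=n)

# Finite reference set G of candidate prediction functions (assumed for illustration).
G = [lambda t: 0.0 * t, lambda t: t, lambda t: np.sin(3 * t), lambda t: t ** 2]
beta = 4.0                                   # inverse temperature of the Gibbs weights

cum = np.zeros(len(G))                       # cumulative squared loss of each g in G
preds = np.empty(n)
for i in range(n):
    w = np.exp(-beta * (cum - cum.min()))    # Gibbs distribution over the reference set
    w /= w.sum()
    preds[i] = sum(wk * g(x[i]) for wk, g in zip(w, G))   # weighted-average prediction
    cum += np.array([(g(x[i]) - y[i]) ** 2 for g in G])   # then observe the losses

print("aggregate MSE:    ", np.mean((preds - y) ** 2))
print("best single g MSE:", cum.min() / n)
```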
Fast Learning Rates for Plug-In Classifiers
It has been recently shown that, under the margin (or low noise) assumption, there exist classifiers attaining fast rates of convergence of the excess Bayes risk, that is, rates faster than $n^{-1/2}$. The work on this subject has suggested the following two conjectures: (i) the best achievable fast rate is of the order $n^{-1}$, and (ii) plug-in classifiers generally converge more slowly than classifiers based on empirical risk minimization. We show that both conjectures are incorrect. In particular, we construct plug-in classifiers that can achieve not only fast, but also super-fast rates, that is, rates faster than $n^{-1}$. We establish minimax lower bounds showing that the obtained rates cannot be improved.
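A plug-in classifier in the sense used here estimates the regression function η(x) = P(Y = 1 | X = x) and thresholds the estimate at 1/2; the sketch below does this with an assumed Nadaraya-Watson kernel estimator and bandwidth h, chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
X = rng.uniform(0, 1, size=n)
eta = 0.5 + 0.4 * np.sin(2 * np.pi * X)          # true regression function P(Y=1 | X)
Y = (rng.uniform(size=n) < eta).astype(float)

def eta_hat(x, h=0.1):
    """Nadaraya-Watson estimate of P(Y = 1 | X = x), Gaussian kernel, bandwidth h."""
    w = np.exp(-0.5 * ((x - X) / h) ** 2)
    return (w @ Y) / w.sum()

def plug_in(x):
    """Plug-in classifier: threshold the estimated regression function at 1/2."""
    return 1 if eta_hat(x) >= 0.5 else 0

x0 = 0.3
print(f"eta_hat({x0}) = {eta_hat(x0):.3f} -> predict {plug_in(x0)}")
```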
Supplement to "Robust linear least squares regression"
This supplementary material provides the proofs of Theorems 2.1, 2.2 and 3.1 of the article "Robust linear least squares regression".
Bandit View on Noisy Optimization
In this chapter, we investigate the problem of function optimization with a finite number of noisy evaluations. While at first one may think that simple repeated sampling can overcome the difficulty introduced by noisy evaluations, it is far from being an optimal strategy. Indeed, to make the best use of the evaluations, one may want to estimate the seemingly best options more precisely, while for bad options a rough estimate might be enough. This reasoning leads to non-trivial algorithms, which depend on the objective criterion that we set and on how we define the budget constraint on the number of evaluations.
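The allocation idea in this abstract, spending the evaluation budget on the seemingly best options, can be illustrated with a successive-halving style rule; the option means, Gaussian noise, and budget split below are assumptions for illustration, not the chapter's algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)
means = np.array([0.1, 0.3, 0.5, 0.55, 0.9])     # unknown values of the options
budget = 1000                                    # total number of noisy evaluations allowed

def evaluate(i):
    """One noisy evaluation of option i."""
    return means[i] + rng.normal(scale=1.0)

alive = list(range(len(means)))
rounds = int(np.ceil(np.log2(len(means))))
per_round = budget // rounds
while len(alive) > 1:
    pulls = per_round // len(alive)              # evaluations per surviving option
    est = [np.mean([evaluate(i) for _ in range(pulls)]) for i in alive]
    order = np.argsort(est)[::-1]                # best-looking options first
    alive = [alive[k] for k in order[: max(1, len(alive) // 2)]]  # drop the worse half

print("selected option:", alive[0], "| true best:", int(np.argmax(means)))
```

Each halving round concentrates the remaining budget on fewer options, so the surviving candidates are estimated ever more precisely, exactly the intuition the abstract describes.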
Optimization for Machine Learning
The interplay between optimization and machine learning is one of the most important developments in modern computational science. Optimization formulations and methods are proving to be vital in designing algorithms to extract essential knowledge from huge volumes of data. Machine learning, however, is not simply a consumer of optimization technology but a rapidly evolving field that is itself generating new optimization ideas. This book captures the state of the art of the interaction between optimization and machine learning in a way that is accessible to researchers in both fields.

Optimization approaches have enjoyed prominence in machine learning because of their wide applicability and attractive theoretical properties. The increasing complexity, size, and variety of today's machine learning models call for the reassessment of existing assumptions. This book starts the process of reassessment. It describes the resurgence in novel contexts of established frameworks such as first-order methods, stochastic approximations, convex relaxations, interior-point methods, and proximal methods. It also devotes attention to newer themes such as regularized optimization, robust optimization, gradient and subgradient methods, splitting techniques, and second-order methods. Many of these techniques draw inspiration from other fields, including operations research, theoretical computer science, and subfields of optimization. The book will enrich the ongoing cross-fertilization between the machine learning community and these other fields, and within the broader optimization community.
Robustness of stochastic bandit policies
This paper studies the deviations of the regret in a stochastic multi-armed bandit problem. When the total number of plays n is known beforehand by the agent, Audibert et al. [2] exhibit a policy such that, with probability at least 1-1/n, the regret of the policy is of order log n. They have also shown that such a property is not shared by the popular UCB1 policy of Auer et al. [3]. This work first answers an open question: it extends this negative result to any anytime policy (i.e., any policy that does not take the number of plays n into account). Another contribution of this paper is to design robust anytime policies for specific multi-armed bandit problems in which some restrictions are put on the set of possible distributions of the different arms. We also show that, for any policy (i.e., even when the number of plays n is known), the regret is of order log n with probability at least 1-1/n, so that the policy of Audibert et al. has the best possible deviation properties.
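For context, here is a minimal sketch of the UCB1 policy of Auer et al. that the paper's deviation analysis concerns: pull the arm maximizing the empirical mean plus a sqrt(2 log t / pulls) confidence width. The Bernoulli arms and horizon are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
means = np.array([0.4, 0.5, 0.7])      # unknown Bernoulli arm means
n = 10_000                             # total number of plays

counts = np.zeros(len(means))          # times each arm was pulled
sums = np.zeros(len(means))            # total reward collected per arm
for t in range(1, n + 1):
    if t <= len(means):
        a = t - 1                      # pull each arm once to initialize
    else:
        ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
        a = int(np.argmax(ucb))        # optimistic arm choice
    sums[a] += float(rng.uniform() < means[a])
    counts[a] += 1

pseudo_regret = ((means.max() - means) * counts).sum()
print(f"pseudo-regret after {n} plays: {pseudo_regret:.1f}")
```

UCB1's regret is of order log n in expectation, but, as the abstract notes, its regret distribution has heavier deviations than the anytime-robust policies designed in the paper.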