Catalogue Search | MBRL
Explore the vast range of titles available.
8,682 result(s) for "rate of convergence"
Prediction in Functional Linear Regression
2006
There has been substantial recent work on methods for estimating the slope function in linear regression for functional data analysis. However, as in the case of more conventional finite-dimensional regression, much of the practical interest in the slope centers on its application for the purpose of prediction, rather than on its significance in its own right. We show that the problems of slope-function estimation, and of prediction from an estimator of the slope function, have very different characteristics. While the former is intrinsically nonparametric, the latter can be either nonparametric or semi-parametric. In particular, the optimal mean-square convergence rate of predictors is n⁻¹, where n denotes sample size, if the predictand is a sufficiently smooth function. In other cases, convergence occurs at a polynomial rate that is strictly slower than n⁻¹. At the boundary between these two regimes, the mean-square convergence rate is less than n⁻¹ by only a logarithmic factor. More generally, the rate of convergence of the predicted value of the mean response in the regression model, given a particular value of the explanatory variable, is determined by a subtle interaction among the smoothness of the predictand, of the slope function in the model, and of the autocovariance function for the distribution of explanatory variables.
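For reference, a LaTeX sketch of the standard functional linear model and the predictand the abstract refers to; the notation (intercept a, slope function b, covariate domain [0,1]) is assumed rather than taken from the paper.

```latex
% Functional linear model with scalar response Y and functional covariate X
% (assumed notation: intercept a, slope function b, covariate domain [0,1]).
\[
Y = a + \int_0^1 b(t)\, X(t)\, dt + \varepsilon, \qquad \mathbb{E}(\varepsilon \mid X) = 0 .
\]
% The predictand is the mean response at a given curve x:
\[
\mathbb{E}(Y \mid X = x) = a + \int_0^1 b(t)\, x(t)\, dt ,
\]
% which, per the abstract, can be estimated at the rate n^{-1} when x is
% sufficiently smooth, even though estimating b itself is nonparametric.
```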
Journal Article
Properties of Principal Component Methods for Functional and Longitudinal Data Analysis
2006
The use of principal component methods to analyze functional data is appropriate in a wide range of different settings. In studies of "functional data analysis," it has often been assumed that a sample of random functions is observed precisely, in the continuum and without noise. While this has been the traditional setting for functional data analysis, in the context of longitudinal data analysis a random function typically represents a patient, or subject, who is observed at only a small number of randomly distributed points, with nonnegligible measurement error. Nevertheless, essentially the same methods can be used in both these cases, as well as in the vast number of settings that lie between them. How is performance affected by the sampling plan? In this paper we answer that question. We show that if there is a sample of n functions, or subjects, then estimation of eigenvalues is a semiparametric problem, with root-n consistent estimators, even if only a few observations are made of each function, and if each observation is encumbered by noise. However, estimation of eigenfunctions becomes a nonparametric problem when observations are sparse. The optimal convergence rates in this case are those which pertain to more familiar function-estimation settings. We also describe the effects of sampling at regularly spaced points, as opposed to random points. In particular, it is shown that there are often advantages in sampling randomly. However, even in the case of noisy data there is a threshold sampling rate (depending on the number of functions treated) above which the rate of sampling (either randomly or regularly) has negligible impact on estimator performance, no matter whether eigenfunctions or eigenvalues are being estimated.
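For orientation, a LaTeX sketch of the Karhunen–Loève expansion and the sparse, noisy observation model underlying the eigenvalue and eigenfunction estimation discussed above; the symbols are assumed notation, not the paper's.

```latex
% Karhunen-Loeve expansion of the i-th random function (assumed notation):
\[
X_i(t) = \mu(t) + \sum_{k \ge 1} \xi_{ik}\, \psi_k(t), \qquad
\operatorname{Var}(\xi_{ik}) = \theta_k ,
\]
% with \psi_k the eigenfunctions and \theta_1 \ge \theta_2 \ge \cdots the
% eigenvalues of the covariance operator.  In the longitudinal setting each
% subject i is observed only at a few random points T_{ij}, with noise:
\[
Y_{ij} = X_i(T_{ij}) + \varepsilon_{ij}, \qquad j = 1, \dots, m_i .
\]
% Eigenvalue estimation remains root-n consistent under this sparse design,
% while eigenfunction estimation becomes a nonparametric problem.
```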
Journal Article
ON THE RATE OF CONVERGENCE OF FULLY CONNECTED DEEP NEURAL NETWORK REGRESSION ESTIMATES
by Langer, Sophie and Kohler, Michael
in Artificial neural networks, Computer architecture, Convergence
2021
Recent results in nonparametric regression show that deep learning, that is, neural network estimates with many hidden layers, can circumvent the so-called curse of dimensionality provided that suitable restrictions on the structure of the regression function hold. One key feature of the neural networks used in these results is that their network architecture has a further constraint, namely network sparsity. In this paper, we show that similar results can be obtained for least squares estimates based on simple fully connected neural networks with ReLU activation functions. Here, either the number of neurons per hidden layer is fixed and the number of hidden layers tends to infinity suitably fast as the sample size tends to infinity, or the number of hidden layers is bounded by some logarithmic factor in the sample size and the number of neurons per hidden layer tends to infinity suitably fast as the sample size tends to infinity. The proof is based on new approximation results concerning deep neural networks.
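As a concrete illustration only (not the authors' construction or proof), a minimal sketch of a least squares estimate based on a simple fully connected ReLU network; the synthetic data, width, depth, and optimizer are placeholder assumptions.

```python
# Minimal sketch: least squares fit of a fully connected ReLU network
# (synthetic data, fixed width/depth, Adam optimizer are assumptions).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic regression data: d-dimensional design, smooth regression function + noise.
n, d = 500, 5
X = torch.rand(n, d)
y = torch.sin(X.sum(dim=1, keepdim=True)) + 0.1 * torch.randn(n, 1)

# Fully connected ReLU network: L hidden layers, K neurons per hidden layer.
L, K = 3, 32
layers = [nn.Linear(d, K), nn.ReLU()]
for _ in range(L - 1):
    layers += [nn.Linear(K, K), nn.ReLU()]
layers += [nn.Linear(K, 1)]
net = nn.Sequential(*layers)

# Least squares estimate: minimize the empirical L2 risk.
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for step in range(2000):
    opt.zero_grad()
    loss = loss_fn(net(X), y)
    loss.backward()
    opt.step()

print(f"final empirical L2 risk: {loss.item():.4f}")
```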
Journal Article
High-dimensional Bayesian inference via the unadjusted Langevin algorithm
2019
We consider in this paper the problem of sampling a high-dimensional probability distribution π having a density w.r.t. the Lebesgue measure on ℝ^d, known up to a normalization constant, x ↦ π(x) = e^{−U(x)} / ∫_{ℝ^d} e^{−U(y)} dy. Such a problem naturally occurs, for example, in Bayesian inference and machine learning. Under the assumption that U is continuously differentiable, ∇U is globally Lipschitz and U is strongly convex, we obtain non-asymptotic bounds for the convergence to stationarity in Wasserstein distance of order 2 and total variation distance of the sampling method based on the Euler discretization of the Langevin stochastic differential equation, for both constant and decreasing step sizes. The dependence on the dimension of the state space of these bounds is explicit. The convergence of an appropriately weighted empirical measure is also investigated, and bounds for the mean square error and exponential deviation inequality are reported for functions which are measurable and bounded. An illustration to Bayesian inference for binary regression is presented to support our claims.
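A minimal sketch of the sampling method described above, the Euler discretization of the Langevin stochastic differential equation (the unadjusted Langevin algorithm), applied to an assumed Gaussian target with strongly convex potential; the step size and iteration counts are illustrative.

```python
# Unadjusted Langevin algorithm (ULA): Euler discretization of the Langevin SDE
#   dX_t = -grad U(X_t) dt + sqrt(2) dB_t,
# targeting pi(x) proportional to exp(-U(x)).  Constant step size; the Gaussian
# target U(x) = 0.5 * x^T Sigma_inv x is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)
d = 10
Sigma_inv = np.diag(np.linspace(1.0, 5.0, d))   # strongly convex, Lipschitz gradient

def grad_U(x):
    return Sigma_inv @ x

gamma = 0.01          # constant step size
n_iter = 50_000
x = np.zeros(d)
samples = np.empty((n_iter, d))
for k in range(n_iter):
    noise = rng.standard_normal(d)
    x = x - gamma * grad_U(x) + np.sqrt(2.0 * gamma) * noise
    samples[k] = x

# The empirical covariance should approximate Sigma = Sigma_inv^{-1}, up to a
# discretization bias controlled by gamma.
print(np.round(np.cov(samples[10_000:].T).diagonal(), 3))
print(np.round(np.linalg.inv(Sigma_inv).diagonal(), 3))
```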
Journal Article
NONASYMPTOTIC CONVERGENCE ANALYSIS FOR THE UNADJUSTED LANGEVIN ALGORITHM
2017
In this paper, we study a method to sample from a target distribution π over ℝ^d having a positive density with respect to the Lebesgue measure, known up to a normalisation factor. This method is based on the Euler discretization of the overdamped Langevin stochastic differential equation associated with π. For both constant and decreasing step sizes in the Euler discretization, we obtain nonasymptotic bounds for the convergence to the target distribution π in total variation distance. Particular attention is paid to the dependency on the dimension d, to demonstrate the applicability of this method in the high-dimensional setting. These bounds improve and extend the results of Dalalyan.
Journal Article
Some properties of modified Szász–Mirakyan operators in polynomial spaces via the power summability method
2020
In this paper we prove a Korovkin type theorem for modified Szász–Mirakyan operators via -statistical convergence and the power summability method. We also give the rate of convergence related to the above summability methods, and in the last section we give a Voronovskaya type theorem for -statistical convergence and a Grüss–Voronovskaya type theorem.
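For orientation only, a sketch of the classical (unmodified) Szász–Mirakyan operator; the modified operators, the statistical-convergence framework, and the power summability method studied in the paper are not reproduced here.

```python
# Classical Szasz-Mirakyan operator (series truncation is a numerical assumption):
#   S_n(f; x) = exp(-n x) * sum_{k>=0} f(k/n) * (n x)^k / k!
# The paper studies modified versions; this sketch only illustrates the classical one.
import math

def szasz_mirakyan(f, n, x, k_max=None):
    """Evaluate S_n(f; x) by truncating the series well past its Poisson mean n*x."""
    if k_max is None:
        k_max = int(n * x + 10 * math.sqrt(n * x + 1)) + 10
    total = 0.0
    log_nx = math.log(n * x) if x > 0 else None
    for k in range(k_max + 1):
        if x == 0:
            weight = 1.0 if k == 0 else 0.0
        else:
            # Poisson(n x) probability mass at k, computed in log space for stability.
            weight = math.exp(-n * x + k * log_nx - math.lgamma(k + 1))
        total += f(k / n) * weight
    return total

# Convergence check on f(t) = t^2, where S_n(f; x) = x^2 + x/n exactly.
f = lambda t: t ** 2
for n in (10, 100, 1000):
    print(n, szasz_mirakyan(f, n, x=0.5), 0.5 ** 2 + 0.5 / n)
```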
Journal Article
On the Convergence of Block Coordinate Descent Type Methods
2013
In this paper we study smooth convex programming problems where the decision variables vector is split into several blocks of variables. We analyze the block coordinate gradient projection method, in which each iteration consists of performing a gradient projection step with respect to a certain block taken in a cyclic order. A global sublinear rate of convergence of this method is established, and it is shown that the method can be accelerated when the problem is unconstrained. In the unconstrained setting we also prove a sublinear rate of convergence result for the so-called alternating minimization method when the number of blocks is two. When the objective function is also assumed to be strongly convex, a linear rate of convergence is established.
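A minimal sketch, with assumed problem data, of the cyclic block coordinate gradient projection method analyzed above: a smooth convex quadratic objective under nonnegativity constraints, with the variables split into two blocks.

```python
# Cyclic block coordinate gradient projection (problem data are assumptions):
# minimize f(x) = 0.5 * ||A x - b||^2 subject to x >= 0, with x split into two blocks.
import numpy as np

rng = np.random.default_rng(0)
m, n = 60, 40
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
blocks = [np.arange(0, n // 2), np.arange(n // 2, n)]

def grad(x):
    return A.T @ (A @ x - b)

# Per-block step sizes 1 / L_i, with L_i the block Lipschitz constant of the gradient.
L = [np.linalg.norm(A[:, blk], 2) ** 2 for blk in blocks]

x = np.zeros(n)
for it in range(500):
    for blk, L_i in zip(blocks, L):       # blocks taken in a cyclic order
        g = grad(x)
        # Gradient projection step on the current block only (projection onto x >= 0).
        x[blk] = np.maximum(x[blk] - g[blk] / L_i, 0.0)

print("objective:", 0.5 * np.linalg.norm(A @ x - b) ** 2)
```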
Journal Article
One-dimensional empirical measures, order statistics, and Kantorovich transport distances
2019
This work is devoted to the study of rates of convergence of the empirical measures \mu_n = \frac{1}{n} \sum_{k=1}^n \delta_{X_k}, n \geq 1, over a sample (X_k)_{k \geq 1} of independent identically distributed real-valued random variables towards the common distribution \mu in Kantorovich transport distances W_p. The focus is on finite range bounds on the expected Kantorovich distances \mathbb{E}(W_p(\mu_n, \mu)) or [\mathbb{E}(W_p^p(\mu_n, \mu))]^{1/p} in terms of moments and analytic conditions on the measure \mu and its distribution function. The study describes a variety of rates, from the standard one \frac{1}{\sqrt{n}} to slower rates, and both lower and upper bounds on \mathbb{E}(W_p(\mu_n, \mu)) for fixed n in various instances. Order statistics, reduction to uniform samples and analysis of beta distributions, inverse distribution functions, and log-concavity are the main tools in the investigation. Two detailed appendices collect classical and some new facts on inverse distribution functions and beta distributions and their densities necessary to the investigation.
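A small numerical sketch (uniform target, sample size, and p are assumptions) of the quantity bounded above: the expected Kantorovich distance between a one-dimensional empirical measure and its target, computed through order statistics and inverse distribution functions.

```python
# In one dimension, W_p^p(mu_n, mu) = \int_0^1 |F_n^{-1}(u) - F^{-1}(u)|^p du,
# and F_n^{-1} is piecewise constant on the order statistics of the sample.
# Uniform target on [0,1], p = 2 and the sample size are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
p = 2

def wp_p_empirical_vs_uniform(x, p=2):
    """W_p(mu_n, U[0,1])^p via the order statistics of the sample x (values in [0,1])."""
    n = len(x)
    xs = np.sort(x)                       # order statistics X_(1) <= ... <= X_(n)
    total = 0.0
    for k in range(1, n + 1):
        # F_n^{-1}(u) = X_(k) on ((k-1)/n, k/n]; F^{-1}(u) = u for the uniform target.
        a, b, c = (k - 1) / n, k / n, xs[k - 1]
        # Closed-form integral of |u - c|^p over [a, b].
        total += (np.sign(b - c) * abs(b - c) ** (p + 1)
                  - np.sign(a - c) * abs(a - c) ** (p + 1)) / (p + 1)
    return total

n = 1000
est = np.mean([wp_p_empirical_vs_uniform(rng.random(n), p) for _ in range(50)])
print(f"E[W_{p}^{p}(mu_n, U[0,1])] ~ {est:.2e}  (order 1/n = {1/n:.2e})")
```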
STRONG IDENTIFIABILITY AND OPTIMAL MINIMAX RATES FOR FINITE MIXTURE ESTIMATION
2018
We study the rates of estimation of finite mixing distributions, that is, the parameters of the mixture. We prove that under some regularity and strong identifiability conditions, around a given mixing distribution with m₀ components, the optimal local minimax rate of estimation of a mixing distribution with m components is n^{−1/(4(m−m₀)+2)}. This corrects a previous paper by Chen [Ann. Statist. 23 (1995) 221–233]. By contrast, it turns out that there are estimators with a (nonuniform) pointwise rate of estimation of n^{−1/2} for all mixing distributions with a finite number of components.
Journal Article
ON DEEP LEARNING AS A REMEDY FOR THE CURSE OF DIMENSIONALITY IN NONPARAMETRIC REGRESSION
by Bauer, Benedikt and Kohler, Michael
in Artificial intelligence, Artificial neural networks, Computer simulation
2019
Assuming that a smoothness condition and a suitable restriction on the structure of the regression function hold, it is shown that least squares estimates based on multilayer feedforward neural networks are able to circumvent the curse of dimensionality in nonparametric regression. The proof is based on new approximation results concerning multilayer feedforward neural networks with bounded weights and a bounded number of hidden neurons. The estimates are compared with various other approaches by using simulated data.
Journal Article