423 result(s) for "Zhou, Harrison"
RATE-OPTIMAL GRAPHON ESTIMATION
Network analysis is becoming one of the most active research areas in statistics. Significant advances have been made recently on developing theories, methodologies and algorithms for analyzing networks. However, there has been little fundamental study on optimal estimation. In this paper, we establish optimal rate of convergence for graphon estimation. For the stochastic block model with k clusters, we show that the optimal rate under the mean squared error is n⁻¹ log k + k²/n². The minimax upper bound improves the existing results in literature through a technique of solving a quadratic equation. When $k \leq \sqrt{n \log n}$, as the number of the cluster k grows, the minimax rate grows slowly with only a logarithmic order n⁻¹ log k. A key step to establish the lower bound is to construct a novel subset of the parameter space and then apply Fano's lemma, from which we see a clear distinction of the non-parametric graphon estimation problem from classical nonparametric regression, due to the lack of identifiability of the order of nodes in exchangeable random graph models. As an immediate application, we consider nonparametric graphon estimation in a Hölder class with smoothness α. When the smoothness α ≥ 1, the optimal rate of convergence is n⁻¹ log n, independent of α, while for α ∈ (0, 1), the rate is $n^{-2\alpha/(\alpha+1)}$, which is, to our surprise, identical to the classical nonparametric rate.
COMMUNITY DETECTION IN DEGREE-CORRECTED BLOCK MODELS
Community detection is a central problem of network data analysis. Given a network, the goal of community detection is to partition the network nodes into a small number of clusters, which could often help reveal interesting structures. The present paper studies community detection in Degree-Corrected Block Models (DCBMs). We first derive asymptotic minimax risks of the problem for a misclassification proportion loss under appropriate conditions. The minimax risks are shown to depend on degree-correction parameters, community sizes and average within and between community connectivities in an intuitive and interpretable way. In addition, we propose a polynomial time algorithm to adaptively perform consistent and even asymptotically optimal community detection in DCBMs.
OPTIMALITY OF SPECTRAL CLUSTERING IN THE GAUSSIAN MIXTURE MODEL
Spectral clustering is one of the most popular algorithms to group high-dimensional data. It is easy to implement and computationally efficient. Despite its popularity and successful applications, its theoretical properties have not been fully understood. In this paper, we show that spectral clustering is minimax optimal in the Gaussian mixture model with isotropic covariance matrix, when the number of clusters is fixed and the signal-to-noise ratio is large enough. Spectral gap conditions are widely assumed in the literature to analyze spectral clustering. On the contrary, these conditions are not needed to establish optimality of spectral clustering in this paper.
THEORETICAL AND COMPUTATIONAL GUARANTEES OF MEAN FIELD VARIATIONAL INFERENCE FOR COMMUNITY DETECTION
The mean field variational Bayes method is becoming increasingly popular in statistics and machine learning. Its iterative coordinate ascent variational inference algorithm has been widely applied to large scale Bayesian inference. See Blei et al. (2017) for a recent comprehensive review. Despite the popularity of the mean field method, there exists remarkably little fundamental theoretical justification. To the best of our knowledge, the iterative algorithm has never been investigated for any high-dimensional and complex model. In this paper, we study the mean field method for community detection under the stochastic block model. For an iterative batch coordinate ascent variational inference algorithm, we show that it has a linear convergence rate and converges to the minimax rate within log n iterations. This complements the results of Bickel et al. (2013) which studied the global minimum of the mean field variational Bayes and obtained asymptotic normal estimation of global model parameters. In addition, we obtain similar optimality results for Gibbs sampling and an iterative procedure to calculate maximum likelihood estimation, which can be of independent interest.
SPARSE CCA: ADAPTIVE ESTIMATION AND COMPUTATIONAL BARRIERS
Canonical correlation analysis is a classical technique for exploring the relationship between two sets of variables. It has important applications in analyzing high dimensional datasets originated from genomics, imaging and other fields. This paper considers adaptive minimax and computationally tractable estimation of leading sparse canonical coefficient vectors in high dimensions. Under a Gaussian canonical pair model, we first establish separate minimax estimation rates for canonical coefficient vectors of each set of random variables under no structural assumption on marginal covariance matrices. Second, we propose a computationally feasible estimator to attain the optimal rates adaptively under an additional sample size condition. Finally, we show that a sample size condition of this kind is needed for any randomized polynomial-time estimator to be consistent, assuming hardness of certain instances of the planted clique detection problem. As a byproduct, we obtain the first computational lower bounds for sparse PCA under the Gaussian single spiked covariance model.
ESTIMATING SPARSE PRECISION MATRIX: OPTIMAL RATES OF CONVERGENCE AND ADAPTIVE ESTIMATION
Precision matrix is of significant importance in a wide range of applications in multivariate analysis. This paper considers adaptive minimax estimation of sparse precision matrices in the high dimensional setting. Optimal rates of convergence are established for a range of matrix norm losses. A fully data driven estimator based on adaptive constrained ℓ₁ minimization is proposed and its rate of convergence is obtained over a collection of parameter spaces. The estimator, called ACLIME, is easy to implement and performs well numerically. A major step in establishing the minimax rate of convergence is the derivation of a rate-sharp lower bound. A "two-directional" lower bound technique is applied to obtain the minimax lower bound. The upper and lower bounds together yield the optimal rates of convergence for sparse precision matrix estimation and show that the ACLIME estimator is adaptively minimax rate optimal for a collection of parameter spaces and a range of matrix norm losses simultaneously.
MINIMAX RATES OF COMMUNITY DETECTION IN STOCHASTIC BLOCK MODELS
Recently, network analysis has gained more and more attention in statistics, as well as in computer science, probability and applied mathematics. Community detection for the stochastic block model (SBM) is probably the most studied topic in network analysis. Many methodologies have been proposed. Some beautiful and significant phase transition results are obtained in various settings. In this paper, we provide a general minimax theory for community detection. It gives minimax rates of the mis-match ratio for a wide range of settings including homogeneous and inhomogeneous SBMs, dense and sparse networks, finite and growing number of communities. The minimax rates are exponential, different from polynomial rates we often see in statistical literature. An immediate consequence of the result is to establish threshold phenomenon for strong consistency (exact recovery) as well as weak consistency (partial recovery). We obtain the upper bound by a range of penalized likelihood-type approaches. The lower bound is achieved by a novel reduction from a global mis-match ratio to a local clustering problem for one node through an exchangeability property.
OPTIMAL RATES OF CONVERGENCE FOR SPARSE COVARIANCE MATRIX ESTIMATION
This paper considers estimation of sparse covariance matrices and establishes the optimal rate of convergence under a range of matrix operator norm and Bregman divergence losses. A major focus is on the derivation of a rate sharp minimax lower bound. The problem exhibits new features that are significantly different from those that occur in the conventional nonparametric function estimation problems. Standard techniques fail to yield good results, and new tools are thus needed. We first develop a lower bound technique that is particularly well suited for treating "two-directional" problems such as estimating sparse covariance matrices. The result can be viewed as a generalization of Le Cam's method in one direction and Assouad's Lemma in another. This lower bound technique is of independent interest and can be used for other matrix estimation problems. We then establish a rate sharp minimax lower bound for estimating sparse covariance matrices under the spectral norm by applying the general lower bound technique. A thresholding estimator is shown to attain the optimal rate of convergence under the spectral norm. The results are then extended to the general matrix ℓ_w operator norms for 1 ≤ w ≤ ∞. In addition, we give a unified result on the minimax rate of convergence for sparse covariance matrix estimation under a class of Bregman divergence losses.
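To illustrate the kind of thresholding estimator this abstract refers to, here is a minimal Python sketch. It is not taken from the paper. The function names, the constant `c`, the threshold level `c * sqrt(log p / n)`, and the choice to leave the diagonal untouched are all assumptions made for illustration. Plain lists are used so that the sketch has no dependencies beyond the standard library.

```python
import math

def sample_covariance(rows):
    """Unbiased sample covariance of n observations, each a list of p floats."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def threshold_covariance(rows, c=2.0):
    """Hard-threshold the off-diagonal entries of the sample covariance.

    Assumption (not from the abstract): entries whose magnitude falls
    below c * sqrt(log(p) / n) are set to zero; the diagonal is kept.
    """
    n, p = len(rows), len(rows[0])
    s = sample_covariance(rows)
    lam = c * math.sqrt(math.log(p) / n)  # hypothetical threshold level
    return [[s[i][j] if i == j or abs(s[i][j]) >= lam else 0.0
             for j in range(p)] for i in range(p)]
```

Calling `threshold_covariance(rows)` on a list of observations returns a p-by-p matrix in which weak off-diagonal covariances are zeroed and strong ones are kept unchanged.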
ASYMPTOTIC NORMALITY AND OPTIMALITIES IN ESTIMATION OF LARGE GAUSSIAN GRAPHICAL MODELS
The Gaussian graphical model, a popular paradigm for studying relationship among variables in a wide range of applications, has attracted great attention in recent years. This paper considers a fundamental question: When is it possible to estimate low-dimensional parameters at parametric square-root rate in a large Gaussian graphical model? A novel regression approach is proposed to obtain asymptotically efficient estimation of each entry of a precision matrix under a sparseness condition relative to the sample size. When the precision matrix is not sufficiently sparse, or equivalently the sample size is not sufficiently large, a lower bound is established to show that it is no longer possible to achieve the parametric rate in the estimation of each entry. This lower bound result, which provides an answer to the delicate sample size question, is established with a novel construction of a subset of sparse precision matrices in an application of Le Cam's lemma. Moreover, the proposed estimator is proven to have optimal convergence rate when the parametric rate cannot be achieved, under a minimal sample requirement. The proposed estimator is applied to test the presence of an edge in the Gaussian graphical model or to recover the support of the entire model, to obtain adaptive rate-optimal estimation of the entire precision matrix as measured by the matrix ℓ_q operator norm and to make inference in latent variables in the graphical model. All of this is achieved under a sparsity condition on the precision matrix and a side condition on the range of its spectrum. This significantly relaxes the commonly imposed uniform signal strength condition on the precision matrix, irrepresentability condition on the Hessian tensor operator of the covariance matrix or the ℓ₁ constraint on the precision matrix. Numerical results confirm our theoretical findings. The ROC curve of the proposed algorithm, Asymptotic Normal Thresholding (ANT), for support recovery significantly outperforms that of the popular GLasso algorithm.
OPTIMAL RATES OF CONVERGENCE FOR COVARIANCE MATRIX ESTIMATION
Covariance matrix plays a central role in multivariate statistical analysis. Significant advances have been made recently on developing both theory and methodology for estimating large covariance matrices. However, a minimax theory has yet to be developed. In this paper we establish the optimal rates of convergence for estimating the covariance matrix under both operator norm and Frobenius norm. It is shown that optimal procedures under the two norms are different and consequently matrix estimation under the operator norm is fundamentally different from vector estimation. The minimax upper bound is obtained by constructing a special class of tapering estimators and by studying their risk properties. A key step in obtaining the optimal rate of convergence is the derivation of the minimax lower bound. The technical analysis requires new ideas that are quite different from those used in the more conventional function/sequence estimation problems.
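As a rough illustration of what a tapering estimator can look like, the sketch below applies trapezoidal weights to a matrix such as a sample covariance. The abstract does not specify the weights, so the scheme is an assumption: full weight within half the bandwidth of the diagonal, linear decay to zero at the full bandwidth. The names `tapering_weight` and `taper` and the even-bandwidth requirement are also hypothetical.

```python
def tapering_weight(i, j, k):
    """Trapezoidal weight for entry (i, j) with an even bandwidth k >= 2.

    Assumed scheme: weight 1 when |i - j| <= k/2, linear decay
    2 - |i - j| / (k/2) for k/2 < |i - j| < k, and 0 beyond that.
    """
    d = abs(i - j)
    half = k // 2
    if d <= half:
        return 1.0
    if d < k:
        return 2.0 - d / half
    return 0.0

def taper(matrix, k):
    """Multiply each entry of a square matrix by its tapering weight."""
    p = len(matrix)
    return [[tapering_weight(i, j, k) * matrix[i][j] for j in range(p)]
            for i in range(p)]
```

With `k = 4`, entries up to two positions off the diagonal are kept as they are, entries three positions away are halved, and entries four or more positions away are set to zero.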