Catalogue Search | MBRL

by Kolar, Mladen , Barber, Rina Foygel in Complex variables , Computer simulation , Confidence intervals

2018

Understanding complex relationships between random variables is of fundamental importance in high-dimensional statistics, with numerous applications in biological and social sciences. Undirected graphical models are often used to represent dependencies between random variables, where an edge between two random variables is drawn if they are conditionally dependent given all the other measured variables. A large body of literature exists on methods that estimate the structure of an undirected graphical model; however, little is known about the distributional properties of the estimators beyond the Gaussian setting. In this paper, we focus on inference for edge parameters in a high-dimensional transelliptical model, which generalizes Gaussian and nonparanormal graphical models. We propose ROCKET, a novel procedure for estimating parameters in the latent inverse covariance matrix. We establish asymptotic normality of ROCKET in an ultra high-dimensional setting under mild assumptions, without relying on oracle model selection results. ROCKET requires the same number of samples that is known to be necessary for obtaining a √n-consistent estimator of an element in the precision matrix under a Gaussian model. Hence, it is an optimal estimator under a much larger family of distributions. The result hinges on a tight control of the sparse spectral norm of the nonparametric Kendall’s tau estimator of the correlation matrix, which is of independent interest. Empirically, ROCKET outperforms the nonparanormal and Gaussian models in terms of achieving accurate inference on simulated data. We also compare the three methods on real data (daily stock returns), and find that the ROCKET estimator is the only method whose behavior across subsamples agrees with the distribution predicted by the theory.
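As a rough illustration of the rank-based correlation estimate mentioned in the abstract, the sketch below computes Kendall's tau for each pair of variables and maps it to a correlation through sin(πτ/2), the transform usually used for transelliptical models. It covers only this plug-in step and is not the ROCKET procedure itself; the function names `kendall_tau` and `latent_correlation` are hypothetical, and the data are assumed to have no ties.

```python
import math
from itertools import combinations


def _sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Kendall's tau-a: mean sign agreement over all pairs of observations."""
    n = len(x)
    total = 0
    for i, j in combinations(range(n), 2):
        total += _sign(x[i] - x[j]) * _sign(y[i] - y[j])
    return total / (n * (n - 1) / 2)


def latent_correlation(columns):
    """Plug-in correlation matrix with entries sin(pi/2 * tau_jk).

    `columns` is a list of equal-length samples, one per variable
    (a layout assumed here for illustration).
    """
    p = len(columns)
    corr = [[1.0] * p for _ in range(p)]
    for a, b in combinations(range(p), 2):
        r = math.sin(math.pi / 2 * kendall_tau(columns[a], columns[b]))
        corr[a][b] = corr[b][a] = r
    return corr


# Variable 1 is a monotone transform of variable 0; variable 2 reverses it.
cols = [[1, 2, 3, 4], [1, 8, 27, 64], [4, 3, 2, 1]]
R = latent_correlation(cols)
```

Because Kendall's tau depends only on ranks, the estimate is unchanged by monotone transformations of each variable, which is why it suits models with unknown marginal transformations.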

Journal Article

Sparse precision matrix estimation via lasso penalized D-trace loss

by ZOU, HUI , ZHANG, TENG in Algorithms , Comparative analysis , Coordinate systems

2014

We introduce a constrained empirical loss minimization framework for estimating high-dimensional sparse precision matrices and propose a new loss function, called the D-trace loss, for that purpose. A novel sparse precision matrix estimator is defined as the minimizer of the lasso penalized D-trace loss under a positive-definiteness constraint. Under a new irrepresentability condition, the lasso penalized D-trace estimator is shown to have the sparse recovery property. Examples demonstrate that the new condition can hold in situations where the irrepresentability condition for the lasso penalized Gaussian likelihood estimator fails. We establish rates of convergence for the new estimator in the elementwise maximum, Frobenius and operator norms. We develop a very efficient algorithm based on alternating direction methods for computing the proposed estimator. Simulated and real data are used to demonstrate the computational efficiency of our algorithm and the finite-sample performance of the new estimator. The lasso penalized D-trace estimator is found to compare favourably with the lasso penalized Gaussian likelihood estimator.
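For orientation, the D-trace loss and the penalized estimator are commonly written as below. This is a sketch drawn from the general literature on this estimator rather than from the abstract itself; the symbols $\hat{\Sigma}$ (sample covariance), $\lambda$ (penalty level) and $\varepsilon$ (lower bound on the eigenvalues) are notation assumed here.

```latex
L_D(\Theta, \hat{\Sigma}) = \tfrac{1}{2}\,\langle \Theta^{2}, \hat{\Sigma} \rangle - \operatorname{tr}(\Theta),
\qquad
\hat{\Theta} = \arg\min_{\Theta \succeq \varepsilon I}\;
  L_D(\Theta, \hat{\Sigma}) + \lambda \,\|\Theta\|_{1,\mathrm{off}}
```

Setting the gradient of the unpenalized loss to zero gives $(\Theta\Sigma + \Sigma\Theta)/2 = I$, which is solved by $\Theta = \Sigma^{-1}$, so the loss targets the precision matrix without a log-determinant term.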

Journal Article

A convex pseudolikelihood framework for high dimensional partial correlation estimation with convergence guarantees

by Khare, Kshitij , Rajaratnam, Bala , Oh, Sang‐Yun in Analysis , Analysis of covariance , Breast cancer

2015

Sparse high dimensional graphical model selection is a topic of much interest in modern day statistics. A popular approach is to apply l₁‐penalties to either parametric likelihoods, or regularized regression/pseudolikelihoods, with the latter having the distinct advantage that they do not explicitly assume Gaussianity. As none of the popular methods proposed for solving pseudolikelihood‐based objective functions have provable convergence guarantees, it is not clear whether corresponding estimators exist or are even computable, or if they actually yield correct partial correlation graphs. We propose a new pseudolikelihood‐based graphical model selection method that aims to overcome some of the shortcomings of current methods, but at the same time retain all their respective strengths. In particular, we introduce a novel framework that leads to a convex formulation of the partial covariance regression graph problem, resulting in an objective function comprised of quadratic forms. The objective is then optimized via a co‐ordinatewise approach. The specific functional form of the objective function facilitates rigorous convergence analysis leading to convergence guarantees; an important property that cannot be established by using standard results, when the dimension is larger than the sample size, as is often the case in high dimensional applications. These convergence guarantees ensure that estimators are well defined under very general conditions and are always computable. In addition, the approach yields estimators that have good large sample properties and also respect symmetry. Furthermore, application to simulated and real data, timing comparisons and numerical convergence is demonstrated. We also present a novel unifying framework that places all graphical pseudolikelihood methods as special cases of a more general formulation, leading to important insights.

Journal Article

GEMINI: GRAPH ESTIMATION WITH MATRIX VARIATE NORMAL INSTANCES

by Zhou, Shuheng in 62F12 , 62F30 , Correlations

2014

Undirected graphs can be used to describe matrix variate distributions. In this paper, we develop new methods for estimating the graphical structures and underlying parameters, namely, the row and column covariance and inverse covariance matrices from the matrix variate data. Under sparsity conditions, we show that one is able to recover the graphs and covariance matrices with a single random matrix from the matrix variate normal distribution. Our method extends, with suitable adaptation, to the general setting where replicates are available. We establish consistency and obtain the rates of convergence in the operator and the Frobenius norm. We show that having replicates will allow one to estimate more complicated graphical structures and achieve faster rates of convergence. We provide simulation evidence showing that we can recover graphical structures as well as estimating the precision matrices, as predicted by theory.

Journal Article

HIGH-DIMENSIONAL STRUCTURE ESTIMATION IN ISING MODELS: LOCAL SEPARATION CRITERION

by Huang, Furong , Willsky, Alan S. , Anandkumar, Animashree in 05C80 , 62H12 , Algorithms

2012

We consider the problem of high-dimensional Ising (graphical) model selection. We propose a simple algorithm for structure estimation based on the thresholding of the empirical conditional variation distances. We introduce a novel criterion for tractable graph families, where this method is efficient, based on the presence of sparse local separators between node pairs in the underlying graph. For such graphs, the proposed algorithm has a sample complexity of $n = \Omega(J_{\min}^{-2} \log p)$, where $p$ is the number of variables, and $J_{\min}$ is the minimum (absolute) edge potential in the model. We also establish nonasymptotic necessary and sufficient conditions for structure estimation.

Journal Article

Big data analysis for financial risk management

by Giudici, Paolo , Cerchiello, Paola in Bayesian analysis , Big Data , Communications Engineering

2016

A very important area of financial risk management is systemic risk modelling, which concerns the estimation of the interrelationships between financial institutions, with the aim of establishing which of them are more central and, therefore, more contagious/subject to contagion. The aim of this paper is to develop a novel systemic risk model that, unlike existing ones, employs not only the information contained in financial market prices, but also big data coming from financial tweets. From a methodological viewpoint, the novelty of our paper is the estimation of systemic risk models using two different data sources, financial markets and financial tweets, and a proposal to combine them using a Bayesian approach. From an applied viewpoint, we present the first systemic risk model based on big data, and show that such a model can shed further light on the interrelationships between financial institutions.

Journal Article

LEARNING LOOPY GRAPHICAL MODELS WITH LATENT VARIABLES: EFFICIENT METHODS AND GUARANTEES

by Valluvan, Ragupathyraj , Anandkumar, Animashree in 05C12 , 62H12 , Datasets

2013

The problem of structure estimation in graphical models with latent variables is considered. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider models where the underlying Markov graph is locally tree-like, and the model is in the regime of correlation decay. For the special case of the Ising model, the number of samples $n$ required for structural consistency of our method scales as $n = \Omega(\theta_{\min}^{-\delta\eta(\eta+1)-2} \log p)$, where $p$ is the number of variables, $\theta_{\min}$ is the minimum edge potential, $\delta$ is the depth (i.e., distance from a hidden node to the nearest observed nodes), and $\eta$ is a parameter which depends on the bounds on node and edge potentials in the Ising model. Necessary conditions for structural consistency under any algorithm are derived and our method nearly matches the lower bound on sample requirements. Further, the proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph.

Journal Article

A convex pseudolikelihood framework for high dimensional partial correlation estimation with convergence guarantees

by Oh, Sang-Yun , Khare, Kshitij , Rajaratnam, Bala in Convergence guarantee , Gene regulatory network , Generalized pseudo-likelihood

2014

Sparse high dimensional graphical model selection is a topic of much interest in modern day statistics. A popular approach is to apply l1-penalties to either parametric likelihoods, or regularized regression/pseudolikelihoods, with the latter having the distinct advantage that they do not explicitly assume Gaussianity. As none of the popular methods proposed for solving pseudolikelihood-based objective functions have provable convergence guarantees, it is not clear whether corresponding estimators exist or are even computable, or if they actually yield correct partial correlation graphs. We propose a new pseudolikelihood-based graphical model selection method that aims to overcome some of the shortcomings of current methods, but at the same time retain all their respective strengths. In particular, we introduce a novel framework that leads to a convex formulation of the partial covariance regression graph problem, resulting in an objective function comprised of quadratic forms. The objective is then optimized via a coordinate wise approach. The specific functional form of the objective function facilitates rigorous convergence analysis leading to convergence guarantees; an important property that cannot be established by using standard results, when the dimension is larger than the sample size, as is often the case in high dimensional applications. These convergence guarantees ensure that estimators are well defined under very general conditions and are always computable. In addition, the approach yields estimators that have good large sample properties and also respect symmetry. Furthermore, application to simulated and real data, timing comparisons and numerical convergence is demonstrated. We also present a novel unifying framework that places all graphical pseudolikelihood methods as special cases of a more general formulation, leading to important insights.

Journal Article

Model selection for Gaussian concentration graphs

by Drton, Mathias , Perlman, Michael D. in Applications , Biology, psychology, social sciences , Combinatorics

2004

A multivariate Gaussian graphical Markov model for an undirected graph G, also called a covariance selection model or concentration graph model, is defined in terms of the Markov properties, i.e. conditional independences associated with G, which in turn are equivalent to specified zeros among the set of pairwise partial correlation coefficients. By means of Fisher's z‐transformation and Šidák's correlation inequality, conservative simultaneous confidence intervals for the entire set of partial correlations can be obtained, leading to a simple method for model selection that controls the overall error rate for incorrect edge inclusion. The simultaneous p‐values corresponding to the partial correlations are partitioned into three disjoint sets, a significant set S, an indeterminate set I and a nonsignificant set N. Our model selection method selects two graphs, a graph Ĝ_SI whose edges correspond to the set S∪I, and a more conservative graph Ĝ_S whose edges correspond to S only. Similar considerations apply to covariance graph models, which are defined in terms of marginal independence rather than conditional independence. The method is applied to some well‐known examples and to simulated data.
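The combination of Fisher's z-transformation with a Šidák adjustment can be sketched roughly as follows. This is a minimal illustration, not the authors' exact procedure: it assumes the classical approximation that the z-transform of a sample partial correlation given k other variables has variance about 1/(n − k − 3), and the function name `sidak_pvalue` is hypothetical.

```python
import math


def sidak_pvalue(r, n, p):
    """Šidák-adjusted two-sided p-value for H0: partial correlation = 0.

    r : sample partial correlation of one pair given the other p - 2 variables
    n : sample size
    p : number of variables in the graph
    """
    k = p - 2                 # number of variables conditioned on
    m = p * (p - 1) // 2      # number of pairs tested simultaneously
    # Fisher's z-transform, scaled to be roughly standard normal under H0
    # (assumed variance approximation 1 / (n - k - 3)).
    stat = math.sqrt(n - k - 3) * math.atanh(r)
    # Two-sided normal tail probability: 2 * (1 - Phi(|stat|)).
    raw = math.erfc(abs(stat) / math.sqrt(2))
    # Šidák adjustment for m simultaneous tests.
    return 1.0 - (1.0 - raw) ** m


# Example: 5 variables, 50 observations, two candidate edges.
strong = sidak_pvalue(0.5, 50, 5)
weak = sidak_pvalue(0.1, 50, 5)
```

An edge would then be kept when its adjusted p-value falls below the chosen overall level; the cut-offs that separate the sets S, I and N are not given in the abstract, so they are left out of this sketch.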

Journal Article

On the impact of contaminations in graphical Gaussian models

by Pacillo, Simona , Gottard, Anna in Concentration graph models , Contaminants , Graphical models selection

2007

This paper analyzes the impact of some kinds of contaminant on model selection in graphical Gaussian models. We investigate four different kinds of contaminants, in order to consider the effect of gross errors, model deviations, and model misspecification. The aim of the work is to assess against which kinds of contaminant a model selection procedure for graphical Gaussian models has a more robust behavior. The analysis is based on simulated data. The simulation study shows that relatively few contaminated observations in even just one of the variables can have a significant impact on correct model selection, especially when the contaminated variable is a node in a separating set of the graph.

Journal Article
