Catalogue Search | MBRL

A Multi-Resolution Approximation for Massive Spatial Datasets

by Katzfuss, Matthias in Aircraft , Algorithms , Analysis of covariance

2017

Automated sensing instruments on satellites and aircraft have enabled the collection of massive amounts of high-resolution observations of spatial fields over large spatial regions. If these datasets can be efficiently exploited, they can provide new insights on a wide variety of issues. However, traditional spatial-statistical techniques such as kriging are not computationally feasible for big datasets. We propose a multi-resolution approximation (M-RA) of Gaussian processes observed at irregular locations in space. The M-RA process is specified as a linear combination of basis functions at multiple levels of spatial resolution, which can capture spatial structure from very fine to very large scales. The basis functions are automatically chosen to approximate a given covariance function, which can be nonstationary. All computations involving the M-RA, including parameter inference and prediction, are highly scalable for massive datasets. Crucially, the inference algorithms can also be parallelized to take full advantage of large distributed-memory computing environments. In comparisons using simulated data and a large satellite dataset, the M-RA outperforms a related state-of-the-art method. Supplementary materials for this article are available online.

Journal Article

A General Framework for Vecchia Approximations of Gaussian Processes

by Katzfuss, Matthias , Guinness, Joseph in Approximation , Datasets , Gaussian process

2021

Gaussian processes (GPs) are commonly used as models for functions, time series, and spatial fields, but they are computationally infeasible for large datasets. Focusing on the typical setting of modeling data as a GP plus an additive noise term, we propose a generalization of the Vecchia (J. Roy. Statist. Soc. Ser. B 50 (1988) 297–312) approach as a framework for GP approximations. We show that our general Vecchia approach contains many popular existing GP approximations as special cases, allowing for comparisons among the different methods within a unified framework. Representing the models by directed acyclic graphs, we determine the sparsity of the matrices necessary for inference, which leads to new insights regarding the computational properties. Based on these results, we propose a novel sparse general Vecchia approximation, which ensures computational feasibility for large spatial datasets but can lead to considerable improvements in approximation accuracy over Vecchia's original approach. We provide several theoretical results and conduct numerical comparisons. We conclude with guidelines for the use of Vecchia approximations in spatial statistics.

Journal Article
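
As a rough illustration of the idea this framework generalizes: a Vecchia-type approximation replaces the joint Gaussian density with a product of conditional densities, each conditioning on only a small set of previously ordered observations. The sketch below is a minimal, noise-free version in one dimension, not the general framework from the article. The zero mean, the exponential covariance, the conditioning set of size one (the immediately preceding observation), and the names exp_cov, normal_logpdf, and vecchia_loglik_m1 are all assumptions made for illustration. Locations are assumed to be distinct.

```python
import math


def exp_cov(s, t, variance=1.0, length_scale=1.0):
    """Exponential covariance between two 1-D locations (assumed model)."""
    return variance * math.exp(-abs(s - t) / length_scale)


def normal_logpdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def vecchia_loglik_m1(locs, y):
    """Approximate zero-mean GP log-likelihood, conditioning each
    observation on the single observation that precedes it in the ordering."""
    # The first observation has nothing to condition on.
    total = normal_logpdf(y[0], 0.0, exp_cov(locs[0], locs[0]))
    for i in range(1, len(y)):
        c_ii = exp_cov(locs[i], locs[i])
        c_jj = exp_cov(locs[i - 1], locs[i - 1])
        c_ij = exp_cov(locs[i], locs[i - 1])
        # Standard Gaussian conditioning on one neighbor.
        cond_mean = c_ij / c_jj * y[i - 1]
        cond_var = c_ii - c_ij ** 2 / c_jj
        total += normal_logpdf(y[i], cond_mean, cond_var)
    return total


locations = [0.0, 0.4, 1.1, 1.5, 2.3]
values = [0.2, 0.5, -0.1, -0.4, 0.9]
print(vecchia_loglik_m1(locations, values))
```

The cost grows linearly with the number of observations because each term involves only one neighbor. Larger conditioning sets would require solving a small linear system per observation instead of the scalar division used here.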

Ensemble Kalman Filter Updates Based on Regularized Sparse Inverse Cholesky Factors

by Boyles, Will , Katzfuss, Matthias in Approximation , Covariance matrix , Data assimilation

2021

The ensemble Kalman filter (EnKF) is a popular technique for data assimilation in high-dimensional nonlinear state-space models. The EnKF represents distributions of interest by an ensemble, which is a form of dimension reduction that enables straightforward forecasting even for complicated and expensive evolution operators. However, the EnKF update step involves estimation of the forecast covariance matrix based on the (often small) ensemble, which requires regularization. Many existing regularization techniques rely on spatial localization, which may ignore long-range dependence. Instead, our proposed approach assumes a sparse Cholesky factor of the inverse covariance matrix, and the nonzero Cholesky entries are further regularized. The resulting method is highly flexible and computationally scalable. In our numerical experiments, our approach was more accurate and less sensitive to misspecification of tuning parameters than tapering-based localization.

Journal Article

A Bayesian Adaptive Ensemble Kalman Filter for Sequential State and Parameter Estimation

by Stroud, Jonathan R. , Wikle, Christopher K. , Katzfuss, Matthias

2018

This paper proposes new methodology for sequential state and parameter estimation within the ensemble Kalman filter. The method is fully Bayesian and propagates the joint posterior distribution of states and parameters over time. To implement the method, the authors consider three representations of the marginal posterior distribution of the parameters: a grid-based approach, a Gaussian approximation, and a sequential importance sampling (SIR) approach with kernel resampling. In contrast to existing online parameter estimation algorithms, the new method explicitly accounts for parameter uncertainty and provides a formal way to combine information about the parameters from data at different time periods. The method is illustrated and compared to existing approaches using simulated and real data.

Journal Article

A Case Study Competition Among Methods for Analyzing Large Spatial Data

by Nychka, Douglas W. , Gerber, Florian , Guhaniyogi, Rajarshi in Agriculture , Big data , Biostatistics

2019

The Gaussian process is an indispensable tool for spatial data analysts. The onset of the “big data” era, however, has led to the traditional Gaussian process being computationally infeasible for modern spatial data. As such, various alternatives to the full Gaussian process that are more amenable to handling big spatial data have been proposed. These modern methods often exploit low-rank structures and/or multi-core and multi-threaded computing environments to facilitate computation. This study provides, first, an introductory overview of several methods for analyzing large spatial data. Second, this study describes the results of a predictive competition among the described methods as implemented by different groups with strong expertise in the methodology. Specifically, each research group was provided with two training datasets (one simulated and one observed) along with a set of prediction locations. Each group then wrote their own implementation of their method to produce predictions at the given locations and each was subsequently run on a common computing environment. The methods were then compared in terms of various predictive diagnostics.

Journal Article

A Class of Multi-Resolution Approximations for Large Spatial Datasets

by Gong, Wenlong , Katzfuss, Matthias

2020

Gaussian processes are popular and flexible models for spatial, temporal, and functional data, but they are computationally infeasible for large data sets. We discuss Gaussian-process approximations that use basis functions at multiple resolutions to achieve fast inference and that can (approximately) represent any spatial covariance structure. We consider two special cases of this multi-resolution approximation framework, namely a taper version and a domain-partitioning (block) version. We describe theoretical properties and inference procedures, and study the computational complexity of the methods. Numerical comparisons and an application to satellite data are also provided.

Journal Article

Spatial Surface Reflectance Retrievals for Visible/Shortwave Infrared Remote Sensing via Gaussian Process Priors

by Braverman, Amy , Katzfuss, Matthias , Hobbs, Jonathan in Aerosols , Algorithms , Atmosphere

2022

Remote Visible/Shortwave Infrared (VSWIR) imaging spectroscopy is a powerful tool for measuring the composition of Earth’s surface over wide areas. This compositional information is captured by the spectral surface reflectance, where distinct shapes and absorption features indicate the chemical, bio- and geophysical properties of the materials in the scene. Estimating this surface reflectance requires removing the influence of atmospheric distortions caused by water vapor and particles. Traditionally reflectance is estimated by considering one location at a time, disentangling atmospheric and surface effects independently at all locations in a scene. However, this approach does not take advantage of spatial correlations between contiguous pixels. We propose an extension to a common Bayesian approach, Optimal Estimation, by introducing atmospheric correlations into the multivariate Gaussian prior. We show how this approach can be implemented as a small change to the traditional estimation procedure, thus limiting the additional computational burden. We demonstrate a simple version of the technique using simulations and multiple airborne radiance data sets. Our results show that the predicted atmospheric fields are smoother and more realistic than independent inversions given the assumption of spatial correlation and may reduce bias in the surface reflectance retrievals compared to post-process smoothing.

Journal Article

Spatial Retrievals of Atmospheric Carbon Dioxide from Satellite Observations

by Katzfuss, Matthias , Hobbs, Jonathan , Zilber, Daniel in Algorithms , Atmospheric aerosols , Bayesian analysis

2021

Modern remote-sensing retrievals often invoke a Bayesian approach to infer atmospheric properties from observed radiances. In this approach, plausible mean states and variability for the quantities of interest are encoded in a prior distribution. Recent developments have devised prior assumptions for the correlation among atmospheric constituents and across observing locations. This work formulates a spatial statistical framework for simultaneous multi-footprint retrievals of carbon dioxide (CO2) with application to the Orbiting Carbon Observatory-2/3 (OCO-2/3). Formally, the retrieval state vector is extended to include atmospheric and surface conditions at many footprints in a small region, and a prior distribution that assumes spatial correlation across these locations is assumed. This spatial prior allows the length-scale, or range, of spatial correlation to vary between different elements of the state vector. Various single- and multi-footprint retrievals are compared in a simulation study. A spatial prior that also includes relatively large prior variances for CO2 results in posterior inferences that most accurately represent the true state and that reduce the correlation in retrieval error across locations.

Journal Article

Vecchia Approximations of Gaussian-Process Predictions

by Gong, Wenlong , Katzfuss, Matthias , Guinness, Joseph in Agriculture , Analysis , artificial intelligence

2020

Gaussian processes (GPs) are highly flexible function estimators used for geospatial analysis, nonparametric regression, and machine learning, but they are computationally infeasible for large datasets. Vecchia approximations of GPs have been used to enable fast evaluation of the likelihood for parameter inference. Here, we study Vecchia approximations of spatial predictions at observed and unobserved locations, including obtaining joint predictive distributions at large sets of locations. We consider a general Vecchia framework for GP predictions, which contains some novel and some existing special cases. We study the accuracy and computational properties of these approaches theoretically and numerically, proving that our new methods exhibit linear computational complexity in the total number of spatial locations. We show that certain choices within the framework can have a strong effect on uncertainty quantification and computational cost, which leads to specific recommendations on which methods are most suitable for various settings. We also apply our methods to a satellite dataset of chlorophyll fluorescence, showing that the new methods are faster or more accurate than existing methods and reduce unrealistic artifacts in prediction maps.

Journal Article

Understanding the Ensemble Kalman Filter

by Stroud, Jonathan R. , Wikle, Christopher K. , Katzfuss, Matthias in Algorithms , Application , Bayesian inference

2016

The ensemble Kalman filter (EnKF) is a computational technique for approximate inference in state-space models. In typical applications, the state vectors are large spatial fields that are observed sequentially over time. The EnKF approximates the Kalman filter by representing the distribution of the state with an ensemble of draws from that distribution. The ensemble members are updated based on newly available data by shifting instead of reweighting, which allows the EnKF to avoid the degeneracy problems of reweighting-based algorithms. Taken together, the ensemble representation and shifting-based updates make the EnKF computationally feasible even for extremely high-dimensional state spaces. The EnKF is successfully used in data-assimilation applications with tens of millions of dimensions. While it implicitly assumes a linear Gaussian state-space model, it has also turned out to be remarkably robust to deviations from these assumptions in many applications. Despite its successes, the EnKF is largely unknown in the statistics community. We aim to change that with the present article, and to entice more statisticians to work on this topic.

Journal Article
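
The shifting-based update described in the abstract above can be sketched for the simplest possible case. The example below is a minimal illustration, not the article's algorithm: it assumes a scalar state that is observed directly with Gaussian noise, and it uses the perturbed-observation (stochastic) variant of the update. The function name enkf_update and all parameter values are hypothetical.

```python
import random
import statistics


def enkf_update(ensemble, y_obs, obs_var, rng):
    """Perturbed-observation EnKF update for a scalar state observed directly."""
    # The forecast variance is estimated from the ensemble itself.
    forecast_var = statistics.variance(ensemble)
    # Kalman gain for a scalar state with identity observation operator.
    gain = forecast_var / (forecast_var + obs_var)
    updated = []
    for x in ensemble:
        # Each member sees its own noisy copy of the observation.
        y_perturbed = y_obs + rng.gauss(0.0, obs_var ** 0.5)
        # Shift the member toward the data; no weights are involved.
        updated.append(x + gain * (y_perturbed - x))
    return updated


rng = random.Random(0)
# Forecast ensemble: 500 draws from an assumed N(0, 2^2) forecast distribution.
forecast = [rng.gauss(0.0, 2.0) for _ in range(500)]
analysis = enkf_update(forecast, y_obs=3.0, obs_var=1.0, rng=rng)
print(statistics.mean(analysis), statistics.variance(analysis))
```

Because every member is moved rather than reweighted, all members remain equally weighted after the update, which is how this kind of scheme avoids the weight degeneracy mentioned in the abstract.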
