Catalogue Search | MBRL
82 result(s) for "Musco, Christopher"
Low-memory Krylov subspace methods for optimal rational matrix function approximation
by Musco, Christopher; Greenbaum, Anne; Chen, Tyler
in Algorithms; Approximation; Mathematical analysis
2023
We describe a Lanczos-based algorithm for approximating the product of a rational matrix function with a vector. This algorithm, which we call the Lanczos method for optimal rational matrix function approximation (Lanczos-OR), returns the optimal approximation from a given Krylov subspace in a norm depending on the rational function's denominator, and can be computed using the information from a slightly larger Krylov subspace. We also provide a low-memory implementation which only requires storing a number of vectors proportional to the denominator degree of the rational function. Finally, we show that Lanczos-OR can be used to derive algorithms for computing other matrix functions, including the matrix sign function and quadrature based rational function approximations. In many cases, it improves on the approximation quality of prior approaches, including the standard Lanczos method, with little additional computational overhead.
Provably Accurate Shapley Value Estimation via Leverage Score Sampling
2025
Originally introduced in game theory, Shapley values have emerged as a central tool in explainable machine learning, where they are used to attribute model predictions to specific input features. However, computing Shapley values exactly is expensive: for a general model with \\(n\\) features, \\(O(2^n)\\) model evaluations are necessary. To address this issue, approximation algorithms are widely used. One of the most popular is the Kernel SHAP algorithm, which is model agnostic and remarkably effective in practice. However, to the best of our knowledge, Kernel SHAP has no strong non-asymptotic complexity guarantees. We address this issue by introducing Leverage SHAP, a light-weight modification of Kernel SHAP that provides provably accurate Shapley value estimates with just \\(O(n\\log n)\\) model evaluations. Our approach takes advantage of a connection between Shapley value estimation and agnostic active learning by employing leverage score sampling, a powerful regression tool. Beyond theoretical guarantees, we show that Leverage SHAP consistently outperforms even the highly optimized implementation of Kernel SHAP available in the ubiquitous SHAP library [Lundberg & Lee, 2017].
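The regression tool the abstract leans on, leverage score sampling, can be illustrated on a generic least-squares problem. This is a generic sketch of the technique, not the Leverage SHAP algorithm itself; all names are our own:

```python
import numpy as np

def leverage_scores(A):
    """Row leverage scores of A via a thin QR factorization:
    tau_i = ||Q[i, :]||^2, where A = QR with Q orthonormal."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

def leverage_sample_lstsq(A, y, m, rng):
    """Sketched least squares: sample m rows with probability proportional
    to leverage, reweight for unbiasedness, and solve the small problem."""
    tau = leverage_scores(A)
    p = tau / tau.sum()
    idx = rng.choice(len(y), size=m, replace=True, p=p)
    w = 1.0 / np.sqrt(m * p[idx])  # importance-sampling reweighting
    x = np.linalg.lstsq(w[:, None] * A[idx], w * y[idx], rcond=None)[0]
    return x
```

Rows with high leverage are influential for the regression fit, so sampling by leverage preserves the least-squares solution with far fewer rows than uniform sampling would need; Leverage SHAP applies this idea to the weighted regression underlying Kernel SHAP.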
Stability of the Lanczos Method for Matrix Function Approximation
by Musco, Christopher; Sidford, Aaron; Musco, Cameron
in Approximation; Arithmetic; Chebyshev approximation
2024
The ubiquitous Lanczos method can approximate \\(f(A)x\\) for any symmetric \\(n \\times n\\) matrix \\(A\\), vector \\(x\\), and function \\(f\\). In exact arithmetic, the method's error after \\(k\\) iterations is bounded by the error of the best degree-\\(k\\) polynomial uniformly approximating \\(f(x)\\) on the range \\([\\lambda_{min}(A), \\lambda_{max}(A)]\\). However, despite decades of work, it has been unclear if this powerful guarantee holds in finite precision. We resolve this problem, proving that when \\(\\max_{x \\in [\\lambda_{min}, \\lambda_{max}]}|f(x)| \\le C\\), Lanczos essentially matches the exact arithmetic guarantee if computations use roughly \\(\\log(nC\\|A\\|)\\) bits of precision. Our proof extends work of Druskin and Knizhnerman [DK91], leveraging the stability of the classic Chebyshev recurrence to bound the stability of any polynomial approximating \\(f(x)\\). We also study the special case of \\(f(A) = A^{-1}\\), where stronger guarantees hold. In exact arithmetic Lanczos performs as well as the best polynomial approximating \\(1/x\\) at each of \\(A\\)'s eigenvalues, rather than on the full eigenvalue range. In seminal work, Greenbaum gives an approach to extending this bound to finite precision: she proves that finite precision Lanczos and the related CG method match any polynomial approximating \\(1/x\\) in a tiny range around each eigenvalue [Gre89]. For \\(A^{-1}\\), this bound appears stronger than ours. However, we exhibit matrices with condition number \\(\\kappa\\) where exact arithmetic Lanczos converges in \\(polylog(\\kappa)\\) iterations, but Greenbaum's bound predicts \\(\\Omega(\\kappa^{1/5})\\) iterations. It thus cannot offer significant improvement over the \\(O(\\kappa^{1/2})\\) bound achievable via our result. 
Our analysis raises the question of whether convergence in fewer than \\(poly(\\kappa)\\) iterations can be expected in finite precision, even for matrices with clustered, skewed, or otherwise favorable eigenvalue distributions.
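The classic Chebyshev recurrence whose stability the proof above leverages can be sketched directly: a function of a symmetric matrix applied to a vector, evaluated through the three-term recurrence \\(T_{j+1}(x) = 2xT_j(x) - T_{j-1}(x)\\). This is a generic illustration, not the paper's analysis; names are our own:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_matvec_series(A, b, coeffs, lmin, lmax):
    """Evaluate sum_j c_j T_j(As) b via the three-term recurrence, where
    As is A with its spectrum affinely mapped from [lmin, lmax] to [-1, 1]."""
    n = len(b)
    As = (2.0 * A - (lmax + lmin) * np.eye(n)) / (lmax - lmin)
    t_prev, t_cur = b, As @ b  # T_0(As) b and T_1(As) b
    out = coeffs[0] * t_prev + (coeffs[1] * t_cur if len(coeffs) > 1 else 0)
    for c in coeffs[2:]:
        # T_{j+1} = 2 x T_j - T_{j-1}, applied as matrix-vector products.
        t_prev, t_cur = t_cur, 2.0 * (As @ t_cur) - t_prev
        out = out + c * t_cur
    return out
```

Because each step of this recurrence amplifies rounding errors only mildly, any polynomial expressed in the Chebyshev basis can be evaluated stably; the paper's argument bounds finite-precision Lanczos by relating it to such evaluations.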
Sublinear Time Spectral Density Estimation
by Musco, Christopher; Braverman, Vladimir; Krishnan, Aditya
in Algorithms; Approximation; Chebyshev approximation
2022
We present a new sublinear time algorithm for approximating the spectral density (eigenvalue distribution) of an \\(n\\times n\\) normalized graph adjacency or Laplacian matrix. The algorithm recovers the spectrum up to \\(\\epsilon\\) accuracy in the Wasserstein-1 distance in \\(O(n\\cdot \\text{poly}(1/\\epsilon))\\) time given sample access to the graph. This result complements recent work by David Cohen-Steiner, Weihao Kong, Christian Sohler, and Gregory Valiant (2018), which obtains a solution with runtime independent of \\(n\\), but exponential in \\(1/\\epsilon\\). We conjecture that the trade-off between dimension dependence and accuracy is inherent. Our method is simple and works well experimentally. It is based on a Chebyshev polynomial moment matching method that employs randomized estimators for the matrix trace. We prove that, for any Hermitian \\(A\\), this moment matching method returns an \\(\\epsilon\\) approximation to the spectral density using just \\(O(1/\\epsilon)\\) matrix-vector products with \\(A\\). By leveraging stability properties of the Chebyshev polynomial three-term recurrence, we then prove that the method is amenable to the use of coarse approximate matrix-vector products. Our sublinear time algorithm follows from combining this result with a novel sampling algorithm for approximating matrix-vector products with a normalized graph adjacency matrix. Of independent interest, we show a similar result for the widely used kernel polynomial method (KPM), proving that this practical algorithm nearly matches the theoretical guarantees of our moment matching method. Our analysis uses tools from Jackson's seminal work on approximation with positive polynomial kernels.
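The core primitive of the moment matching approach above, estimating Chebyshev moments \\(\\mathrm{tr}(T_j(A))/n\\) with randomized trace estimators, admits a short sketch. This is a generic illustration combining Hutchinson's estimator with the Chebyshev recurrence, not the paper's full algorithm; names are our own:

```python
import numpy as np

def cheb_moments(A, num_moments, num_vecs, lmin, lmax, rng):
    """Estimate normalized Chebyshev moments tr(T_j(As))/n, j = 0..num_moments-1,
    via Hutchinson's estimator: average g^T T_j(As) g over random sign vectors g.
    As is A with spectrum affinely mapped from [lmin, lmax] to [-1, 1]."""
    n = A.shape[0]
    As = (2.0 * A - (lmax + lmin) * np.eye(n)) / (lmax - lmin)
    mom = np.zeros(num_moments)
    for _ in range(num_vecs):
        g = rng.choice([-1.0, 1.0], size=n)
        t_prev, t_cur = g, As @ g  # T_0(As) g and T_1(As) g
        mom[0] += g @ t_prev
        if num_moments > 1:
            mom[1] += g @ t_cur
        for j in range(2, num_moments):
            t_prev, t_cur = t_cur, 2.0 * (As @ t_cur) - t_prev
            mom[j] += g @ t_cur
    return mom / (num_vecs * n)
```

Each moment costs one matrix-vector product per random vector, and the estimated moments can then be fed to a density reconstruction step (moment matching or KPM-style damping) to recover the spectral density.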
Benchmarking Estimators for Natural Experiments: A Novel Dataset and a Doubly Robust Algorithm
2024
Estimating the effect of treatments from natural experiments, where treatments are pre-assigned, is an important and well-studied problem. We introduce a novel natural experiment dataset obtained from an early childhood literacy nonprofit. Surprisingly, applying over 20 established estimators to the dataset produces inconsistent results in evaluating the nonprofit's efficacy. To address this, we create a benchmark to evaluate estimator accuracy using synthetic outcomes, whose design was guided by domain experts. The benchmark extensively explores performance as real world conditions like sample size, treatment correlation, and propensity score accuracy vary. Based on our benchmark, we observe that the class of doubly robust treatment effect estimators, which are based on simple and intuitive regression adjustment, generally outperform other more complicated estimators by orders of magnitude. To better support our theoretical understanding of doubly robust estimators, we derive a closed form expression for the variance of any such estimator that uses dataset splitting to obtain an unbiased estimate. This expression motivates the design of a new doubly robust estimator that uses a novel loss function when fitting functions for regression adjustment. We release the dataset and benchmark in a Python package; the package is built in a modular way to facilitate new datasets and estimators.
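A minimal sketch of the doubly robust (AIPW-style) estimator family discussed above, using plain least-squares outcome models and given propensity scores. This is a textbook illustration of the estimator class, not the paper's new estimator or benchmark; all names are our own:

```python
import numpy as np

def aipw_ate(X, t, y, e):
    """Doubly robust (AIPW) estimate of the average treatment effect.
    X: covariates, t: binary treatment, y: outcomes,
    e: propensity scores P(t=1 | X), assumed known or pre-estimated.
    Outcome models are OLS fit separately on treated and control units."""
    Xb = np.column_stack([np.ones(len(y)), X])  # add intercept column
    beta1 = np.linalg.lstsq(Xb[t == 1], y[t == 1], rcond=None)[0]
    beta0 = np.linalg.lstsq(Xb[t == 0], y[t == 0], rcond=None)[0]
    mu1, mu0 = Xb @ beta1, Xb @ beta0  # predicted potential outcomes
    # Regression adjustment plus inverse-propensity-weighted residual correction:
    # unbiased if either the outcome models or the propensities are correct.
    dr1 = mu1 + t * (y - mu1) / e
    dr0 = mu0 + (1 - t) * (y - mu0) / (1 - e)
    return np.mean(dr1 - dr0)
```

The "doubly robust" property is visible in the two terms: if the outcome models are right, the residual corrections vanish in expectation; if the propensities are right, the weighted residuals correct any outcome-model bias.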
Error bounds for Lanczos-based matrix function approximation
by Musco, Christopher; Greenbaum, Anne; Chen, Tyler
in Approximation; Cauchy integral formula; Eigenvalues
2022
We analyze the Lanczos method for matrix function approximation (Lanczos-FA), an iterative algorithm for computing \\(f(\\mathbf{A}) \\mathbf{b}\\) when \\(\\mathbf{A}\\) is a Hermitian matrix and \\(\\mathbf{b}\\) is a given vector. Assuming that \\(f : \\mathbb{C} \\rightarrow \\mathbb{C}\\) is piecewise analytic, we give a framework, based on the Cauchy integral formula, which can be used to derive a priori and a posteriori error bounds for Lanczos-FA in terms of the error of Lanczos used to solve linear systems. Unlike many error bounds for Lanczos-FA, these bounds account for fine-grained properties of the spectrum of \\(\\mathbf{A}\\), such as clustered or isolated eigenvalues. Our results are derived assuming exact arithmetic, but we show that they are easily extended to finite precision computations using existing theory about the Lanczos algorithm in finite precision. We also provide generalized bounds for the Lanczos method used to approximate quadratic forms \\(\\mathbf{b}^\\textsf{H} f(\\mathbf{A}) \\mathbf{b}\\), and demonstrate the effectiveness of our bounds with numerical experiments.
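The quadratic-form variant mentioned at the end of the abstract, approximating \\(\\mathbf{b}^\\textsf{H} f(\\mathbf{A}) \\mathbf{b}\\), has a compact Gauss-quadrature interpretation: run Lanczos and evaluate \\(\\|\\mathbf{b}\\|^2\\, \\mathbf{e}_1^\\textsf{T} f(T_k)\\, \\mathbf{e}_1\\). The sketch below is a standard illustration of this identity, not the paper's bounds; names are our own and full reorthogonalization is used:

```python
import numpy as np

def lanczos_quad_form(A, b, f, k):
    """Estimate b^T f(A) b from k Lanczos steps on symmetric A:
    ||b||^2 * e1^T f(T) e1, i.e. a Gauss quadrature rule for the
    spectral measure of A weighted by b."""
    n = len(b)
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k)
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w -= alpha[j] * Q[:, j]
        if j > 0:
            w -= beta[j - 1] * Q[:, j - 1]
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)  # reorthogonalize
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    theta, S = np.linalg.eigh(T)
    # Quadrature nodes are Ritz values theta; weights are squared first
    # components of the eigenvectors of T.
    return np.linalg.norm(b) ** 2 * np.sum(S[0, :] ** 2 * f(theta))
```

The paper's framework bounds the error of such approximations in terms of the error of Lanczos applied to related linear systems, which this quadrature view makes natural for rational and piecewise analytic \\(f\\).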