Catalogue Search | MBRL
40 result(s) for "Cisewski, Jessi"
Generalized Fiducial Inference for Normal Linear Mixed Models
2012
While linear mixed modeling methods are foundational concepts introduced in any statistical education, adequate general methods for interval estimation involving models with more than a few variance components are lacking, especially in the unbalanced setting. Generalized fiducial inference provides a possible framework that accommodates this absence of methodology. Under the fabric of generalized fiducial inference along with sequential Monte Carlo methods, we present an approach for interval estimation for both balanced and unbalanced Gaussian linear mixed models. We compare the proposed method to classical and Bayesian results in the literature in a simulation study of two-fold nested models and two-factor crossed designs with an interaction term. The proposed method is found to be competitive or better when evaluated based on frequentist criteria of empirical coverage and average length of confidence intervals for small sample sizes. A MATLAB implementation of the proposed algorithm is available from the authors.
Journal Article
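The structural-equation idea behind generalized fiducial inference can be illustrated on a toy one-sample normal model: write the data as y_i = mu + sigma * z_i, resample the latent errors, and solve for the parameters. This is only a minimal sketch with invented data, not the paper's mixed-model algorithm, which requires sequential Monte Carlo over many variance components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y_i = mu + sigma * z_i with z_i ~ N(0, 1).
y = rng.normal(loc=5.0, scale=2.0, size=20)
n = len(y)

# Fiducial recipe (toy version): resample the latent z-vector, then
# solve the structural equations for (mu, sigma) by matching the
# sample mean and standard deviation.
samples = []
for _ in range(20000):
    z = rng.normal(size=n)
    sigma_f = y.std(ddof=1) / z.std(ddof=1)  # solve s_y = sigma * s_z
    mu_f = y.mean() - sigma_f * z.mean()     # solve ybar = mu + sigma * zbar
    samples.append(mu_f)

# Percentiles of the fiducial sample give an interval estimate for mu.
lo, hi = np.percentile(samples, [2.5, 97.5])
print(f"95% fiducial interval for mu: ({lo:.2f}, {hi:.2f})")
```

In this simple case the fiducial interval for the mean reproduces the classical t-interval; the value of the framework, per the abstract, is that the same recipe extends to unbalanced mixed models where no exact interval exists.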
Completing the Results of the 2013 Boston Marathon
by Cisewski, Jessi; Dominici, Francesca; Paulson, Charles
in Algorithms; Analysis; Biology and Life Sciences
2014
The 2013 Boston marathon was disrupted by two bombs placed near the finish line. The bombs resulted in three deaths and several hundred injuries. Of lesser concern, in the immediate aftermath, was the fact that nearly 6,000 runners failed to finish the race. We were approached by the marathon's organizers, the Boston Athletic Association (BAA), and asked to recommend a procedure for projecting finish times for the runners who could not complete the race. With assistance from the BAA, we created a dataset consisting of all the runners in the 2013 race who reached the halfway point but failed to finish, as well as all runners from the 2010 and 2011 Boston marathons. The data consist of split times from each of the 5 km sections of the course, as well as the final 2.2 km (from 40 km to the finish). The statistical objective is to predict the missing split times for the runners who failed to finish in 2013. We set this problem in the context of the matrix completion problem, examples of which include imputing missing data in DNA microarray experiments, and the Netflix prize problem. We propose five prediction methods and create a validation dataset to measure their performance by mean squared error and other measures. The best method used local regression based on a K-nearest-neighbors algorithm (KNN method), though several other methods produced results of similar quality. We show how the results were used to create projected times for the 2013 runners and discuss potential for future application of the same methodology. We present the whole project as an example of reproducible research, in that we are able to make the full data and all the algorithms we have used publicly available, which may facilitate future research extending the methods or proposing completely different approaches.
Journal Article
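The K-nearest-neighbors idea the abstract describes, predicting a non-finisher's missing splits from runners with similar observed splits, can be sketched in a few lines. The data here are synthetic stand-ins, not the BAA dataset, and this is plain KNN averaging rather than the paper's local-regression variant.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic split-time matrix: rows = runners, cols = 5 km splits (minutes).
# Each runner has a pace level, plus noise and a mild late-race slowdown.
base = rng.uniform(18, 35, size=(200, 1))
splits = base + rng.normal(0, 1.0, size=(200, 8)) + np.linspace(0, 3, 8)

# A runner who stopped at halfway: first 4 splits observed, last 4 missing.
incomplete = splits[0].copy()
observed, missing = slice(0, 4), slice(4, 8)

# KNN imputation: find the K complete runners whose observed splits are
# closest, then average their later splits as the prediction.
K = 10
complete = splits[1:]
d = np.linalg.norm(complete[:, observed] - incomplete[observed], axis=1)
neighbors = np.argsort(d)[:K]
prediction = complete[neighbors][:, missing].mean(axis=0)
print("predicted remaining splits (min):", np.round(prediction, 1))
```

Replacing the plain average with a regression fit over the neighbors' splits would move this sketch closer to the local-regression KNN method the abstract reports as best.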
Standards for Modest Bayesian Credences
by Schervish, Mark J.; Cisewski, Jessi; Stern, Rafael
in Attitudes; Bayesian analysis; Conditioning
2018
Gordon Belot argues that Bayesian theory is epistemologically immodest. In response, we show that the topological conditions that underpin his criticisms of asymptotic Bayesian conditioning are self-defeating. They require extreme a priori credences regarding, for example, the limiting behavior of observed relative frequencies. We offer a different explication of Bayesian modesty using a goal of consensus: rival scientific opinions should be responsive to new facts as a way to resolve their disputes. We also address Adam Elga's rebuttal to Belot's analysis, which focuses attention on the role that the assumption of countable additivity plays in Belot's criticisms.
Journal Article
Sleeping Beauty’s Credences
by Schervish, Mark J.; Cisewski, Jessi; Stern, Rafael
in Bayesian analysis; Beauty; Conditioning
2016
The Sleeping Beauty problem has spawned a debate between “thirders” and “halfers” who draw conflicting conclusions about Sleeping Beauty's credence that a coin lands heads. Our analysis is based on a probability model for what Sleeping Beauty knows at each time during the experiment. We show that conflicting conclusions result from different modeling assumptions that each group makes. Our analysis uses a standard “Bayesian” account of rational belief with conditioning. No special handling is used for self-locating beliefs or centered propositions. We also explore what fair prices Sleeping Beauty computes for gambles that she might be offered during the experiment.
Journal Article
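The two reference frequencies the rival camps appeal to are easy to exhibit by simulation: count heads per awakening versus heads per experiment. This is only a frequency illustration of why the camps disagree, not the probability model for Sleeping Beauty's knowledge that the paper develops.

```python
import random

random.seed(0)
N = 100_000  # number of simulated runs of the experiment

heads_awakenings = 0
total_awakenings = 0
heads_experiments = 0
for _ in range(N):
    heads = random.random() < 0.5
    wakes = 1 if heads else 2  # heads: Monday only; tails: Monday and Tuesday
    total_awakenings += wakes
    heads_awakenings += wakes if heads else 0
    heads_experiments += heads

# "Thirders" point to the per-awakening frequency (~1/3);
# "halfers" point to the per-experiment frequency (~1/2).
print("per-awakening frequency of heads:",
      round(heads_awakenings / total_awakenings, 3))
print("per-experiment frequency of heads:",
      round(heads_experiments / N, 3))
```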
A Hermite–Gaussian Based Exoplanet Radial Velocity Estimation Method
2021
As the first successful technique used to detect exoplanets orbiting distant stars, the radial velocity method aims to detect a periodic Doppler shift in a stellar spectrum due to the star's motion along the line of sight. We introduce a new, mathematically rigorous approach to detect such a signal that accounts for the smooth functional relationship of neighboring wavelengths in the spectrum, minimizes the role of wavelength interpolation, accounts for heteroskedastic noise and easily allows for accurate calculation of the estimated radial velocity standard error. Using Hermite–Gaussian functions, we show that the problem of detecting a Doppler shift in the spectrum can be reduced to linear regression in many settings. A simulation study demonstrates that the proposed method is able to accurately estimate an individual spectrum's radial velocity with precision below 0.3 m s⁻¹, corresponding to a Doppler shift much smaller than the size of a spectral pixel. Furthermore, the new method outperforms the traditional cross-correlation function approach for estimating the radial velocity by reducing the root mean squared error up to 15 cm s⁻¹. The proposed method is also demonstrated on a new set of observations from the EXtreme PREcision Spectrometer (EXPRES) for the host star 51 Pegasi, and successfully recovers estimates of the planetary companion's parameters that agree well with previous studies. The method is implemented in the R package rvmethod, and supplemental Python code is also available.
Journal Article
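The reduction of Doppler-shift estimation to linear regression rests on a first-order expansion: a small shift changes the spectrum by roughly -(v/c) * lambda * f'(lambda), so v appears as a single regression coefficient. The sketch below demonstrates that expansion on a single invented Gaussian absorption line; the paper's method instead uses a Hermite-Gaussian basis fit, which this simplification does not reproduce.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

# Template spectrum: one Gaussian absorption line (toy model).
wave = np.linspace(5000.0, 5001.0, 400)  # wavelength grid, Angstroms
center, width, depth = 5000.5, 0.05, 0.6
template = 1.0 - depth * np.exp(-0.5 * ((wave - center) / width) ** 2)

# Observed spectrum: the same line Doppler-shifted by v = 100 m/s.
v_true = 100.0
shifted = center * (1.0 + v_true / C)
observed = 1.0 - depth * np.exp(-0.5 * ((wave - shifted) / width) ** 2)

# First order: observed - template ~ -(v/c) * lambda * d(template)/d(lambda),
# so the radial velocity is a one-parameter least-squares coefficient.
deriv = np.gradient(template, wave)
x = -wave * deriv / C
beta = np.sum(x * (observed - template)) / np.sum(x * x)
print(f"estimated radial velocity: {beta:.1f} m/s")
```

Because the shift (about 1.7 mA here) is far smaller than the line width, the linearization is accurate and the regression recovers v to within a few m/s.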
High-energy Neutrino Source Cross-correlations with Nearest-neighbor Distributions
by Fang, Ke; Cisewski-Kehe, Jessi; Banerjee, Arka
in Cross correlation; Distribution functions; Galaxies
2025
The astrophysical origins of the majority of the IceCube neutrinos remain unknown. Effectively characterizing the spatial distribution of the neutrino samples and associating the events with astrophysical source catalogs can be challenging given the large atmospheric neutrino background and underlying non-Gaussian spatial features in the neutrino and source samples. In this paper, we investigate a framework for identifying and statistically evaluating the cross-correlations between IceCube data and an astrophysical source catalog based on the k-nearest-neighbor cumulative distribution functions (kNN-CDFs). We propose a maximum likelihood estimation procedure for inferring the true proportions of astrophysical neutrinos in the point-source data. We conduct a statistical power analysis of an associated likelihood ratio test with estimations of its sensitivity and discovery potential with synthetic neutrino data samples and a WISE-2MASS galaxy sample. We apply the method to IceCube's public ten-year point-source data and find no statistically significant evidence for spatial cross-correlations with the selected galaxy sample. We discuss possible extensions to the current method and explore the method's potential to identify the cross-correlation signals in data sets with different sample sizes.
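The basic kNN-CDF statistic is the empirical distribution of distances from one sample to the k-th nearest point of the other; correlated samples yield a CDF that rises faster than an uncorrelated one. A minimal 2D sketch with invented "source" and "event" samples (standing in for the catalog and neutrino data, and ignoring sky geometry and backgrounds):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)

# Toy 2D "sky": clustered sources, plus events drawn either around the
# same cluster centers (correlated) or uniformly (background-like).
centers = rng.uniform(0, 1, size=(20, 2))
sources = centers[rng.integers(0, 20, 500)] + rng.normal(0, 0.01, (500, 2))
events_corr = centers[rng.integers(0, 20, 300)] + rng.normal(0, 0.01, (300, 2))
events_unif = rng.uniform(0, 1, size=(300, 2))

def knn_cdf(data, query, k, radii):
    """Empirical CDF of the distance from each query point to its
    k-th nearest neighbor in `data`."""
    d, _ = cKDTree(data).query(query, k=k)
    dk = d if k == 1 else d[:, -1]
    return np.array([(dk <= r).mean() for r in radii])

radii = np.linspace(0.0, 0.2, 50)
cdf_corr = knn_cdf(sources, events_corr, k=1, radii=radii)
cdf_unif = knn_cdf(sources, events_unif, k=1, radii=radii)

# Correlated events sit closer to sources, so their kNN-CDF rises faster.
print("CDF at r ~ 0.04:", cdf_corr[10], "vs", cdf_unif[10])
```

The paper's framework builds a likelihood on such summaries to estimate the astrophysical proportion and test for association; this sketch only shows the underlying distance statistic.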
MaxTDA: Robust Statistical Inference for Maximal Persistence in Topological Data Analysis
2025
Persistent homology is an area within topological data analysis (TDA) that can uncover different dimensional holes (connected components, loops, voids, etc.) in data. The holes are characterized, in part, by how long they persist across different scales. Noisy data can result in many additional holes that are not true topological signal. Various robust TDA techniques have been proposed to reduce the number of noisy holes; however, these robust methods have a tendency to also reduce the topological signal. This work introduces Maximal TDA (MaxTDA), a statistical framework addressing a limitation in TDA wherein robust inference techniques systematically underestimate the persistence of significant homological features. MaxTDA combines kernel density estimation with level-set thresholding via rejection sampling to generate consistent estimators for maximal persistence features that minimize bias while maintaining robustness to noise and outliers. We establish the consistency of the sampling procedure and the stability of the maximal persistence estimator. The framework also enables statistical inference on topological features through rejection bands, constructed from quantiles that bound the estimator's deviation probability. MaxTDA is particularly valuable in applications where precise quantification of statistically significant topological features is essential for revealing underlying structural properties in complex datasets. Numerical simulations across varied datasets, including an example from exoplanet astronomy, highlight the effectiveness of MaxTDA in recovering true topological signals.
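The "kernel density estimation with level-set thresholding via rejection sampling" step can be sketched directly: fit a KDE, then keep only resampled points whose estimated density exceeds a threshold, which discards outliers while preserving the high-density loop. All data and the threshold/bandwidth choices below are invented for illustration, and no claim is made that this matches MaxTDA's estimator.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)

# Noisy circle with sparse outliers: the loop is the topological signal.
theta = rng.uniform(0, 2 * np.pi, 400)
circle = np.c_[np.cos(theta), np.sin(theta)] + rng.normal(0, 0.05, (400, 2))
outliers = rng.uniform(-2, 2, size=(40, 2))
data = np.vstack([circle, outliers])

# Kernel density estimate with a narrow bandwidth (tuning choice), then
# keep only the superlevel set: resample from the KDE and reject points
# whose density falls below a quantile of the data densities.
kde = gaussian_kde(data.T, bw_method=0.15)
dens = kde(data.T)
level = np.quantile(dens, 0.25)  # threshold (tuning choice)

resampled = kde.resample(5000, seed=4).T
keep = resampled[kde(resampled.T) >= level]
print(f"kept {len(keep)} of 5000 resampled points")

# Surviving points hug the unit circle, so persistent homology computed
# on `keep` recovers the loop with far less noise.
radii = np.linalg.norm(keep, axis=1)
print("median radius of kept points:", np.round(np.median(radii), 2))
```

Running persistent homology on `keep` rather than `data` is the sense in which the rejection step trades a tuning threshold for robustness to outliers.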
A Subsequence Approach to Topological Data Analysis for Irregularly-Spaced Time Series
2024
A time-delay embedding (TDE), grounded in the framework of Takens's Theorem, provides a mechanism to represent and analyze the inherent dynamics of time-series data. Recently, topological data analysis (TDA) methods have been applied to study this time series representation mainly through the lens of persistent homology. Current literature on the fusion of TDE and TDA is adept at analyzing uniformly-spaced time series observations. This work introduces a novel subsequence embedding method for irregularly-spaced time-series data. We show that this method preserves the original state space topology while reducing spurious homological features. Theoretical stability results and convergence properties of the proposed method in the presence of noise and varying levels of irregularity in the spacing of the time series are established. Numerical studies and an application to real data illustrate the performance of the proposed method.
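The standard uniformly-spaced TDE that this work generalizes maps a scalar series to delay vectors (x[t], x[t+d], ..., x[t+(m-1)d]); a periodic signal then traces a closed loop, which persistent homology detects as a 1-dimensional hole. A minimal sketch of that baseline construction (the paper's subsequence embedding for irregular spacing is not reproduced here):

```python
import numpy as np

def time_delay_embedding(x, dim, delay):
    """Map a scalar series x into dim-dimensional delay vectors
    (x[t], x[t+delay], ..., x[t+(dim-1)*delay])."""
    n = len(x) - (dim - 1) * delay
    return np.column_stack([x[i * delay : i * delay + n] for i in range(dim)])

# A sine wave sampled uniformly; delay ~ quarter period, so the 2D
# embedding is approximately (sin t, cos t): a circle.
t = np.linspace(0, 4 * np.pi, 400)
x = np.sin(t)
emb = time_delay_embedding(x, dim=2, delay=50)
print(emb.shape)  # (350, 2)

r = np.hypot(emb[:, 0], emb[:, 1])
print("mean radius of embedded points:", round(float(r.mean()), 2))
```

With irregular sampling, the fixed index offsets above no longer correspond to fixed time lags, which is precisely the failure mode the subsequence approach addresses.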
A Divide-and-Conquer Approach to Persistent Homology
2024
Persistent homology is a tool of topological data analysis that has been used in a variety of settings to characterize different dimensional holes in data. However, persistent homology computations can be memory intensive with a computational complexity that does not scale well as the data size becomes large. In this work, we propose a divide-and-conquer (DaC) method to mitigate these issues. The proposed algorithm efficiently finds small, medium, and large-scale holes by partitioning data into sub-regions and uses a Vietoris-Rips filtration. Furthermore, we provide theoretical results that quantify the bottleneck distance between the DaC and true persistence diagrams and the recovery probability of holes in the data. We empirically verify that the observed rate coincides with our theoretical rate, and find that the memory and computational complexity of DaC outperforms an alternative method that relies on a clustering preprocessing step to reduce the memory and computational complexity of the persistent homology computations. Finally, we test our algorithm using spatial data of the locations of lakes in Wisconsin, where classical persistent homology is computationally infeasible.
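The cost being mitigated is visible even in the simplest case: 0-dimensional persistence of a Vietoris-Rips filtration requires all pairwise distances, so memory and time grow quadratically. The sketch below computes H0 persistence exactly via union-find (deaths are single-linkage merge distances) on a tiny invented point set; it is the baseline computation whose scaling motivates partitioning, not the paper's DaC algorithm.

```python
import numpy as np

def h0_persistence(points):
    """0-dimensional persistence of a Vietoris-Rips filtration: every
    component is born at 0, and each merge of two components kills one,
    so the death times are the single-linkage merge distances."""
    n = len(points)
    d = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    edges = sorted((d[i, j], i, j) for i in range(n) for j in range(i + 1, n))

    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    deaths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(dist)  # one component dies at this scale
    return deaths  # n - 1 finite deaths; one component lives forever

# Two tight pairs far apart: two short bars, then one long bridge.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]])
deaths = h0_persistence(pts)
print(deaths)
```

A divide-and-conquer scheme in this spirit would run such computations on sub-regions and then reconcile features that cross region boundaries, which is where the paper's bottleneck-distance guarantees come in.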