Catalogue Search | MBRL
21 result(s) for "Krometis, Justin"
Uncertainty Quantification in Data Fusion Classifier for Ship-Wake Detection
2024
Using deep learning model predictions requires understanding not only the model's confidence but also its uncertainty, so we know when to trust the prediction or require support from a human. In this study, we used Monte Carlo dropout (MCDO) to characterize the uncertainty of deep learning image classification algorithms, including feature fusion models, on simulated synthetic aperture radar (SAR) images of persistent ship wakes. Compared to a baseline, we used the distribution of predictions from dropout with simple mean-value ensembling and the Kolmogorov-Smirnov (KS) test to classify in-domain and out-of-domain (OOD) test samples, created by rotating images to angles not present in the training data. Our objective was to improve classification robustness and identify OOD images at test time. Mean-value ensembling did not improve performance over the baseline: there was a –1.05% difference in the Matthews correlation coefficient (MCC) from the baseline model, averaged across all SAR bands. The KS test, by contrast, saw an improvement of +12.5% in MCC and was able to identify the majority of OOD samples. Leveraging the full distribution of predictions improved classification robustness and allowed labeling test images as OOD. The feature fusion models, however, did not improve performance over the single-SAR-band models, demonstrating that it is best to rely on the highest-quality data source available (in our case, C-band).
Journal Article
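A minimal sketch of the KS-test idea described in the abstract: many stochastic forward passes per image yield a distribution of scores, and the two-sample KS test compares that distribution against an in-domain reference instead of collapsing it to a mean. The synthetic Gaussian score distributions below stand in for real Monte Carlo dropout outputs; the threshold and all numbers are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Stand-in for Monte Carlo dropout: repeated stochastic forward passes per
# image produce a distribution of class scores, not a single point estimate.
in_domain_scores = rng.normal(loc=0.9, scale=0.03, size=100)  # confident, tight
ood_scores       = rng.normal(loc=0.6, scale=0.15, size=100)  # shifted, diffuse

# Reference distribution from a held-out in-domain calibration image.
reference = rng.normal(loc=0.9, scale=0.03, size=100)

# Mean-value ensembling collapses each distribution to one number...
mean_in, mean_ood = in_domain_scores.mean(), ood_scores.mean()

# ...while the two-sample KS test uses the full distribution to flag OOD.
def is_ood(scores, reference, alpha=1e-3):
    stat, p = ks_2samp(scores, reference)
    return p < alpha  # reject "same distribution" -> flag as out-of-domain

print(is_ood(in_domain_scores, reference), is_ood(ood_scores, reference))
```

With these synthetic inputs, the rotated-style (shifted, diffuse) distribution is rejected while the in-domain one is not, mirroring the abstract's finding that the full prediction distribution carries information the mean discards.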
Improving Deep Learning for Maritime Remote Sensing through Data Augmentation and Latent Space
by Freeman, Laura; Sobien, Daniel; Krometis, Justin
in Approximation; Computer aided design; Computer simulation
2022
Training deep learning models requires having the right data for the problem and understanding both your data and the models’ performance on that data. Training deep learning models is difficult when data are limited, so in this paper, we seek to answer the following question: how can we train a deep learning model to increase its performance on a targeted area with limited data? We do this by applying rotation data augmentations to a simulated synthetic aperture radar (SAR) image dataset. We use the Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction technique to understand the effects of augmentations on the data in latent space. Using this latent space representation, we can understand the data and choose specific training samples aimed at boosting model performance in targeted under-performing regions without the need to increase training set sizes. Results show that using latent space to choose training data significantly improves model performance in some cases; however, there are other cases where no improvements are made. We show that linking patterns in latent space is a possible predictor of model performance, but results require some experimentation and domain knowledge to determine the best options.
Journal Article
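A dependency-light sketch of the pipeline the abstract describes: generate rotation augmentations, then embed the augmented set in a low-dimensional latent space to inspect how the augmentations distribute. The paper uses UMAP; here a linear PCA via numpy's SVD stands in, and the 8x8 "wake" chips are a toy assumption, not SAR data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for SAR image chips: 8x8 arrays with a bright vertical "wake".
def make_chip():
    chip = rng.normal(0.0, 0.1, size=(8, 8))
    chip[:, 3] += 1.0
    return chip

base = [make_chip() for _ in range(20)]

# Rotation augmentation: np.rot90 generates orientations absent from the
# original set (the paper uses finer-grained rotation angles).
augmented = [np.rot90(c, k) for c in base for k in range(4)]

# Linear latent space via PCA (SVD) as a stand-in for UMAP.
X = np.stack([c.ravel() for c in augmented])
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
latent = Xc @ Vt[:2].T  # 2-D embedding of every augmented chip

# Rotated copies of a scene separate by orientation in latent space;
# inspecting such clusters guides which samples to add to training.
print(latent.shape)
```

In the paper's workflow, under-performing regions of the embedding would then be targeted with additional training samples rather than blindly enlarging the training set.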
Parallel MCMC algorithms: theoretical foundations, algorithm design, case studies
by Krometis, Justin A; Mondaini, Cecilia F; Holbrook, Andrew J
in Algorithms; Case studies; Partial differential equations
2024
Parallel Markov Chain Monte Carlo (pMCMC) algorithms generate clouds of proposals at each step to efficiently resolve a target probability distribution μ. We build a rigorous foundational framework for pMCMC algorithms that situates these methods within a unified ‘extended phase space’ measure-theoretic formalism. Drawing on our recent work that provides a comprehensive theory for reversible single-proposal methods, we herein derive general criteria for multiproposal acceptance mechanisms that yield ergodic chains on general state spaces. Our formulation encompasses a variety of methodologies, including proposal cloud resampling and Hamiltonian methods, while providing a basis for the derivation of novel algorithms. In particular, we obtain a top-down picture for a class of methods arising from ‘conditionally independent’ proposal structures. As an immediate application of this formalism, we identify several new algorithms including a multiproposal version of the popular preconditioned Crank–Nicolson (pCN) sampler suitable for high- and infinite-dimensional target measures that are absolutely continuous with respect to a Gaussian base measure. To supplement the aforementioned theoretical results, we carry out a selection of numerical case studies that evaluate the efficacy of these novel algorithms. First, noting that the true potential of pMCMC algorithms arises from their natural parallelizability and the ease with which they map to modern high-performance computing architectures, we provide a limited parallelization study using TensorFlow and a graphics processing unit to scale pMCMC algorithms that leverage as many as 100k proposals at each step. Second, we use our multiproposal pCN algorithm (mpCN) to resolve a selection of problems in Bayesian statistical inversion for partial differential equations motivated by fluid measurement. These examples provide preliminary evidence of the efficacy of mpCN for high-dimensional target distributions featuring complex geometries and multimodal structures.
Journal Article
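For orientation, a minimal numpy sketch of the classical single-proposal pCN sampler that the paper's mpCN generalizes to proposal clouds. The pCN proposal v = √(1−β²)u + βξ leaves the Gaussian base measure invariant, so the acceptance ratio involves only the negative log-likelihood Φ. The toy Φ, dimension, and step size below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Target: posterior density proportional to exp(-Phi(u)) * N(0, I) in d dims.
d = 5
def phi(u):  # toy negative log-likelihood (an assumption for illustration)
    return 0.5 * np.sum((u - 1.0) ** 2)

beta, n_steps = 0.3, 20000
u = np.zeros(d)
samples, accepts = [], 0
for _ in range(n_steps):
    xi = rng.standard_normal(d)
    v = np.sqrt(1.0 - beta**2) * u + beta * xi  # pCN proposal
    # Acceptance involves only Phi, not the Gaussian prior: this is what
    # makes pCN well defined in the infinite-dimensional setting.
    if np.log(rng.random()) < phi(u) - phi(v):
        u, accepts = v, accepts + 1
    samples.append(u.copy())

post_mean = np.mean(samples[2000:], axis=0)  # burn-in discarded
print(accepts / n_steps, post_mean.round(2))
```

For this conjugate toy problem the exact posterior is N(0.5, 0.5 I), so the empirical mean should land near 0.5 in every coordinate; mpCN replaces the single draw of ξ with a cloud of proposals weighted within one acceptance step.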
Approximating Community Water System Service Areas to Explore the Demographics of SDWA Compliance in Virginia
by Marcillo, Cristina; Krometis, Justin; Krometis, Leigh-Anne
in Datasets; Demographics; Demography
2021
Although the United States Safe Drinking Water Act (SDWA) theoretically ensures drinking water quality, recent studies have questioned the reliability and equity associated with community water system (CWS) service. This study aimed to identify SDWA violation differences (i.e., monitoring and reporting (MR) and health-based (HB)) between Virginia CWSs given associated service demographics, rurality, and system characteristics. A novel geospatial methodology delineated CWS service areas at the zip code scale to connect 2000 US Census demographics with 2006–2016 SDWA violations, with significant associations determined via negative binomial regression. The proportion of Black Americans within a service area was positively associated with the likelihood of HB violations. This effort supports the need for further investigation of racial and socioeconomic disparities in access to safe drinking water within the United States in particular and offers a geospatial strategy to explore demographics in other settings where data on infrastructure extents are limited. Further interdisciplinary efforts at multiple scales are necessary to identify the entwined causes for differential risks in adverse drinking water quality exposures and would be substantially strengthened by the mapping of official CWS service boundaries.
Journal Article
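A sketch of the study's statistical step: negative binomial regression of violation counts on a demographic covariate, fit by maximum likelihood. The data are simulated here (not the study's), and the NB2 parameterization and optimizer choice are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

rng = np.random.default_rng(3)

# Simulate overdispersed violation counts per water system as a function of
# one covariate (e.g., proportion of a demographic group in the service area).
n = 400
x = rng.uniform(0.0, 1.0, n)
true_b0, true_b1, alpha = 0.2, 0.8, 0.5      # alpha = overdispersion
mu = np.exp(true_b0 + true_b1 * x)
p = 1.0 / (1.0 + alpha * mu)                 # NB2 parameterization
y = rng.negative_binomial(1.0 / alpha, p)

def negloglik(params):
    b0, b1, log_a = params
    a = np.exp(log_a)
    m = np.exp(b0 + b1 * x)
    r = 1.0 / a
    return -np.sum(gammaln(y + r) - gammaln(r) - gammaln(y + 1)
                   + r * np.log(r / (r + m)) + y * np.log(m / (r + m)))

fit = minimize(negloglik, x0=np.zeros(3), method="Nelder-Mead",
               options={"maxiter": 5000})
b0_hat, b1_hat, _ = fit.x
print(round(b1_hat, 2))  # positive slope: counts rise with the covariate
```

A significantly positive fitted slope is the analogue of the paper's positive association between the covariate and health-based violations; a production analysis would add standard errors and additional covariates (rurality, system characteristics).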
ON BAYESIAN CONSISTENCY FOR FLOWS OBSERVED THROUGH A PASSIVE SCALAR
by Krometis, Justin; Borggaard, Jeff; Glatt-Holtz, Nathan
in Background noise; Bayesian analysis; Computational fluid dynamics
2020
We consider the statistical inverse problem of estimating a background fluid flow field v from the partial, noisy observations of the concentration θ of a substance passively advected by the fluid, so that θ is governed by the partial differential equation
∂θ/∂t (t, x) = −v(x) · ∇θ(t, x) + κΔθ(t, x),  θ(0, x) = θ₀(x)
for t ∈ [0,T], T > 0 and x ∈ 𝕋² = [0, 1]². The initial condition θ₀ and diffusion coefficient κ are assumed to be known and the data consist of point observations of the scalar field θ corrupted by additive, i.i.d. Gaussian noise. We adopt a Bayesian approach to this estimation problem and establish that the inference is consistent, that is, that the posterior measure identifies the true background flow as the number of scalar observations grows large. Since the inverse map is ill-defined for some classes of problems even for perfect, infinite measurements of θ, multiple experiments (initial conditions) are required to resolve the true fluid flow. Under this assumption, suitable conditions on the observation points, and given support and tail conditions on the prior measure, we show that the posterior measure converges to a Dirac measure centered on the true flow as the number of observations goes to infinity.
Journal Article
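The forward map in this inverse problem is the advection-diffusion equation above. A minimal explicit finite-difference solve on the periodic torus [0,1]² is sketched below; the constant flow v, grid size, κ, and initial blob are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

# Explicit finite-difference forward solve of the passive-scalar equation
# on the periodic torus [0,1]^2 (np.roll implements periodicity).
N, kappa, dt, steps = 64, 1e-3, 5e-5, 400
h = 1.0 / N
xs = np.arange(N) * h
X, Y = np.meshgrid(xs, xs, indexing="ij")

theta = np.exp(-50.0 * ((X - 0.5) ** 2 + (Y - 0.5) ** 2))  # theta_0: Gaussian blob
vx, vy = 0.5, 0.2                                          # constant background flow v

def ddx(f): return (np.roll(f, -1, 0) - np.roll(f, 1, 0)) / (2 * h)
def ddy(f): return (np.roll(f, -1, 1) - np.roll(f, 1, 1)) / (2 * h)
def lap(f): return (np.roll(f, -1, 0) + np.roll(f, 1, 0)
                    + np.roll(f, -1, 1) + np.roll(f, 1, 1) - 4 * f) / h**2

mass0 = theta.sum()
for _ in range(steps):
    theta = theta + dt * (-vx * ddx(theta) - vy * ddy(theta) + kappa * lap(theta))

# Advection-diffusion conserves total scalar mass on the torus.
print(abs(theta.sum() - mass0) < 1e-6)
```

Point observations of θ produced by such a solve, corrupted by Gaussian noise, are the data from which the Bayesian posterior over the flow v is built; the consistency result says that posterior concentrates on the true v as observations accumulate.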
A Comparison of Bayesian Methods for Integrated Test and Evaluation
2025
Testing defense systems in operationally realistic scenarios is typically logistically difficult and expensive. For this reason, Bayesian methods have gained significant interest in recent years as a means of shifting testing “left” in the acquisition lifecycle—that is, integrating information from earlier phases of test to reach conclusions about system performance more quickly and to better infer operational performance when data from such scenarios is limited. Bayesian inference mathematically quantifies assumptions in the form of selecting prior distributions on the unknown parameters and strategies for integrating data collected under different conditions. In this article, we compare several Bayesian approaches for integrated test and evaluation, using the example of estimating the reliability of the Stryker family of vehicles from developmental and operational test data. We compute posterior reliability estimates for each method and conduct a sensitivity analysis to measure how each assumption influences the results. Altogether, the analysis not only shows the promise of Bayesian integration of information, but also the importance of careful and justifiable assumptions to ensure defensible results.
Journal Article
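A conjugate Beta-Binomial sketch of the "shift left" idea: developmental test (DT) data, down-weighted because its conditions differ from operational test (OT), forms the prior for OT reliability. The counts and the down-weighting factor are illustrative assumptions, not the Stryker data.

```python
from scipy import stats

dt_trials, dt_successes = 120, 108   # developmental test results (illustrative)
ot_trials, ot_successes = 20, 17     # limited operational test results

# Down-weight DT pseudo-counts before using them as a Beta prior for OT.
w = 0.5
a0 = 1 + w * dt_successes
b0 = 1 + w * (dt_trials - dt_successes)

post = stats.beta(a0 + ot_successes, b0 + (ot_trials - ot_successes))
flat = stats.beta(1 + ot_successes, 1 + (ot_trials - ot_successes))  # OT only

# Borrowing DT information narrows the posterior credible interval.
lo_i, hi_i = post.interval(0.9)
lo_f, hi_f = flat.interval(0.9)
print(round(post.mean(), 3), round(hi_i - lo_i, 3), round(hi_f - lo_f, 3))
```

The sensitivity analysis in the article corresponds to varying assumptions like w here and checking how much the posterior reliability estimate moves.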
A Framework for Using Priors in a Continuum of Testing
2024
A strength of the Bayesian paradigm is that it can leverage all available information, including subject matter expert (SME) opinion and previous (possibly dissimilar) data, through prior probabilities (priors). This article develops a framework for thinking about how differently characterized priors can be appropriately used throughout the continuum of testing. Beyond the application of various priors, the evolution of priors also contributes greatly to analytical understanding and will be addressed, considering cases such as when a system’s state significantly changes (e.g., it is modified) between phases of testing. The evolution of priors can start with priors that attempt to provide no information and move toward priors that capture the (newly) available information. This article further discusses priors based on institutional knowledge as well as those based on previous testing data; the focus will be on previous, in some ways dissimilar, data relative to a current test event. A discussion of which priors might be more common in various phases of testing, the types of information that can be used in priors, and how priors evolve as information accumulates is also included. Finally, a real-world example using the Stryker family of vehicles demonstrates how priors can be employed in a continuum-of-testing construct.
Journal Article
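A sketch of prior evolution across phases of testing: start from a noninformative prior, update it phase by phase, and discount the accumulated information when the system is modified between phases. The phase data and the discount factor are illustrative assumptions, not the article's example.

```python
from scipy import stats

a, b = 1.0, 1.0                          # noninformative Beta(1,1) start
phases = [(30, 24), (40, 35), (25, 23)]  # (trials, successes) per phase
modified_before_phase = [False, True, False]

for (n, s), modified in zip(phases, modified_before_phase):
    if modified:
        # System changed between phases: shrink accumulated pseudo-counts
        # back toward Beta(1,1), keeping only part of the old information.
        a, b = 1 + 0.3 * (a - 1), 1 + 0.3 * (b - 1)
    a, b = a + s, b + (n - s)            # conjugate Beta-Binomial update

posterior = stats.beta(a, b)
print(round(posterior.mean(), 3))
```

The discount step is one concrete way to encode "previous, in some ways dissimilar, data": the older evidence still informs the prior, but with reduced weight after the modification.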
A Statistical Framework for Domain Shape Estimation in Stokes Flows
by Krometis, Justin A; Glatt-Holtz, Nathan E; Borggaard, Jeff
in Algorithms; Bayesian analysis; Domains
2022
We develop and implement a Bayesian approach for the estimation of the shape of a two-dimensional annular domain enclosing a Stokes flow from sparse and noisy observations of the enclosed fluid. Our setup includes the case of direct observations of the flow field as well as the measurement of concentrations of a solute passively advected by and diffusing within the flow. Adopting a statistical approach provides estimates of uncertainty in the shape due both to the non-invertibility of the forward map and to error in the measurements. When the shape represents a design problem of attempting to match desired target outcomes, this "uncertainty" can be interpreted as identifying remaining degrees of freedom available to the designer. We demonstrate the viability of our framework on three concrete test problems. These problems illustrate the promise of our framework for applications while providing a collection of test cases for recently developed Markov Chain Monte Carlo (MCMC) algorithms designed to resolve infinite-dimensional statistical quantities.
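One common way to treat a domain's shape as a statistical unknown, sketched here under illustrative assumptions (a Fourier parameterization of the boundary radius with frequency-decaying Gaussian amplitudes; the paper's actual parameterization may differ): draw candidate boundary shapes from the prior.

```python
import numpy as np

rng = np.random.default_rng(4)

# Parameterize the annulus's outer boundary radius r(phi) by a few Fourier
# modes; amplitudes decay with frequency so smoother shapes are more likely.
phi = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
K = 6  # number of Fourier modes retained

def sample_boundary():
    r = np.full_like(phi, 1.0)  # mean shape: circle of radius 1
    for k in range(1, K + 1):
        decay = 0.1 / k**2
        r += decay * (rng.standard_normal() * np.cos(k * phi)
                      + rng.standard_normal() * np.sin(k * phi))
    return r

shapes = np.stack([sample_boundary() for _ in range(50)])
print(shapes.shape)
```

In the paper's setting, an MCMC sampler would weight such draws by how well the Stokes flow each shape induces matches the observations; here only the prior is sampled.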
Gaussian Process Assisted Meta-learning for Image Classification and Object Detection Models
by Krometis, Justin A; Flowers, Anna R; Gramacy, Robert B
in Data acquisition; Gaussian process; Image classification
2025
Collecting operationally realistic data to inform machine learning models can be costly. Before collecting new data, it is helpful to understand where a model is deficient. For example, object detectors trained on images of rare objects may not be good at identification in poorly represented conditions. We offer a way of informing subsequent data acquisition to maximize model performance by leveraging the toolkit of computer experiments and metadata describing the circumstances under which the training data was collected (e.g., season, time of day, location). We do this by evaluating the learner as the training data is varied according to its metadata. A Gaussian process (GP) surrogate fit to that response surface can inform new data acquisitions. This meta-learning approach offers improvements to learner performance as compared to data with randomly selected metadata, which we illustrate on both classic learning examples, and on a motivating application involving the collection of aerial images in search of airplanes.
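A dependency-light sketch of the meta-learning loop: fit a Gaussian process surrogate (RBF kernel, numpy only) to model performance as a function of one metadata variable, then propose the next data collection where predicted performance is worst. The performance curve, kernel length scale, and metadata variable are synthetic assumptions, not the paper's application.

```python
import numpy as np

rng = np.random.default_rng(5)

def perf(t):  # hidden "learner performance vs. metadata" curve (synthetic)
    return 0.8 - 0.3 * np.exp(-((t - 0.7) ** 2) / 0.02)

# Performance measured at a handful of metadata settings (e.g., time of day).
t_obs = np.array([0.05, 0.25, 0.5, 0.75, 0.95])
y_obs = perf(t_obs) + rng.normal(0.0, 0.01, t_obs.size)

def rbf(a, b, ell=0.15):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

# GP posterior mean over a grid of candidate metadata settings.
K = rbf(t_obs, t_obs) + 1e-4 * np.eye(t_obs.size)
t_grid = np.linspace(0.0, 1.0, 101)
mean = rbf(t_grid, t_obs) @ np.linalg.solve(K, y_obs - y_obs.mean()) + y_obs.mean()

# Acquire new training data where the surrogate predicts the learner is weakest.
t_next = t_grid[np.argmin(mean)]
print(round(t_next, 2))
```

The surrogate steers acquisition toward the performance dip rather than sampling metadata at random, which is the comparison the abstract reports improvements against.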
Sacred and Profane: from the Involutive Theory of MCMC to Helpful Hamiltonian Hacks
2024
In the first edition of this Handbook, two remarkable chapters consider seemingly distinct yet deeply connected subjects ...