Catalogue Search | MBRL
21 result(s) for "Krometis, Justin"
Uncertainty Quantification in Data Fusion Classifier for Ship-Wake Detection
2024
Using deep learning model predictions requires understanding not only the model's confidence but also its uncertainty, so we know when to trust the prediction or require support from a human. In this study, we used Monte Carlo dropout (MCDO) to characterize the uncertainty of deep learning image classification algorithms, including feature fusion models, on simulated synthetic aperture radar (SAR) images of persistent ship wakes. Compared to a baseline, we used the distribution of predictions from dropout with simple mean-value ensembling and the Kolmogorov-Smirnov (KS) test to classify in-domain and out-of-domain (OOD) test samples, created by rotating images to angles not present in the training data. Our objective was to improve classification robustness and identify OOD images at test time. Mean-value ensembling did not improve performance over the baseline: there was a –1.05% difference in the Matthews correlation coefficient (MCC) from the baseline model, averaged across all SAR bands. The KS test, by contrast, saw an improvement of +12.5% in MCC and was able to identify the majority of OOD samples. Leveraging the full distribution of predictions improved classification robustness and allowed labeling test images as OOD. The feature fusion models, however, did not improve performance over the single-SAR-band models, demonstrating that it is best to rely on the highest-quality data source available (in our case, C-band).
Journal Article
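A minimal sketch of the KS-test idea described in the abstract: many stochastic forward passes per image yield a distribution of scores, and the two-sample KS test compares that distribution against an in-domain reference instead of collapsing it to a mean. The synthetic Gaussian score distributions below stand in for real Monte Carlo dropout outputs; the threshold and all numbers are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Stand-in for Monte Carlo dropout: repeated stochastic forward passes per
# image produce a distribution of class scores, not a single point estimate.
in_domain_scores = rng.normal(loc=0.9, scale=0.03, size=100)  # confident, tight
ood_scores       = rng.normal(loc=0.6, scale=0.15, size=100)  # shifted, diffuse

# Reference distribution from a held-out in-domain calibration image.
reference = rng.normal(loc=0.9, scale=0.03, size=100)

# Mean-value ensembling collapses each distribution to one number...
mean_in, mean_ood = in_domain_scores.mean(), ood_scores.mean()

# ...while the two-sample KS test uses the full distribution to flag OOD.
def is_ood(scores, reference, alpha=1e-3):
    stat, p = ks_2samp(scores, reference)
    return p < alpha  # reject "same distribution" -> flag as out-of-domain

print(is_ood(in_domain_scores, reference), is_ood(ood_scores, reference))
```

With these synthetic inputs, the rotated-style (shifted, diffuse) distribution is rejected while the in-domain one is not, mirroring the abstract's finding that the full prediction distribution carries information the mean discards.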
Improving Deep Learning for Maritime Remote Sensing through Data Augmentation and Latent Space
by Freeman, Laura; Sobien, Daniel; Krometis, Justin
in Approximation; Computer aided design; Computer simulation
2022
Training deep learning models requires having the right data for the problem and understanding both your data and the models’ performance on that data. Training deep learning models is difficult when data are limited, so in this paper, we seek to answer the following question: how can we train a deep learning model to increase its performance on a targeted area with limited data? We do this by applying rotation data augmentations to a simulated synthetic aperture radar (SAR) image dataset. We use the Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction technique to understand the effects of augmentations on the data in latent space. Using this latent space representation, we can understand the data and choose specific training samples aimed at boosting model performance in targeted under-performing regions without the need to increase training set sizes. Results show that using latent space to choose training data significantly improves model performance in some cases; however, there are other cases where no improvements are made. We show that linking patterns in latent space is a possible predictor of model performance, but results require some experimentation and domain knowledge to determine the best options.
Journal Article
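A dependency-light sketch of the pipeline the abstract describes: generate rotation augmentations, then embed the augmented set in a low-dimensional latent space to inspect how the augmentations distribute. The paper uses UMAP; here a linear PCA via numpy's SVD stands in, and the 8x8 "wake" chips are a toy assumption, not SAR data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for SAR image chips: 8x8 arrays with a bright vertical "wake".
def make_chip():
    chip = rng.normal(0.0, 0.1, size=(8, 8))
    chip[:, 3] += 1.0
    return chip

base = [make_chip() for _ in range(20)]

# Rotation augmentation: np.rot90 generates orientations absent from the
# original set (the paper uses finer-grained rotation angles).
augmented = [np.rot90(c, k) for c in base for k in range(4)]

# Linear latent space via PCA (SVD) as a stand-in for UMAP.
X = np.stack([c.ravel() for c in augmented])
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
latent = Xc @ Vt[:2].T  # 2-D embedding of every augmented chip

# Rotated copies of a scene separate by orientation in latent space;
# inspecting such clusters guides which samples to add to training.
print(latent.shape)
```

In the paper's workflow, under-performing regions of the embedding would then be targeted with additional training samples rather than blindly enlarging the training set.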
Parallel MCMC algorithms: theoretical foundations, algorithm design, case studies
by Krometis, Justin A; Mondaini, Cecilia F; Holbrook, Andrew J
in Algorithms; Case studies; Partial differential equations
2024
Parallel Markov Chain Monte Carlo (pMCMC) algorithms generate clouds of proposals at each step to efficiently resolve a target probability distribution μ. We build a rigorous foundational framework for pMCMC algorithms that situates these methods within a unified ‘extended phase space’ measure-theoretic formalism. Drawing on our recent work that provides a comprehensive theory for reversible single-proposal methods, we herein derive general criteria for multiproposal acceptance mechanisms that yield ergodic chains on general state spaces. Our formulation encompasses a variety of methodologies, including proposal cloud resampling and Hamiltonian methods, while providing a basis for the derivation of novel algorithms. In particular, we obtain a top-down picture for a class of methods arising from ‘conditionally independent’ proposal structures. As an immediate application of this formalism, we identify several new algorithms including a multiproposal version of the popular preconditioned Crank–Nicolson (pCN) sampler suitable for high- and infinite-dimensional target measures that are absolutely continuous with respect to a Gaussian base measure. To supplement the aforementioned theoretical results, we carry out a selection of numerical case studies that evaluate the efficacy of these novel algorithms. First, noting that the true potential of pMCMC algorithms arises from their natural parallelizability and the ease with which they map to modern high-performance computing architectures, we provide a limited parallelization study using TensorFlow and a graphics processing unit to scale pMCMC algorithms that leverage as many as 100k proposals at each step. Second, we use our multiproposal pCN algorithm (mpCN) to resolve a selection of problems in Bayesian statistical inversion for partial differential equations motivated by fluid measurement. These examples provide preliminary evidence of the efficacy of mpCN for high-dimensional target distributions featuring complex geometries and multimodal structures.
Journal Article
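For orientation, a minimal numpy sketch of the classical single-proposal pCN sampler that the paper's mpCN generalizes to proposal clouds. The pCN proposal v = √(1−β²)u + βξ leaves the Gaussian base measure invariant, so the acceptance ratio involves only the negative log-likelihood Φ. The toy Φ, dimension, and step size below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Target: posterior density proportional to exp(-Phi(u)) * N(0, I) in d dims.
d = 5
def phi(u):  # toy negative log-likelihood (an assumption for illustration)
    return 0.5 * np.sum((u - 1.0) ** 2)

beta, n_steps = 0.3, 20000
u = np.zeros(d)
samples, accepts = [], 0
for _ in range(n_steps):
    xi = rng.standard_normal(d)
    v = np.sqrt(1.0 - beta**2) * u + beta * xi  # pCN proposal
    # Acceptance involves only Phi, not the Gaussian prior: this is what
    # makes pCN well defined in the infinite-dimensional setting.
    if np.log(rng.random()) < phi(u) - phi(v):
        u, accepts = v, accepts + 1
    samples.append(u.copy())

post_mean = np.mean(samples[2000:], axis=0)  # burn-in discarded
print(accepts / n_steps, post_mean.round(2))
```

For this conjugate toy problem the exact posterior is N(0.5, 0.5 I), so the empirical mean should land near 0.5 in every coordinate; mpCN replaces the single draw of ξ with a cloud of proposals weighted within one acceptance step.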
Approximating Community Water System Service Areas to Explore the Demographics of SDWA Compliance in Virginia
by Marcillo, Cristina; Krometis, Justin; Krometis, Leigh-Anne
in Datasets; Demographics; Demography
2021
Although the United States Safe Drinking Water Act (SDWA) theoretically ensures drinking water quality, recent studies have questioned the reliability and equity associated with community water system (CWS) service. This study aimed to identify SDWA violation differences (i.e., monitoring and reporting (MR) and health-based (HB)) between Virginia CWSs given associated service demographics, rurality, and system characteristics. A novel geospatial methodology delineated CWS service areas at the zip code scale to connect 2000 US Census demographics with 2006–2016 SDWA violations, with significant associations determined via negative binomial regression. The proportion of Black Americans within a service area was positively associated with the likelihood of HB violations. This effort supports the need for further investigation of racial and socioeconomic disparities in access to safe drinking water within the United States in particular and offers a geospatial strategy to explore demographics in other settings where data on infrastructure extents are limited. Further interdisciplinary efforts at multiple scales are necessary to identify the entwined causes for differential risks in adverse drinking water quality exposures and would be substantially strengthened by the mapping of official CWS service boundaries.
Journal Article
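A sketch of the study's statistical step: negative binomial regression of violation counts on a demographic covariate, fit by maximum likelihood. The data are simulated here (not the study's), and the NB2 parameterization and optimizer choice are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

rng = np.random.default_rng(3)

# Simulate overdispersed violation counts per water system as a function of
# one covariate (e.g., proportion of a demographic group in the service area).
n = 400
x = rng.uniform(0.0, 1.0, n)
true_b0, true_b1, alpha = 0.2, 0.8, 0.5      # alpha = overdispersion
mu = np.exp(true_b0 + true_b1 * x)
p = 1.0 / (1.0 + alpha * mu)                 # NB2 parameterization
y = rng.negative_binomial(1.0 / alpha, p)

def negloglik(params):
    b0, b1, log_a = params
    a = np.exp(log_a)
    m = np.exp(b0 + b1 * x)
    r = 1.0 / a
    return -np.sum(gammaln(y + r) - gammaln(r) - gammaln(y + 1)
                   + r * np.log(r / (r + m)) + y * np.log(m / (r + m)))

fit = minimize(negloglik, x0=np.zeros(3), method="Nelder-Mead",
               options={"maxiter": 5000})
b0_hat, b1_hat, _ = fit.x
print(round(b1_hat, 2))  # positive slope: counts rise with the covariate
```

A significantly positive fitted slope is the analogue of the paper's positive association between the covariate and health-based violations; a production analysis would add standard errors and additional covariates (rurality, system characteristics).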
ON BAYESIAN CONSISTENCY FOR FLOWS OBSERVED THROUGH A PASSIVE SCALAR
by Krometis, Justin; Borggaard, Jeff; Glatt-Holtz, Nathan
in Background noise; Bayesian analysis; Computational fluid dynamics
2020
We consider the statistical inverse problem of estimating a background fluid flow field v from the partial, noisy observations of the concentration θ of a substance passively advected by the fluid, so that θ is governed by the partial differential equation
∂θ/∂t (t, x) = −v(x) · ∇θ(t, x) + κΔθ(t, x),  θ(0, x) = θ₀(x)
for t ∈ [0,T], T > 0 and x ∈ 𝕋² = [0, 1]². The initial condition θ₀ and diffusion coefficient κ are assumed to be known and the data consist of point observations of the scalar field θ corrupted by additive, i.i.d. Gaussian noise. We adopt a Bayesian approach to this estimation problem and establish that the inference is consistent, that is, that the posterior measure identifies the true background flow as the number of scalar observations grows large. Since the inverse map is ill-defined for some classes of problems even for perfect, infinite measurements of θ, multiple experiments (initial conditions) are required to resolve the true fluid flow. Under this assumption, suitable conditions on the observation points, and given support and tail conditions on the prior measure, we show that the posterior measure converges to a Dirac measure centered on the true flow as the number of observations goes to infinity.
Journal Article
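The forward map in this inverse problem is the advection-diffusion equation above. A minimal explicit finite-difference solve on the periodic torus [0,1]² is sketched below; the constant flow v, grid size, κ, and initial blob are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

# Explicit finite-difference forward solve of the passive-scalar equation
# on the periodic torus [0,1]^2 (np.roll implements periodicity).
N, kappa, dt, steps = 64, 1e-3, 5e-5, 400
h = 1.0 / N
xs = np.arange(N) * h
X, Y = np.meshgrid(xs, xs, indexing="ij")

theta = np.exp(-50.0 * ((X - 0.5) ** 2 + (Y - 0.5) ** 2))  # theta_0: Gaussian blob
vx, vy = 0.5, 0.2                                          # constant background flow v

def ddx(f): return (np.roll(f, -1, 0) - np.roll(f, 1, 0)) / (2 * h)
def ddy(f): return (np.roll(f, -1, 1) - np.roll(f, 1, 1)) / (2 * h)
def lap(f): return (np.roll(f, -1, 0) + np.roll(f, 1, 0)
                    + np.roll(f, -1, 1) + np.roll(f, 1, 1) - 4 * f) / h**2

mass0 = theta.sum()
for _ in range(steps):
    theta = theta + dt * (-vx * ddx(theta) - vy * ddy(theta) + kappa * lap(theta))

# Advection-diffusion conserves total scalar mass on the torus.
print(abs(theta.sum() - mass0) < 1e-6)
```

Point observations of θ produced by such a solve, corrupted by Gaussian noise, are the data from which the Bayesian posterior over the flow v is built; the consistency result says that posterior concentrates on the true v as observations accumulate.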
A Comparison of Bayesian Methods for Integrated Test and Evaluation
2025
Testing defense systems in operationally realistic scenarios is typically logistically difficult and expensive. For this reason, Bayesian methods have gained significant interest in recent years as a means of shifting testing “left” in the acquisition lifecycle—that is, integrating information from earlier phases of test to reach conclusions about system performance more quickly and to better infer operational performance when data from such scenarios is limited. Bayesian inference mathematically quantifies assumptions in the form of selecting prior distributions on the unknown parameters and strategies for integrating data collected under different conditions. In this article, we compare several Bayesian approaches for integrated test and evaluation, using the example of estimating the reliability of the Stryker family of vehicles from developmental and operational test data. We compute posterior reliability estimates for each method and conduct a sensitivity analysis to measure how each assumption influences the results. Altogether, the analysis not only shows the promise of Bayesian integration of information, but also the importance of careful and justifiable assumptions to ensure defensible results.
Journal Article
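A conjugate Beta-Binomial sketch of the "shift left" idea: developmental test (DT) data, down-weighted because its conditions differ from operational test (OT), forms the prior for OT reliability. The counts and the down-weighting factor are illustrative assumptions, not the Stryker data.

```python
from scipy import stats

dt_trials, dt_successes = 120, 108   # developmental test results (illustrative)
ot_trials, ot_successes = 20, 17     # limited operational test results

# Down-weight DT pseudo-counts before using them as a Beta prior for OT.
w = 0.5
a0 = 1 + w * dt_successes
b0 = 1 + w * (dt_trials - dt_successes)

post = stats.beta(a0 + ot_successes, b0 + (ot_trials - ot_successes))
flat = stats.beta(1 + ot_successes, 1 + (ot_trials - ot_successes))  # OT only

# Borrowing DT information narrows the posterior credible interval.
lo_i, hi_i = post.interval(0.9)
lo_f, hi_f = flat.interval(0.9)
print(round(post.mean(), 3), round(hi_i - lo_i, 3), round(hi_f - lo_f, 3))
```

The sensitivity analysis in the article corresponds to varying assumptions like w here and checking how much the posterior reliability estimate moves.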
A Framework for Using Priors in a Continuum of Testing
2024
A strength of the Bayesian paradigm is that it can leverage all available information, including subject matter expert (SME) opinion and previous (possibly dissimilar) data, through prior probabilities (priors). This article develops a framework for thinking about how differently characterized priors can be appropriately used throughout the continuum of testing. Beyond the application of various priors, the evolution of priors also contributes greatly to analytical understanding and will be addressed, considering cases such as when a system’s state significantly changes (e.g., it is modified) between phases of testing. The evolution of priors can start with priors that attempt to provide no information and move toward priors that capture the (newly) available information. This article further discusses priors based on institutional knowledge as well as those based on previous testing data; the focus will be on previous, in some ways dissimilar, data relative to a current test event. A discussion of which priors might be more common in various phases of testing, the types of information that can be used in priors, and how priors evolve as information accumulates is also included. Finally, a real-world example using the Stryker family of vehicles demonstrates how priors can be employed in a continuum-of-testing construct.
Journal Article
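A sketch of prior evolution across phases of testing: start from a noninformative prior, update it phase by phase, and discount the accumulated information when the system is modified between phases. The phase data and the discount factor are illustrative assumptions, not the article's example.

```python
from scipy import stats

a, b = 1.0, 1.0                          # noninformative Beta(1,1) start
phases = [(30, 24), (40, 35), (25, 23)]  # (trials, successes) per phase
modified_before_phase = [False, True, False]

for (n, s), modified in zip(phases, modified_before_phase):
    if modified:
        # System changed between phases: shrink accumulated pseudo-counts
        # back toward Beta(1,1), keeping only part of the old information.
        a, b = 1 + 0.3 * (a - 1), 1 + 0.3 * (b - 1)
    a, b = a + s, b + (n - s)            # conjugate Beta-Binomial update

posterior = stats.beta(a, b)
print(round(posterior.mean(), 3))
```

The discount step is one concrete way to encode "previous, in some ways dissimilar, data": the older evidence still informs the prior, but with reduced weight after the modification.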
A Statistical Framework for Domain Shape Estimation in Stokes Flows
by Krometis, Justin A; Glatt-Holtz, Nathan E; Borggaard, Jeff
in Algorithms; Bayesian analysis; Domains
2022
We develop and implement a Bayesian approach for the estimation of the shape of a two-dimensional annular domain enclosing a Stokes flow from sparse and noisy observations of the enclosed fluid. Our setup includes the case of direct observations of the flow field as well as the measurement of concentrations of a solute passively advected by and diffusing within the flow. Adopting a statistical approach provides estimates of uncertainty in the shape due both to the non-invertibility of the forward map and to error in the measurements. When the shape represents a design problem of attempting to match desired target outcomes, this "uncertainty" can be interpreted as identifying remaining degrees of freedom available to the designer. We demonstrate the viability of our framework on three concrete test problems. These problems illustrate the promise of our framework for applications while providing a collection of test cases for recently developed Markov Chain Monte Carlo (MCMC) algorithms designed to resolve infinite-dimensional statistical quantities.
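One common way to treat a domain's shape as a statistical unknown, sketched here under illustrative assumptions (a Fourier parameterization of the boundary radius with frequency-decaying Gaussian amplitudes; the paper's actual parameterization may differ): draw candidate boundary shapes from the prior.

```python
import numpy as np

rng = np.random.default_rng(4)

# Parameterize the annulus's outer boundary radius r(phi) by a few Fourier
# modes; amplitudes decay with frequency so smoother shapes are more likely.
phi = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
K = 6  # number of Fourier modes retained

def sample_boundary():
    r = np.full_like(phi, 1.0)  # mean shape: circle of radius 1
    for k in range(1, K + 1):
        decay = 0.1 / k**2
        r += decay * (rng.standard_normal() * np.cos(k * phi)
                      + rng.standard_normal() * np.sin(k * phi))
    return r

shapes = np.stack([sample_boundary() for _ in range(50)])
print(shapes.shape)
```

In the paper's setting, an MCMC sampler would weight such draws by how well the Stokes flow each shape induces matches the observations; here only the prior is sampled.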
Gaussian Process Assisted Meta-learning for Image Classification and Object Detection Models
by Krometis, Justin A; Flowers, Anna R; Gramacy, Robert B
in Data acquisition; Gaussian process; Image classification
2025
Collecting operationally realistic data to inform machine learning models can be costly. Before collecting new data, it is helpful to understand where a model is deficient. For example, object detectors trained on images of rare objects may not be good at identification in poorly represented conditions. We offer a way of informing subsequent data acquisition to maximize model performance by leveraging the toolkit of computer experiments and metadata describing the circumstances under which the training data was collected (e.g., season, time of day, location). We do this by evaluating the learner as the training data is varied according to its metadata. A Gaussian process (GP) surrogate fit to that response surface can inform new data acquisitions. This meta-learning approach offers improvements to learner performance as compared to data with randomly selected metadata, which we illustrate on both classic learning examples, and on a motivating application involving the collection of aerial images in search of airplanes.
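A dependency-light sketch of the meta-learning loop: fit a Gaussian process surrogate (RBF kernel, numpy only) to model performance as a function of one metadata variable, then propose the next data collection where predicted performance is worst. The performance curve, kernel length scale, and metadata variable are synthetic assumptions, not the paper's application.

```python
import numpy as np

rng = np.random.default_rng(5)

def perf(t):  # hidden "learner performance vs. metadata" curve (synthetic)
    return 0.8 - 0.3 * np.exp(-((t - 0.7) ** 2) / 0.02)

# Performance measured at a handful of metadata settings (e.g., time of day).
t_obs = np.array([0.05, 0.25, 0.5, 0.75, 0.95])
y_obs = perf(t_obs) + rng.normal(0.0, 0.01, t_obs.size)

def rbf(a, b, ell=0.15):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

# GP posterior mean over a grid of candidate metadata settings.
K = rbf(t_obs, t_obs) + 1e-4 * np.eye(t_obs.size)
t_grid = np.linspace(0.0, 1.0, 101)
mean = rbf(t_grid, t_obs) @ np.linalg.solve(K, y_obs - y_obs.mean()) + y_obs.mean()

# Acquire new training data where the surrogate predicts the learner is weakest.
t_next = t_grid[np.argmin(mean)]
print(round(t_next, 2))
```

The surrogate steers acquisition toward the performance dip rather than sampling metadata at random, which is the comparison the abstract reports improvements against.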
Sacred and Profane: from the Involutive Theory of MCMC to Helpful Hamiltonian Hacks
2024
In the first edition of this Handbook, two remarkable chapters consider seemingly distinct yet deeply connected subjects ...