Differentially Private Synthetic Data
by He, Yiyun
in Information science / Mathematics / Mathematics education
2025
Dissertation
Overview
Differentially private synthetic data provide a powerful mechanism to enable data analysis while protecting sensitive information about individuals. In this thesis, we present a highly effective algorithmic approach for generating $\varepsilon$-differentially private synthetic data in a bounded metric space with near-optimal utility guarantees under the 1-Wasserstein distance. In particular, for a dataset $X$ in the hypercube $[0, 1]^d$, our algorithm generates synthetic dataset $Y$ such that the expected 1-Wasserstein distance between the empirical measure of $X$ and $Y$ is $O((\varepsilon n)^{-1/d})$ for $d \ge 2$, and is $O(\log^2(\varepsilon n)(\varepsilon n)^{-1})$ for $d = 1$. The accuracy guarantee is optimal up to a constant factor for $d \ge 2$, and up to a logarithmic factor for $d = 1$. Our algorithm has a fast running time of $O(\varepsilon d n)$ for all $d \ge 1$ and demonstrates improved accuracy compared to former results for $d \ge 2$.

However, when the data lie in a high-dimensional space, the accuracy of the synthetic data suffers from the curse of dimensionality. We further propose a differentially private algorithm to generate low-dimensional synthetic data efficiently from a high-dimensional dataset with a utility guarantee with respect to the 1-Wasserstein distance. A key step of our algorithm is a private principal component analysis (PCA) procedure with a near-optimal accuracy bound that circumvents the curse of dimensionality. Unlike the standard perturbation analysis, our analysis of private PCA works without assuming the spectral gap for the covariance matrix. For the data lying on a $d'$-dimensional linear subspace, we successfully overcome the curse of high dimensionality and improve the accuracy to $O(n^{-1/d'})$.

We also consider the synthetic data generation with differential privacy under the online setting where data is continually released. For a data stream within the hypercube $[0, 1]^d$ and an infinite time horizon, we develop an online algorithm that generates a differentially private synthetic dataset at each time $t$. This algorithm achieves a near-optimal accuracy bound of $O(\log(t)\, t^{-1/d})$ for $d \ge 2$ and $O(\log^{4.5}(t)\, t^{-1})$ for $d = 1$ in the 1-Wasserstein distance. This result extends the previous work on the continual release model for counting queries to Lipschitz queries. Compared to the offline case, where the entire dataset is available at once, our approach requires only an extra polynomially logarithmic factor in the accuracy bound.
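To make the setting in the overview concrete, the following is a minimal sketch for the case d = 1. It is a simple perturbed-histogram baseline, not the algorithm of the thesis, and it is not claimed to reach the rates quoted above. The function names, the number of bins, and the add-or-remove-one-record notion of neighbouring datasets (which gives each bin count sensitivity 1) are all assumptions made for illustration.

```python
import random


def laplace(scale, rng):
    # The difference of two independent Exp(1) variables is a standard
    # Laplace variable; multiplying by `scale` gives Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_histogram_synthetic(data, epsilon, bins, rng):
    """Hypothetical baseline: epsilon-DP synthetic data on [0, 1].

    Assumes neighbouring datasets differ by adding or removing one
    record, so the histogram has L1 sensitivity 1 and Laplace noise of
    scale 1/epsilon per bin suffices. Everything after the noise is
    added is post-processing and does not consume privacy budget.
    """
    counts = [0] * bins
    for x in data:
        counts[min(int(x * bins), bins - 1)] += 1
    # Add noise, then round and clip so counts are valid multiplicities.
    noisy = [max(0, round(c + laplace(1.0 / epsilon, rng))) for c in counts]
    synthetic = []
    for j, c in enumerate(noisy):
        synthetic.extend([(j + 0.5) / bins] * c)  # place points at bin centres
    return synthetic


def wasserstein1_1d(xs, ys):
    """1-Wasserstein distance between two empirical measures on the line.

    Uses the one-dimensional identity W1 = integral of |F_X - F_Y|,
    where F_X and F_Y are the empirical CDFs.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    i = j = 0
    total = 0.0
    for a, b in zip(points, points[1:]):
        while i < len(xs) and xs[i] <= a:
            i += 1
        while j < len(ys) and ys[j] <= a:
            j += 1
        total += abs(i / len(xs) - j / len(ys)) * (b - a)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    X = [rng.random() for _ in range(2000)]
    Y = private_histogram_synthetic(X, epsilon=1.0, bins=32, rng=rng)
    print(len(Y), wasserstein1_1d(X, Y))
```

In this sketch the error has two sources: the discretisation to bin centres, which shrinks as the number of bins grows, and the Laplace noise, whose cumulative effect grows with the number of bins. Balancing the two is the kind of trade-off that the accuracy bounds in the overview quantify.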
Publisher
ProQuest Dissertations & Theses
ISBN
9798288800795