Estimating parameters for probabilistic linkage of privacy-preserved datasets
by Boyd, James H.; Brown, Adrian P.; Ferrante, Anna M.; Randall, Sean M.; Semmens, James B.
in Agreements / Algorithms / Analysis / Computer Security / Data Accuracy / Data collection / Data quality / Datasets as Topic / Disclosure of information / Health Sciences / Humans / Information management / Linkage quality / Medical Record Linkage - methods / Medicine / Medicine & Public Health / Methods / Parameter estimation / Privacy / Probabilistic / Probability / quality / Record linkage / reporting / Reproducibility of Results / Research Article / Research methodology / Software / Statistical Theory and Methods / Statistics for Life Sciences / Theory of Medicine/Bioethics
2017
Journal Article
Overview
Background
Probabilistic record linkage is a process used to bring together person-based records from within the same dataset (de-duplication) or from disparate datasets using pairwise comparisons and matching probabilities. The linkage strategy and associated match probabilities are often estimated through investigations into data quality and manual inspection. However, as privacy-preserved datasets comprise encrypted data, such methods are not possible. In this paper, we present a method for estimating the probabilities and threshold values for probabilistic privacy-preserved record linkage using Bloom filters.
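As a rough illustration of how a single field might be privacy-preserved and still compared, the sketch below encodes a string into a Bloom filter via its character bigrams and compares two filters with the Dice coefficient. The filter length, number of hash functions, keyed SHA-256 hashing, and all function names are assumptions made for this example; the abstract does not state the encoding parameters the authors used.

```python
import hashlib


def bigrams(value):
    """Split a padded, lower-cased string into its set of character bigrams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, size=1000, num_hashes=2, secret="demo-key"):
    """Return the set bit positions of a Bloom filter for one field value.

    size, num_hashes and secret are illustrative assumptions, not values
    taken from the paper.
    """
    bits = set()
    for gram in bigrams(value):
        for k in range(num_hashes):
            digest = hashlib.sha256(f"{secret}|{k}|{gram}".encode()).hexdigest()
            bits.add(int(digest, 16) % size)
    return frozenset(bits)


def dice(a, b):
    """Dice coefficient between two Bloom filters given as sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    smith = bloom_encode("Smith")
    print(dice(smith, bloom_encode("Smyth")))  # similar names share most bits
    print(dice(smith, bloom_encode("Jones")))  # unrelated names share few or none
```

Because the filters are compared without decoding them, a similarity score is available for each field even though the underlying values are never revealed to the linker.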
Methods
Our method was tested through a simulation study using synthetic data, followed by an application using real-world administrative data. Synthetic datasets were generated with error rates ranging from zero to 20%. Our method was used to estimate parameters (probabilities and thresholds) for de-duplication linkages. Linkage quality was measured by F-measure. Each dataset was privacy-preserved using separate Bloom filters for each field. Match probabilities were estimated using the expectation-maximisation (EM) algorithm on the privacy-preserved data. Threshold cut-off values were determined by an extension to the EM algorithm that allows linkage quality to be estimated for each possible threshold. De-duplication linkages of each privacy-preserved dataset were performed using both estimated and calculated probabilities. The F-measure at the estimated threshold values was also compared to the highest F-measure. Three large administrative datasets were used to demonstrate the applicability of the probability and threshold estimation technique on real-world data.
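The EM step can be sketched as follows for the standard two-class model with conditionally independent fields. This is a minimal textbook version, not the authors' implementation: it assumes each field comparison has already been reduced to a binary agree/disagree outcome (for Bloom filters, for instance, by thresholding a similarity score), and the names em_estimate, simulate_patterns and agreement_weights, the starting values, and the iteration count are all hypothetical.

```python
import math
import random

EPS = 1e-6  # keeps probabilities away from exactly 0 or 1


def _clamp(x):
    return min(max(x, EPS), 1 - EPS)


def em_estimate(patterns, n_fields, iterations=50, p=0.2, m_init=0.8, u_init=0.2):
    """Estimate match proportion p and per-field m/u probabilities by EM.

    patterns: list of tuples of 0/1 agreement indicators, one per record pair.
    """
    m = [m_init] * n_fields
    u = [u_init] * n_fields
    n = len(patterns)
    for _ in range(iterations):
        # E-step: posterior probability that each pair is a true match.
        g = []
        for gamma in patterns:
            pm, pu = p, 1 - p
            for j, agree in enumerate(gamma):
                pm *= m[j] if agree else 1 - m[j]
                pu *= u[j] if agree else 1 - u[j]
            g.append(pm / (pm + pu))
        # M-step: re-estimate parameters from the posterior weights.
        total = min(max(sum(g), EPS), n - EPS)
        p = _clamp(total / n)
        for j in range(n_fields):
            agree_match = sum(gi * gamma[j] for gi, gamma in zip(g, patterns))
            agree_non = sum((1 - gi) * gamma[j] for gi, gamma in zip(g, patterns))
            m[j] = _clamp(agree_match / total)
            u[j] = _clamp(agree_non / (n - total))
    return p, m, u


def agreement_weights(m, u):
    """Log2 agreement and disagreement weights for each field."""
    return [(math.log2(mj / uj), math.log2((1 - mj) / (1 - uj)))
            for mj, uj in zip(m, u)]


def simulate_patterns(n_pairs, p, m, u, seed=0):
    """Draw synthetic agreement patterns from known parameters (illustration only)."""
    rng = random.Random(seed)
    patterns = []
    for _ in range(n_pairs):
        probs = m if rng.random() < p else u
        patterns.append(tuple(1 if rng.random() < pj else 0 for pj in probs))
    return patterns


if __name__ == "__main__":
    data = simulate_patterns(5000, 0.1, [0.9] * 4, [0.1] * 4)
    p_hat, m_hat, u_hat = em_estimate(data, 4)
    print(p_hat, m_hat, u_hat)
    print(agreement_weights(m_hat, u_hat))
```

No match status is used during estimation, which is why this kind of procedure remains available when the field values themselves are encrypted.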
Results
Linkage of the synthetic datasets using the estimated probabilities produced an F-measure that was comparable to the F-measure using calculated probabilities, even with up to 20% error. Linkage of the administrative datasets using estimated probabilities produced an F-measure that was higher than the F-measure using calculated probabilities. Further, the threshold estimation yielded results for F-measure that were only slightly below the highest possible for those probabilities.
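The F-measure is the harmonic mean of precision and recall, 2TP / (2TP + FP + FN). One plausible way to estimate it at each candidate threshold without knowing the true match status is to treat each pair's estimated match probability as a fractional true positive. The abstract does not describe the authors' extension to the EM algorithm in detail, so the sketch below is an assumption about how such an estimate could be formed, and estimate_best_threshold is a hypothetical name.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall from (possibly fractional) counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def estimate_best_threshold(scored_pairs):
    """Pick the score threshold with the highest estimated F-measure.

    scored_pairs: list of (score, match_probability) tuples, where
    match_probability is an estimated probability that the pair is a match
    (an assumption for this sketch). Pairs scoring at or above the threshold
    are accepted as links.
    """
    ordered = sorted(scored_pairs, key=lambda sp: sp[0], reverse=True)
    total_matches = sum(prob for _, prob in ordered)
    best_threshold, best_f = None, -1.0
    tp = fp = 0.0
    i = 0
    while i < len(ordered):
        threshold = ordered[i][0]
        # Accept every pair tied at this score before evaluating.
        while i < len(ordered) and ordered[i][0] == threshold:
            tp += ordered[i][1]
            fp += 1 - ordered[i][1]
            i += 1
        f = f_measure(tp, fp, total_matches - tp)
        if f > best_f:
            best_threshold, best_f = threshold, f
    return best_threshold, best_f


if __name__ == "__main__":
    pairs = [(10, 0.99), (8, 0.9), (2, 0.1), (-5, 0.01)]
    print(estimate_best_threshold(pairs))
```

In this toy input the estimated F-measure peaks when the two high-scoring pairs are accepted and the two low-scoring pairs are rejected.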
Conclusions
The method appears highly accurate across a spectrum of datasets with varying degrees of error. As there are few alternatives for parameter estimation, the approach is a major step towards providing a complete operational approach for probabilistic linkage of privacy-preserved datasets.
Publisher
BioMed Central, BioMed Central Ltd, Springer Nature B.V., BMC