Preprint
Review

An Introduction to Probabilistic Record Linkage

A peer-reviewed article of this preprint also exists.

Submitted: 03 August 2020

Posted: 04 August 2020

Abstract
Since its post-World War II inception, the science of record linkage has grown exponentially and is used across industrial, governmental, and academic agencies. The academic fields that rely on record linkage are diverse, ranging from history to public health to demography. In this paper, we introduce the different types of data linkage and give a historical context to their development. We then introduce the three types of underlying models for probabilistic record linkage: Fellegi-Sunter based methods, machine learning methods, and Bayesian methods. Practical considerations such as data standardization and privacy concerns are then discussed. Finally, recommendations are given for organizations developing or maintaining record linkage programs, with an emphasis on organizations measuring long-term complications of disasters such as 9/11.
Subject: Computer Science and Mathematics - Probability and Statistics
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated