This repository has been archived by the owner on Nov 18, 2021. It is now read-only.

Linkage of Dutch civil records (burgerLinker)

Metadata

  • Status: In Progress
  • Type: Specific
  • Work Package: WP4
  • Participating Institutes: International Institute of Social History (IISG) and Vrije Universiteit Amsterdam (VU)
  • Coordinators: Richard Zijdeman (IISG)
  • Developers: Joe Raad (VU)
  • End-users: The burgerLinker software is designed for so-called "digital historians" (e.g. someone with basic command-line skills) who want to use the Dutch civil registries in their studies or link their own data to them.
  • Interest Groups: IG-LOD, IG-Workflows, and IG-Curation

Description

What is the research about?

Historians use archival records to describe people's lives. Each record (e.g. a marriage record) describes only a single point in time, so historians try to link multiple records about the same person to reconstruct a life course. This tool focuses "just" on the linkage of civil records. Linking these records makes it possible to build pedigrees spanning multiple generations for research on social inequality, especially in the part of the health sciences that focuses on gene-social contact interactions.

What problem is hindering the research?

This tool is being developed to improve and replace the current LINKS software. Points of improvement are:

  • extremely fast and scalable matching procedure (using Levenshtein automata and HDT);
  • considers all first names of individuals with multiple first names in order to find a candidate match;
  • blocking is not required (i.e. all candidate records can be considered for matching, with no restrictions on their registration date or location, and no requirements on blocking parts of their individual names);
  • detected links contain detailed provenance metadata, and can be saved in different formats (CSV and RDF are covered in the current version);
  • allows family and life course reconstruction (by computing the transitive closure over all detected links);
  • open software
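
To make the ideas in this list concrete, here is a minimal Python sketch of three of them: fuzzy name comparison, considering every first name of a person, and grouping records by the transitive closure over detected links. It is an illustration under stated assumptions, not the burgerLinker implementation: it uses a plain dynamic-programming Levenshtein distance and an all-pairs comparison where the tool uses Levenshtein automata and HDT, it ignores the roles and family relations on the records, and the record identifiers, names, and the `max_dist` threshold are all hypothetical.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (not a Levenshtein automaton)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        previous = current
    return previous[-1]


def names_match(first_a, first_b, max_dist=1):
    """True if any first name of A is within max_dist edits of any first name of B."""
    return any(levenshtein(x, y) <= max_dist for x in first_a for y in first_b)


def connected_groups(record_ids, links):
    """Group records by the transitive closure of pairwise links (union-find)."""
    parent = {r: r for r in record_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for r in record_ids:
        groups.setdefault(find(r), set()).add(r)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical records: id -> (list of first names, family name).
records = {
    "birth-1": (["johanna", "maria"], "jansen"),
    "marriage-7": (["maria"], "janssen"),
    "death-3": (["marja"], "janssens"),
    "birth-9": (["pieter"], "de vries"),
}

# All-pairs comparison for clarity only; it does not scale to full registries.
links = [
    (a, b)
    for a, b in combinations(records, 2)
    if names_match(records[a][0], records[b][0])
    and levenshtein(records[a][1], records[b][1]) <= 1
]

groups = connected_groups(records, links)
print(links)
print(groups)
```

In this toy data, `birth-1` and `death-3` are not linked directly, because their family names are two edits apart, yet both are linked to `marriage-7`. The transitive closure therefore places all three records in one group, which is the step that allows a life course to be reconstructed from pairwise links.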

Data

In its current version, the tool cannot match entities from arbitrary sources. It focuses solely on the linkage of civil records, relying on the sanguineous relations stated on each civil record, modelled according to our Civil Registries schema. An overview of the Civil Registries schema is available in the burgerLinker Wiki.

Software

LINKS source code and documentation

References

References to related resources and publications and especially links to related use-cases: