Security Docs #49

Open
hardbyte opened this issue Feb 14, 2018 · 0 comments
hardbyte commented Feb 14, 2018

We need to add a note on security...

The Cryptographic Longterm Key (CLK) is computed and compared following the method described
by Rainer Schnell, Tobias Bachteler, and Jörg Reiher in A Novel Error-Tolerant Anonymous Linking Code. We deviate from their approach by using a KDF (key derivation function) to ensure the hash functions are independent. For a detailed discussion, see Who Is 1011011111…1110110010? Automated Cryptanalysis of Bloom Filter Encryptions of Databases with Several Personal Identifiers, in which Kroll and Steinmetzer present a cryptanalysis of, and an attack on, Bloom filters built from multiple identifiers.
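
To make the KDF point concrete for the docs, here is a minimal sketch of deriving one independent key per feature from a shared secret. It assumes HKDF (RFC 5869) with SHA-256 built from the standard library; the function names `hkdf_sha256` and `per_feature_keys`, and the use of the feature name as the `info` label, are hypothetical and not necessarily what the library does.

```python
import hashlib
import hmac


def hkdf_sha256(master_secret: bytes, info: bytes, length: int = 32,
                salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) extract-then-expand using HMAC-SHA256."""
    if not salt:
        # RFC 5869: a missing salt defaults to a string of zero bytes.
        salt = bytes(hashlib.sha256().digest_size)
    # Extract: concentrate the input secret into a pseudorandom key.
    prk = hmac.new(salt, master_secret, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def per_feature_keys(master_secret: bytes, feature_names):
    """Derive a distinct key per feature (hypothetical labelling scheme)."""
    return {name: hkdf_sha256(master_secret, name.encode("utf-8"))
            for name in feature_names}


keys = per_feature_keys(b"shared secret", ["name", "dob", "postcode"])
```

Because each key is derived under a different label, the keyed hashes applied to different features do not share a key, which is the independence property referred to above.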

Semantic Security

The semantic security of the CLKs depends on two factors:

  1. Multiple features are hashed to create the CLK. If only one or two features (e.g. name) are used, population statistics on the distribution of names can be used to identify records from the CLKs alone. See the paper Cryptanalysis of Basic Bloom Filters Used for Privacy Preserving Record Linkage for an in-depth analysis.

  2. The HMAC secret shared by the organizations providing entity data is unique (not reused between mappings) and is kept secret from the entity carrying out the linkage.
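
The two factors above can be illustrated with a simplified sketch of hashing several features into one Bloom filter under a shared HMAC secret. This is an assumption-laden toy, not the construction from the paper or the library: the function name `clk_bits`, the filter size, the number of hash functions `k`, and the counter-prefixed HMAC-SHA256 scheme are all chosen for illustration only.

```python
import hashlib
import hmac


def clk_bits(features, secret: bytes, size: int = 1024, k: int = 4):
    """Return the set of bit positions set in a toy CLK.

    Every bi-gram of every feature is mapped to k positions with a keyed
    hash, so all features end up mixed into the same filter.
    """
    bits = set()
    for value in features:
        padded = f" {value} "  # whitespace padding, as described in this issue
        for i in range(len(padded) - 1):
            token = padded[i:i + 2].encode("utf-8")
            for j in range(k):
                # Hypothetical scheme: prefix a counter byte to get k hashes.
                digest = hmac.new(secret, bytes([j]) + token,
                                  hashlib.sha256).digest()
                bits.add(int.from_bytes(digest, "big") % size)
    return bits


record = ["anna", "smith", "1980-02-14"]
clk_a = clk_bits(record, b"secret for mapping A")
clk_b = clk_bits(record, b"secret for mapping B")
```

With a fresh secret per mapping, the same record produces unrelated bit patterns (`clk_a` versus `clk_b`), so a CLK from one mapping cannot be matched against another, and a linkage entity that never sees the secret cannot recompute the positions for a guessed name.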

Possible Weaknesses

When creating the bi-grams, the first and last bi-gram of each word are padded with a whitespace character. This is a
weakness because it makes it easier for an attacker to find the beginning and the end of a word.
We need to investigate whether dropping the padding decreases matching accuracy.
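
For reference, a small sketch of the two tokenization variants being compared. The helper name `bigrams` and its `pad` flag are hypothetical and only illustrate the difference in the tokens produced.

```python
def bigrams(word: str, pad: bool = True):
    """Split a word into overlapping bi-grams, optionally whitespace-padded."""
    text = f" {word} " if pad else word
    return [text[i:i + 2] for i in range(len(text) - 1)]


padded = bigrams("anna")               # [' a', 'an', 'nn', 'na', 'a ']
unpadded = bigrams("anna", pad=False)  # ['an', 'nn', 'na']
```

The padded variant adds the tokens `' a'` and `'a '`, which mark the word boundaries. Those are the tokens an attacker can exploit, but they also carry the first-letter and last-letter information that may help matching, hence the need to measure accuracy before dropping them.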

Aha! Link: https://csiro.aha.io/features/ANONLINK-49
