What should we use as identifiers in a national dataset #303

Open
colmjude opened this issue Jul 13, 2020 · 0 comments

colmjude commented Jul 13, 2020

Data produced by organisations often includes an identifier for each row. Any corresponding standard (e.g. Brownfield) states that this identifier should be unique. However, it technically only needs to be unique within that organisation's own data, so identifiers could clash when we combine data from multiple sources.

How can we avoid this issue?

In some cases where the row references something else, for example a URL or a document, we could make a hash that would be unique to that thing (the hash would change if the URL or document changed). This is what we did in the local-plans-prototype. A downside of this approach is that the id loses all meaning for a user glancing at the data.
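
A minimal sketch of the hashing idea, to make the trade-off concrete. The choice of SHA-256, the function name `hash_identifier` and the example URLs are all assumptions for illustration; this is not necessarily how the local-plans-prototype does it.

```python
import hashlib


def hash_identifier(reference: str) -> str:
    """Return a hex SHA-256 digest of a referenced URL or document text.

    Assumption: SHA-256 is used here only as an example hash function.
    """
    return hashlib.sha256(reference.encode("utf-8")).hexdigest()


# Hypothetical URLs, for illustration only.
original = hash_identifier("https://example.org/local-plan.pdf")
changed = hash_identifier("https://example.org/local-plan-v2.pdf")

# The same input always gives the same id; a changed URL gives a new one.
print(original == hash_identifier("https://example.org/local-plan.pdf"))
print(original == changed)
```

The resulting id is a 64-character hex string, which shows the downside: nothing in it tells a reader which organisation or row it came from.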

Another option is to use the id an organisation gave the row of data and prefix it with something. For example, the id might be bfs123456 and we could prefix it with the organisation identifier, resulting in something like local-authority-eng:HAG:bfs123456
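
A rough sketch of the prefixing option, assuming a colon as the separator as in the example above. The function name `national_identifier` and the second organisation `local-authority-eng:XYZ` are hypothetical, made up to show a clash being avoided.

```python
def national_identifier(organisation: str, local_id: str) -> str:
    """Prefix an organisation's own row id with the organisation identifier."""
    return f"{organisation}:{local_id}"


# Two organisations that happen to use the same local id.
# "local-authority-eng:XYZ" is a hypothetical organisation identifier.
rows = [
    ("local-authority-eng:HAG", "bfs123456"),
    ("local-authority-eng:XYZ", "bfs123456"),
]

national_ids = [national_identifier(org, local_id) for org, local_id in rows]

print(national_ids[0])  # local-authority-eng:HAG:bfs123456
print(len(set(national_ids)) == len(national_ids))  # True: no clash
```

Unlike a hash, the id stays readable, but it is only unique as long as each organisation keeps its own ids unique and the organisation identifier itself never changes.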
