Feature request: Disambiguation of names? #9

Open
a3nm opened this issue Jun 29, 2020 · 2 comments
Comments

@a3nm

a3nm commented Jun 29, 2020

Hi,

I'm a bit puzzled to see companies listed only by name. Wouldn't that leave some ambiguity about which company is meant? Shouldn't the list be augmented with precise identifiers, e.g., tax registration identifiers in some countries, or a Wikidata identifier?

@frnsys
Collaborator

frnsys commented Jun 30, 2020

This is a great point. We are still sorting out how best to identify companies. For now we are using each company's listed name on major stock exchanges, though of course this has shortcomings for companies not listed on these exchanges. In your experience, how good is the Wikidata coverage? I feel something like a tax registration identifier would be preferable since it aligns with public records, but the variation across countries might make it too complicated.

@a3nm
Author

a3nm commented Jul 10, 2020

Hi, you can expect that Wikidata will include all companies that are notable enough to have a Wikipedia page. But entities could be created for the other companies, and information like tax registration identifiers could be added to the Wikidata entities. To avoid doing the mapping by hand, it may be possible to align your list using something like OpenRefine (this is about refining data, not oil ;)) and its Wikidata plugin. There is detailed documentation and a video tutorial here.
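
As a rough illustration of what an automated alignment could look like without OpenRefine, here is a minimal sketch that asks the public Wikidata search API (`wbsearchentities`) for candidate entities matching a company name. The function names, the User-Agent string, and the shape of the returned dictionaries are assumptions made up for this example, not part of this project, and the candidates would still need human review before being recorded as the identifier for a company.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(name, language="en", limit=5):
    """Build a wbsearchentities query URL for a company name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def parse_candidates(payload):
    """Keep only the fields needed to disambiguate by hand."""
    return [
        {
            "id": hit["id"],
            "label": hit.get("label", ""),
            "description": hit.get("description", ""),
        }
        for hit in payload.get("search", [])
    ]


def find_candidates(name):
    """Query Wikidata over the network and return candidate entities."""
    # Hypothetical User-Agent value; replace it with one identifying your tool.
    request = Request(
        build_search_url(name),
        headers={"User-Agent": "company-list-alignment-sketch/0.1"},
    )
    with urlopen(request, timeout=10) as response:
        return parse_candidates(json.load(response))
```

Calling `find_candidates` with a name from the list returns a short list of candidate entity identifiers with their labels and descriptions. A name that yields several plausible candidates is exactly the kind of ambiguity that a stored identifier would remove.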

I think this disambiguation point matters because it would make it possible to use your list in other applications built on open data about companies with a harmful effect on climate.
