Keep entity types returned by DBpedia Spotlight #24

Closed
ChristophLeonhardt opened this issue Jan 26, 2024 · 2 comments

Comments

@ChristophLeonhardt
Collaborator

DBpedia Spotlight returns not only URIs for entities but also entity types. get_dbpedia_uris() currently omits these values from its output. If kept, the entity types could be used to classify entities without additional SPARQL queries, for example when the textual data does not contain pre-annotated named entities.

@ablaette
Contributor

This will be very useful. To keep the number of arguments minimal, I do not think we need an additional argument. We should return types by default and leave it to users to delete the column if it is not wanted.

What we should do, however, is document the columns of the return value so that we know what we get.
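
For users who do not want the types, dropping the column could look like the sketch below. The table here is a mock stand-in for the return value of get_dbpedia_uris(); the column names other than 'types' and all values are assumptions for illustration.

```r
# Mock stand-in for the table returned by get_dbpedia_uris().
# Column names (except `types`) and values are illustrative assumptions.
uris <- data.frame(
  text = "Labour",
  dbpedia_uri = "http://dbpedia.org/resource/Labour_Party_(UK)",
  stringsAsFactors = FALSE
)
uris$types <- list(list(DBpedia = "Organisation"))

# Remove the list column if the types are not needed.
uris[["types"]] <- NULL
```

If the return value is a data.table, the in-place idiom `uris[, types := NULL]` would be the more typical choice.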

@ablaette
Contributor

This is a minimal example to convey what is implemented.

library(dbpedia)

# Annotate the Labour text from quanteda's 2010 UK immigration manifesto extracts
uris <- get_dbpedia_uris(
  quanteda::data_char_ukimmig2010[["Labour"]],
  language = "en",
  api = "http://api.dbpedia-spotlight.org/en/annotate"
)

The output table now has a list column 'types' that holds the parsed result. This may not yet be the ideal data representation, but I would leave further discussion to #27.
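
To illustrate how such a list column might be used for classification without SPARQL queries, here is a minimal sketch on mock data. It assumes each element of 'types' is a named list keyed by type source (such as "DBpedia" or "Schema"); the actual layout of the parsed result may differ. The helper has_dbpedia_type() and the mock values are hypothetical and not part of the package.

```r
# Mock stand-in for the table returned by get_dbpedia_uris().
# The layout of `types` (a named list per entity) is an assumption.
uris <- data.frame(
  text = c("Labour", "Britain"),
  dbpedia_uri = c(
    "http://dbpedia.org/resource/Labour_Party_(UK)",
    "http://dbpedia.org/resource/United_Kingdom"
  ),
  stringsAsFactors = FALSE
)
uris$types <- list(
  list(DBpedia = c("Organisation", "PoliticalParty"), Schema = "Organization"),
  list(DBpedia = c("Place", "Country"), Schema = c("Place", "Country"))
)

# Hypothetical helper: for each entity, check whether the given
# DBpedia ontology type is among its types. Entities without any
# DBpedia types yield FALSE.
has_dbpedia_type <- function(types, type) {
  vapply(types, function(x) type %in% x[["DBpedia"]], logical(1))
}

# Flag entities by type directly from the list column.
uris$is_place <- has_dbpedia_type(uris$types, "Place")
uris$is_organisation <- has_dbpedia_type(uris$types, "Organisation")
```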
