Support for Swedish language #149

Open
EmilStenstrom opened this issue Aug 15, 2022 · 6 comments

@EmilStenstrom

Hi!

In the list of languages I don't see Swedish. It's a small language, but it has a very big Wikipedia with ~2.5M articles. Can entity-fishing be trained on Swedish, or is there some deeper reason that it's not included?

@kermitt2
Owner

Hi @EmilStenstrom !

Thank you for the request. Swedish should indeed work well given the size of its Wikipedia. I think it's the largest one not yet supported by entity-fishing, along with Dutch. I will try to include it in the next batch of supported languages.

kermitt2 self-assigned this Aug 17, 2022
@EmilStenstrom
Author

That sounds awesome! Looking forward to testing it! :)

kermitt2 added this to the 0.0.6 milestone Jan 20, 2023
@kermitt2
Owner

Screenshot from 2023-01-20 18-44-34

@EmilStenstrom
Author

Nice! Happy to see it disambiguate Swedish. Looking at that specific example, the things it mentions are not entities but “concepts”. Translated: “year”, “consumption”, “health”. Is that intentional?

@kermitt2
Owner

Yes, that's the goal: every Wikidata entity is disambiguated, based on the Wikipedia anchors. Wikidata uses "entities" for both the concepts and their instances. We can then refine based on the P279 and P31 statements to select what's wanted for a given task/application. One more example:

Screenshot from 2023-01-21 21-16-59

@EmilStenstrom
Author

Awesome! Using Wikidata statements to select what you want is super powerful. Eager to try this out when 0.0.6 is released! :)
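
For readers following the thread, here is a minimal sketch of the kind of filtering discussed above: splitting disambiguated entities into instances (those with a P31 "instance of" statement) and concepts (those with only a P279 "subclass of" statement). It is not entity-fishing's own code or API. The entity dictionaries, the `rawName`, `wikidataId` and `statements` field names, the helper names, and the sample data are all assumptions made for illustration, and the sample identifiers and statements are not verified against Wikidata.

```python
from typing import Dict, List, Optional, Set

P31 = "P31"    # Wikidata property "instance of"
P279 = "P279"  # Wikidata property "subclass of"

# Hypothetical shape for one disambiguated mention: a dict with a surface
# form, a Wikidata ID, and a mapping from property ID to target entity IDs.
Entity = Dict[str, object]


def _targets(entity: Entity, prop: str) -> Set[str]:
    """Return the set of target IDs an entity has for one property."""
    statements = entity.get("statements") or {}
    return set(statements.get(prop, []))


def select_instances(entities: List[Entity],
                     classes: Optional[Set[str]] = None) -> List[Entity]:
    """Keep entities with a P31 statement, optionally limited to some classes."""
    kept = []
    for entity in entities:
        instance_of = _targets(entity, P31)
        if instance_of and (classes is None or instance_of & classes):
            kept.append(entity)
    return kept


def select_concepts(entities: List[Entity]) -> List[Entity]:
    """Keep entities that have a P279 statement and no P31 statement."""
    return [e for e in entities
            if _targets(e, P279) and not _targets(e, P31)]


# Toy data, made up for this sketch (IDs and statements are illustrative only).
SAMPLE: List[Entity] = [
    {"rawName": "Stockholm", "wikidataId": "Q1754",
     "statements": {P31: ["Q515"]}},
    {"rawName": "year", "wikidataId": "Q577",
     "statements": {P279: ["Q1790144"]}},
    {"rawName": "health", "wikidataId": "Q12147", "statements": {}},
]

if __name__ == "__main__":
    print([e["rawName"] for e in select_instances(SAMPLE)])
    print([e["rawName"] for e in select_concepts(SAMPLE)])
```

This sketch only looks at direct statements. A real task would probably also need to walk the P279 chain upward, so that an instance of a subclass still matches its parent class.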
