Suggestions for supporting other languages? #34

Open
niftylettuce opened this issue May 4, 2020 · 2 comments

Comments

@niftylettuce

Not sure if you've had to work on this, @moos, but I'm curious whether you or others have figured out support for multiple locales/languages.

@moos
Owner

moos commented May 17, 2020

"WordNet® is a large lexical database of English." wordpos is just a JavaScript front end to the WordNet database.

Just as I was about to hit send on the above, I rethought your question, and a quick Google search later I came across the Global WordNet Association -- apparently there are "WordNet"s in other languages.

I looked at a few (German, French, Japanese). They seem to have their data in various XML formats, so they are not a drop-in for wordpos, which is based on a specific (optimized) WordNet format.

Still an intriguing possibility -- whether it happens depends mainly on the level of interest and contributions. Do you have a specific use case in mind?

@niftylettuce
Author

Yes, it is part of my efforts on https://github.com/spamscanner/spamscanner. I am building a filter to detect gibberish and the language of the message (without relying on Content-Language, <meta>, or <html lang> tags).
