Does Stork support CJK languages? (Chinese, Japanese, and Korean)
I am interested in using Stork for Zola. I proposed it here: getzola/zola#1849
It was mentioned that there may not be specific stemmers or stopword lists for languages other than English. Is that the case?
EDIT: I researched this through some of the open issues. Please correct me if I am wrong on any of this, thank you.
Stopword lists: not implemented yet: #250
Stemmers: multilingual stemming is already supported through Snowball (#48), but CJK languages do not appear in Snowball's list of stemming algorithms: https://snowballstem.org/algorithms/
It also looks like stemming may not be applicable to CJK at all:
Stemming is not a concept applicable to all languages. It is not, for example, applicable in Chinese. [ source ]
Even if stemming is not applicable to CJK, it seems that CJK text can still be analyzed, and search results improved, through tokenization: https://www.microfocus.com/documentation/starteam/163/en/Help/SvrAdmin/GUID-DAC55170-60DC-490B-BC4F-42F4F45F6029.html
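To make the tokenization idea concrete, here is a minimal sketch of one common approach for CJK text: overlapping character bigrams. This is only an illustration, not Stork's API or behavior. The function names `is_cjk` and `tokenize` are hypothetical, and the Unicode ranges are a rough, non-exhaustive assumption. Non-CJK text is split on non-alphanumeric characters and lowercased.

```python
def is_cjk(ch: str) -> bool:
    """Rough CJK check over a few common Unicode blocks (not exhaustive)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul Syllables
    )


def tokenize(text: str) -> list[str]:
    """Split text into lowercase words and overlapping CJK bigrams."""
    tokens: list[str] = []
    run: list[str] = []    # current run of consecutive CJK characters
    word: list[str] = []   # current run of non-CJK alphanumeric characters

    def flush_word() -> None:
        if word:
            tokens.append("".join(word).lower())
            word.clear()

    def flush_run() -> None:
        if len(run) == 1:
            # A lone CJK character is kept as a single-character token.
            tokens.append(run[0])
        else:
            # Emit every adjacent pair; an empty run emits nothing.
            tokens.extend(run[i] + run[i + 1] for i in range(len(run) - 1))
        run.clear()

    for ch in text:
        if is_cjk(ch):
            flush_word()
            run.append(ch)
        elif ch.isalnum():
            flush_run()
            word.append(ch)
        else:
            # Whitespace and punctuation end whatever token is in progress.
            flush_word()
            flush_run()

    flush_word()
    flush_run()
    return tokens


print(tokenize("全文検索 with Stork"))
```

Bigrams need no dictionary, so they work without language-specific word lists, at the cost of a larger index and some false matches. A dictionary-based segmenter would give more precise tokens but needs per-language data.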
I'm going to migrate this to a Discussion and continue there - hope that's okay.