Wikipedia tool issue with UTF-8 #10566

Open
marcoagpinto opened this issue May 4, 2024 · 0 comments

marcoagpinto commented May 4, 2024

Heya,

I know it is 5am and everyone is sleeping, but I have been working on LanguageTool.

I ran into an issue with this command:
java -Dfile.encoding=UTF-8 -Xmx4500M -jar languagetool-wikipedia.jar check-data -l pt-PT -r PÔR_FIM_À_VIDA -f pt-BR.txt --max-sentences 900000 --context-size 100 >0.txt

It seems the accented characters in the rule ID get messed up, so the rule is not found:

WARNING: Could not find rule 'PÔR_FIM_À_VIDA'
Only these rules are enabled: [PÔR_FIM_À_VIDA]
Working on: pt-BR.txt
Sentence limit: 900000
Context size: 100
Error limit: no limit
Skip: 0

I have been using:
LanguageTool-wikipedia-20240426-snapshot

Is this a known issue? Or is it maybe just a matter of updating to a more recent version?

Thanks!

EDIT: Ahhhh… I have Windows 11.
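
For anyone trying to reproduce this, here is a minimal sketch of how an ID with accents can stop matching when its bytes are written in one charset and read in another. This is only an assumption about the kind of mismatch involved and is not confirmed as the cause here. The helper `reinterpret` and the UTF-8/windows-1252 pairing are hypothetical choices for illustration, not anything taken from LanguageTool.

```python
# Illustrative sketch only; this is not LanguageTool code.
RULE_ID = "PÔR_FIM_À_VIDA"


def reinterpret(text: str, written_as: str, read_as: str) -> str:
    """Encode text in one charset, then decode the same bytes in another."""
    return text.encode(written_as).decode(read_as, errors="replace")


# Assumption: the ID is written as UTF-8 bytes but read as windows-1252.
garbled = reinterpret(RULE_ID, "utf-8", "cp1252")

# ascii() keeps the output independent of the console's own encoding.
print(ascii(RULE_ID))
print(ascii(garbled))
print(garbled == RULE_ID)  # a lookup by exact ID would no longer match
```

If something like this is happening, the two accented letters each turn into two unrelated characters, which would be enough for an exact rule ID lookup to fail.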
