"Show word count" is not quite useful for Chinese and Japanese #700

Open
Explorer09 opened this issue Oct 29, 2023 · 2 comments

Comments

@Explorer09

The "Show word count" feature is not very useful for languages such as Chinese and Japanese, where ASCII spaces (U+0020) are seldom used to delimit words. It might be useful for Korean, but I can't say that for sure, as I don't use Korean regularly (I'm from Taiwan, by the way).

In Chinese, words are not delimited by spaces. In most Chinese writing, we are more interested in character counts than in word counts.
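
As a rough illustration of what a character count could look like, here is a minimal Python sketch. It is not the app's code: `count_characters` is a hypothetical helper, and excluding whitespace from the count is an assumption rather than a settled rule.

```python
def count_characters(text: str) -> int:
    """Count the characters in text, skipping whitespace (an assumed rule)."""
    # str.isspace() is Unicode-aware, so U+3000 is skipped just like U+0020.
    return sum(1 for ch in text if not ch.isspace())


print(count_characters("顯示字數"))               # 4
print(count_characters("我 來自\u3000台灣"))      # 5
```

Note that this counts Unicode code points, not user-perceived characters, so text with combining marks would need grapheme segmentation instead.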

Japanese seldom delimits words with spaces, but when it does, certain characters may be used in place of ASCII spaces. The most common are U+3000 (Ideographic Space) and U+30FB (Katakana Middle Dot).
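
A minimal Python sketch of splitting on those two characters, assuming it is acceptable to treat U+30FB as a word delimiter (a heuristic, since the middle dot is punctuation and not whitespace). `split_words` and `JAPANESE_DELIMITERS` are hypothetical names, not anything from the app.

```python
import re

# Assumed delimiter set: any Unicode whitespace (which covers U+3000)
# plus the Katakana Middle Dot, U+30FB.
JAPANESE_DELIMITERS = re.compile(r"[\s\u30FB]+")


def split_words(text: str) -> list:
    """Split on whitespace and U+30FB, dropping empty pieces."""
    return [piece for piece in JAPANESE_DELIMITERS.split(text) if piece]


print(split_words("ブレス・オフ・ザ・ワイルド"))
print(split_words("ブレス\u3000オフ\u3000ザ\u3000ワイルド"))
```

Both calls return the same four pieces: ブレス, オフ, ザ, ワイルド.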

The above are the overall usability issues with the word count feature. The technical issues are as follows:

  • It does not recognize Unicode whitespace characters other than ASCII spaces and newlines. U+3000 is a Unicode whitespace character.
  • The Chinese translation of the feature name is inaccurate (Wrong: 顯示字數, "show character count"; Correct: 顯示英語單詞數, "show English word count")
  • The word count is also inaccurate when multiple languages are mixed together in a text.
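
For reference, the Unicode properties of the characters mentioned above can be checked with Python's standard `unicodedata` module. This is only a sketch for inspection; `describe` is a hypothetical helper.

```python
import unicodedata


def describe(ch: str) -> tuple:
    """Return (code point, name, general category, is-whitespace) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch),
            unicodedata.category(ch), ch.isspace())


for ch in ("\u0020", "\u3000", "\u30fb"):
    print(describe(ch))
```

U+0020 and U+3000 both report category `Zs` (space separator) and count as whitespace, while U+30FB reports `Po` (other punctuation) and does not, so a Unicode-aware whitespace split handles the former but not the middle dot.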

Test text:

ブレス・オフ・ザ・ワイルド(U+30FB_KatakanaMiddleDot)
ブレス オフ ザ ワイルド(U+3000_IdeographicSpace)
브레스 오브 더 와일드

Actual result: 6
Expected result: I don't know, maybe 9
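
The two numbers can be reproduced with a short Python sketch. It assumes the app splits on ASCII spaces and newlines only, which is inferred from the observed result and not confirmed against the source code. The function names are hypothetical, and the U+3000 characters are written as escapes so they stay visible.

```python
import re

TEST_TEXT = (
    "ブレス・オフ・ザ・ワイルド(U+30FB_KatakanaMiddleDot)\n"
    "ブレス\u3000オフ\u3000ザ\u3000ワイルド(U+3000_IdeographicSpace)\n"
    "브레스 오브 더 와일드"
)


def count_ascii_only(text: str) -> int:
    """Assumed current behaviour: split on ASCII space and newline only."""
    return len([word for word in re.split(r"[ \n]+", text) if word])


def count_unicode_whitespace(text: str) -> int:
    """str.split() with no arguments splits on any Unicode whitespace."""
    return len(text.split())


print(count_ascii_only(TEST_TEXT))          # 6
print(count_unicode_whitespace(TEST_TEXT))  # 9
```

With ASCII-only splitting, the first two lines count as one word each and the Korean line as four. Recognizing U+3000 turns the second line into four words, giving 9. The first line stays one word in both cases because U+30FB is not whitespace.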

(Screenshot 20231029_161246, taken with Simple Notes 6.17.0)

@Aga-C
Contributor

Aga-C commented Oct 29, 2023

It's a duplicate of #357.

@Explorer09
Author

@Aga-C Not quite a duplicate. There are multiple issues in my report. Besides, mine focuses on the usability of the whole feature.
