Support CJK Unified Ideographs? #884

Open
Myonmu opened this issue Nov 20, 2023 · 1 comment

Comments


Myonmu commented Nov 20, 2023

I'm currently trying to add Japanese characters to the supported character set, but I have some concerns about adding Kanji.

In Unicode, the block we are interested in is CJK Unified Ideographs, which actually contains characters from Chinese, Japanese and Korean: link

The problem is that the complete set contains 20992 characters, both common and rare. Would that cause performance issues?
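For reference, the 20992 figure matches the basic CJK Unified Ideographs block, U+4E00 through U+9FFF. The Python sketch below is illustration only (it is not ink's code, and the names `CJK_START`, `CJK_END`, `cjk_block_size` and `in_cjk_block` are made up here). It shows that a contiguous block can be tested with two comparisons regardless of its size. Whether ink's character sets are checked this way internally is an assumption I have not verified.

```python
# Bounds of the basic "CJK Unified Ideographs" Unicode block.
CJK_START = 0x4E00
CJK_END = 0x9FFF


def cjk_block_size() -> int:
    """Number of code points in the block (inclusive bounds)."""
    return CJK_END - CJK_START + 1


def in_cjk_block(ch: str) -> bool:
    """True if the single character `ch` lies inside the block.

    A contiguous range needs only two comparisons, so the cost does not
    grow with the number of characters the range covers.
    """
    return CJK_START <= ord(ch) <= CJK_END


if __name__ == "__main__":
    print(cjk_block_size())
    print(in_cjk_block("漢"), in_cjk_block("a"))
```

If the set were instead expanded into an explicit collection of every character, the cost would depend on how that collection is stored, which is the part worth measuring.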

I managed to extract the JIS X 0208 characters (around 6000 Kanji), but using CharacterRange.Define is painful because there are lots of "holes" that need to be removed with exclude:. Using a file to enumerate these characters is certainly another option.
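To give an idea of how the "holes" could be found mechanically rather than by hand, here is a rough Python sketch. It is a standalone helper, not part of ink, and `collapse_to_ranges` and `holes_between` are hypothetical names. It collapses a set of characters into contiguous code point ranges and lists the gaps between them, which is roughly the information a range-plus-excludes definition would need.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # inclusive (start, end) code points


def collapse_to_ranges(chars: Iterable[str]) -> List[Range]:
    """Merge characters into sorted, inclusive runs of adjacent code points."""
    code_points = sorted({ord(c) for c in chars})
    ranges: List[Range] = []
    for cp in code_points:
        if ranges and cp == ranges[-1][1] + 1:
            # Extends the current run by one code point.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            # Starts a new run.
            ranges.append((cp, cp))
    return ranges


def holes_between(ranges: List[Range]) -> List[Range]:
    """Return the inclusive gaps between consecutive runs."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:])
    ]


if __name__ == "__main__":
    runs = collapse_to_ranges("abcxyze")
    print(runs)
    print(holes_between(runs))
```

The text of a character list could be passed in directly, assuming it is plain text and that whitespace and newlines are filtered out first. The number of runs it produces would show whether a range-based definition is practical or whether a file-based enumeration is simpler.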

Finally, since characters can be shared between Chinese, Japanese and Korean in the CJK block, it might be more convenient to just add the whole block, as long as it doesn't heavily impact performance.

Attachment: an ordered set of JIS X 0208 Kanji characters.
JIS0208Kanji.txt


Myonmu commented Nov 20, 2023

Side note: I compiled inklecate with the complete CJK range and plugged it into Inky; it works without noticeable lag. However, Inky's autocomplete doesn't recognize these characters. Autocomplete already has trouble with Latin Extended, though, so I don't know whether that should be fixed.
