ccnorm

Unicode normalization data for Lua. It is similar to the skeleton algorithm from Unicode TR39, but it also takes readability and letter case into account.

Latin letters

Any Unicode character that looks similar to a Latin letter is normalized to that Latin letter, even if it is a digit or a punctuation mark. Characters are matched to Latin letters by shape, so the Greek letter ν (lowercase nu) is normalized to the Latin letter V.
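A minimal sketch of how shape-based normalization could be applied. The table below is a hypothetical three-entry excerpt written for illustration; the layout of the generated ccnorm.lua data may differ, and only the ν to V mapping comes from the description above. The function name `normalize` is likewise an assumption.

```lua
-- Hypothetical excerpt of a confusables table (NOT the real ccnorm.lua data).
-- Keys are UTF-8 byte sequences, values are the Latin letters they resemble.
local confusables = {
  ["\206\189"] = "V", -- U+03BD Greek small letter nu (mapping stated above)
  ["\208\176"] = "a", -- U+0430 Cyrillic small letter a (assumed entry)
  ["\206\145"] = "A", -- U+0391 Greek capital letter alpha (assumed entry)
}

-- Replace every UTF-8 character found in `map`; leave all others untouched.
local function normalize(s, map)
  -- One UTF-8 character: a non-continuation byte followed by any
  -- continuation bytes (0x80-0xBF). When the replacement is a table,
  -- gsub keeps the original match if the lookup yields nil.
  return (s:gsub("[^\128-\191][\128-\191]*", map))
end

-- "\206\189" is ν and "\208\176" is Cyrillic а.
print(normalize("\206\189is\208\176", confusables))
```

Passing the table directly to `gsub` keeps unknown characters as they are, so plain ASCII input passes through unchanged.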

Chinese characters

Chinese characters (a.k.a. kanji) are normalized to Simplified Chinese wherever possible. The normalized text should remain readable to native Chinese readers.
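The same idea can be sketched for Chinese text. The two-entry table and the `to_simplified` name are hypothetical and only illustrate a Traditional to Simplified lookup; they are not taken from the generated data.

```lua
-- Hypothetical Traditional -> Simplified excerpt (NOT the real ccnorm.lua data).
local simplified = {
  ["漢"] = "汉",
  ["語"] = "语",
}

-- Map each UTF-8 character through the table; characters without an
-- entry (already simplified, or not Chinese) are kept as they are.
local function to_simplified(s)
  return (s:gsub("[^\128-\191][\128-\191]*", simplified))
end

print(to_simplified("漢語 text"))
```

Because characters without an entry are left alone, the output stays readable even where no Simplified form exists.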

Contributing

ccnorm.lua is automatically generated, so please report bugs in Issues rather than sending pull requests.
