string.c (mrb_utf8_strlen): handle invalid UTF-8 sequence; fix #6255
The previous SWAR version assumed valid UTF-8 when counting the code points in a string, but we need to handle invalid sequences as well. We now use `search_nonascii` to skip over runs of single-byte characters instead of counting them one by one, for performance. The new version is even faster than the SWAR version, probably because `search_nonascii` uses SSE2 on Intel-compatible CPUs (which I use).
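The approach described above can be illustrated with a minimal sketch. This is not the code from the commit: `skip_ascii`, `utf8_seq_len`, and `utf8_count_chars` are hypothetical names, `skip_ascii` is a plain scalar loop standing in for `search_nonascii`, and the policy of counting each byte of an invalid sequence as one character is an assumption made for illustration. Overlong encodings and surrogate code points are not rejected here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar stand-in for search_nonascii: returns a pointer to
   the first byte >= 0x80 in [p, e), or e if every byte is ASCII. */
static const unsigned char *
skip_ascii(const unsigned char *p, const unsigned char *e)
{
  while (p < e && *p < 0x80) p++;
  return p;
}

/* Length in bytes of the character starting at p (p < e).
   Assumption: an invalid lead byte, a stray continuation byte, a truncated
   sequence, or a bad continuation byte is treated as a 1-byte character. */
static size_t
utf8_seq_len(const unsigned char *p, const unsigned char *e)
{
  unsigned char c = *p;
  size_t n, i;

  if (c < 0x80) return 1;
  else if ((c & 0xE0) == 0xC0) n = 2;
  else if ((c & 0xF0) == 0xE0) n = 3;
  else if ((c & 0xF8) == 0xF0) n = 4;
  else return 1;                          /* stray continuation or invalid lead */

  if ((size_t)(e - p) < n) return 1;      /* truncated at end of buffer */
  for (i = 1; i < n; i++) {
    if ((p[i] & 0xC0) != 0x80) return 1;  /* not a continuation byte */
  }
  return n;
}

/* Count characters in the len-byte buffer s, tolerating invalid UTF-8. */
size_t
utf8_count_chars(const char *s, size_t len)
{
  const unsigned char *p, *e;
  size_t count = 0;

  assert(s != NULL);
  p = (const unsigned char *)s;
  e = p + len;
  while (p < e) {
    /* An ASCII run contributes one character per byte, so add it in bulk. */
    const unsigned char *q = skip_ascii(p, e);
    count += (size_t)(q - p);
    p = q;
    if (p < e) {
      /* Non-ASCII byte: consume one (possibly invalid) character. */
      p += utf8_seq_len(p, e);
      count++;
    }
  }
  return count;
}
```

Under these assumptions, a well-formed multibyte sequence counts as one character, while a truncated sequence such as the two bytes `E3 81` counts as two. The bulk skip over ASCII runs is the step that a vectorized search would accelerate.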