Correctly split three-or-more byte sequences of UTF-8 #2123

Merged
merged 1 commit into from May 23, 2024

Conversation

BenWiederhake
Contributor

The underlying bug was the assumption that utf8.DecodeLastRuneInString returns the number of bytes that, when stripped from the end, leaves the string with a valid ending.

In reality, this function always returns a size of 1 (together with utf8.RuneError) if the last rune is not valid, no matter how many bytes of the incomplete sequence are present.

Therefore, if the string ends with two or more bytes of an incomplete rune that is three or more bytes long, stripping a single byte still leaves an invalid ending, so this used to give the wrong result.

Found while trying to implement a related feature.


codeclimate bot commented Mar 7, 2024

Code Climate has analyzed commit d9c1df7 and detected 0 issues on this pull request.

View more on Code Climate.

@42wim 42wim merged commit d055b45 into 42wim:master May 23, 2024
@42wim 42wim added this to the 1.27.0 milestone May 23, 2024
@42wim
Owner

42wim commented May 23, 2024

Thanks 👍
