UTS46 IdnaTestV2.txt: add 5 normalization corrections #687

Closed
markusicu opened this issue Feb 6, 2024 · 3 comments · Fixed by #829

Comments

@markusicu
Member

Add test cases for the five characters whose Decomposition_Mapping values were corrected in Unicode 4.0.

Include strings with both the actual characters and their Punycode forms. For example, test with both

  • \U0002F9BF.com
  • xn--8c3n.com

As @hsivonen found, for these five characters it makes a difference whether the UTS46 implementation leaves them in the input until normalization (as the spec says), or whether disallowed+mapping+normalization treats them like any other disallowed character (like ICU does).

When the characters themselves appear in the input, they should be normalized to valid ones; when they occur inside Punycode, they are disallowed.

See https://util.unicode.org/UnicodeJsps/idna.jsp?a=%5CU0002F9BF.com%0D%0Axn--8c3n.com%0D%0Axn--gro.com
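
To illustrate why both forms need a test case, here is a minimal Python sketch. It is not a UTS46 implementation. It only checks that the Punycode label from the example decodes to the same code point as the literal character, and that this code point is changed by NFC. The helper names `decode_ace` and `changes_under_nfc` are hypothetical, and the sketch assumes Python's built-in `punycode` codec and `unicodedata` module.

```python
import unicodedata

RAW = "\U0002F9BF"      # one of the five corrected characters (from the example above)
ACE_LABEL = "xn--8c3n"  # its Punycode form (from the example above)


def decode_ace(label: str) -> str:
    """Strip the 'xn--' prefix and Punycode-decode the rest (hypothetical helper)."""
    if not label.lower().startswith("xn--"):
        raise ValueError("not an ACE label")
    return label[4:].encode("ascii").decode("punycode")


def changes_under_nfc(s: str) -> bool:
    """True if NFC normalization would alter the string (hypothetical helper)."""
    return unicodedata.normalize("NFC", s) != s


decoded = decode_ace(ACE_LABEL)

# Both spellings carry the same code point ...
print(decoded == RAW)
# ... and that code point is not stable under NFC.
print(changes_under_nfc(decoded))
```

So a literal occurrence can be rewritten by the mapping and normalization steps, whereas the decoded Punycode label is only validated, and a label that is not in NFC fails that validation. That is why the two test strings are expected to behave differently.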

@macchiati @eggrobin

@markusicu
Member Author

I say that this is no longer necessary, since in Unicode 16 we no longer treat these five in any special way. They are just mapped consistently with their Decomposition_Mapping values.

@hsivonen
Member

hsivonen commented May 8, 2024

It seems prudent to have tests for these characters given the history.

@eggrobin
Member

eggrobin commented May 8, 2024

I tend to agree with @hsivonen here; for instance, an implementation could be special-casing them because that used to be needed, and we would want to catch that.
