Description
PartiQL can order strings incorrectly when they contain characters that require surrogate pairs.
To Reproduce
This is a somewhat contrived example, but it demonstrates the point.
@Test
fun `lexical ordering of strings with surrogate pairs`() {
    // The code point of '￥' is U+FFE5 (UTF-16 code unit: 0xFFE5).
    // The code point of '💩' is U+1F4A9 (UTF-16 surrogate pair: 0xD83D 0xDCA9).
    // Therefore, '￥' should be ordered first by PartiQL.
    // However, PartiQL currently falls back on the JVM to compare strings. The JVM lexicographically
    // compares by UTF-16 code unit instead of by full code point, so a character that requires a
    // surrogate pair sorts before any BMP character in the range U+E000..U+FFFF.
    // Therefore this test fails.
    assertTrue(
        DEFAULT_COMPARATOR.compare(
            ExprValue.newString("￥"),
            ExprValue.newString("💩")
        ) < 0,
        "'￥' should come before '💩'"
    )
}
Expected Behavior
The test in the repro case should pass.
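For illustration only, here is a minimal sketch of what comparing by code point rather than by UTF-16 code unit could look like. `compareByCodePoint` is a hypothetical helper, not an existing PartiQL function, and nothing here is meant to suggest how the fix should actually be wired into the comparator. It uses U+FFE5 as an example of a BMP character that lies above the surrogate range.

```kotlin
// Hypothetical helper (not part of PartiQL): orders two strings by Unicode code point
// instead of by UTF-16 code unit.
fun compareByCodePoint(a: String, b: String): Int {
    var i = 0
    var j = 0
    while (i < a.length && j < b.length) {
        // codePointAt combines a high/low surrogate pair into a single code point.
        val cpA = a.codePointAt(i)
        val cpB = b.codePointAt(j)
        if (cpA != cpB) return cpA.compareTo(cpB)
        // Advance by 1 char for BMP code points, 2 for supplementary ones.
        i += Character.charCount(cpA)
        j += Character.charCount(cpB)
    }
    // One string is a prefix of the other (or they are equal); the shorter one sorts first.
    return (a.length - i).compareTo(b.length - j)
}

fun main() {
    val yen = "\uFFE5"        // U+FFE5, a single UTF-16 code unit
    val poo = "\uD83D\uDCA9"  // U+1F4A9, encoded as a surrogate pair

    // JVM String.compareTo compares code units: 0xFFE5 > 0xD83D, so U+FFE5 sorts last.
    println(yen.compareTo(poo) > 0)

    // Comparing code points: 0xFFE5 < 0x1F4A9, so U+FFE5 sorts first.
    println(compareByCodePoint(yen, poo) < 0)
}
```

Both lines print `true`, which shows the two orderings disagreeing on the same pair of strings. The two orderings can only differ when a supplementary character is compared against a BMP character in U+E000..U+FFFF, so existing results for other strings would be unaffected by such a change.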
Additional Context
I can't think of anything else to add.