Misinterpreted @missing line in pre-15 emoji-data #739

Open
eggrobin opened this issue Mar 14, 2024 · 0 comments

@macchiati pointed out to me that https://util.unicode.org/UnicodeJsps/character.jsp?a=0020&history=full&showDevProperties=1 shows the following history for the Emoji property of U+0020:

Emoji 8.0..14.0: Yes 15.0..16.0α: No

and that the space was not, in fact, an Emoji between 2015 and 2021.

I think the issue is that the earlier versions of emoji-data have an @missing line which does not follow the format of the file:

# @missing: 0000..10FFFF  ; Emoji ; No

0023          ; Emoji                # E0.0   [1] (#️)       hash sign
002A          ; Emoji                # E0.0   [1] (*️)       asterisk
0030..0039    ; Emoji                # E0.0  [10] (0️..9️)    digit zero..digit nine

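The parser interprets that as # @missing: 0000..10FFFF ; Emoji, and throws the No into the U+1F5D1 🗑️ WASTEBASKET, which is why code points that are not otherwise listed, such as U+0020, end up with Emoji=Yes.
This needs some special handling for the three-field form in PropertyParsingInfo.java.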
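
A minimal sketch of what such special handling might look like, assuming the three-field shape (range ; property ; value) is detected before the regular two-field path runs. This is not the actual unicodetools code: the class MissingLineSketch, the holder MissingDefault, and the method parseThreeField are all hypothetical names invented for illustration.

```java
import java.util.Optional;

// Hypothetical sketch; none of these names exist in PropertyParsingInfo.java.
class MissingLineSketch {
    /** Hypothetical holder for a parsed three-field @missing line. */
    static final class MissingDefault {
        final int start;
        final int end;
        final String property;
        final String value;

        MissingDefault(int start, int end, String property, String value) {
            this.start = start;
            this.end = end;
            this.property = property;
            this.value = value;
        }
    }

    private static final String PREFIX = "# @missing:";

    /**
     * Recognizes only the "range ; property ; value" form seen in the older
     * emoji-data files. Every other line yields Optional.empty(), so that the
     * caller can fall back to its regular parsing.
     */
    static Optional<MissingDefault> parseThreeField(String line) {
        if (!line.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String rest = line.substring(PREFIX.length());
        // Drop a trailing comment, if there is one.
        int comment = rest.indexOf('#');
        if (comment >= 0) {
            rest = rest.substring(0, comment);
        }
        String[] fields = rest.trim().split("\\s*;\\s*");
        if (fields.length != 3) {
            return Optional.empty();
        }
        // The range is either "XXXX..YYYY" or a single code point "XXXX".
        String[] range = fields[0].split("\\.\\.");
        int start = Integer.parseInt(range[0], 16);
        int end = Integer.parseInt(range[range.length - 1], 16);
        // Keep the explicit default value instead of discarding it.
        return Optional.of(new MissingDefault(start, end, fields[1], fields[2]));
    }
}
```

With this sketch, the line quoted above parses to the range 0000..10FFFF, property Emoji, and value No, while ordinary data lines and two-field @missing lines return an empty Optional and go through whatever the existing path does.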

eggrobin self-assigned this Mar 14, 2024