New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compile / decompile emojis #7861
base: master
Are you sure you want to change the base?
Conversation
…ster :(" This reverts commit 596e90c.
@@ -15,12 +15,18 @@ | |||
<value name="VALUE"> | |||
<shadow type="math_number"><field name="NUM">0</field></shadow> | |||
<block type="spritescreate"> | |||
<field name="img">img&#96;&#10;0&#10;&#10;&#10;1&#10;2&#10;3&#96;</field> | |||
<field name="img">img` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't look right... shouldn't newlines get escaped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that they shouldn't need to be escaped in xml like that re: https://www.w3.org/TR/xml-c14n/#Example-WhitespaceInContent
I believe this example would break on decompiling the image if it weren't being read correctly, right? As the https://arcade.makecode.com/app/0610dda95754bed7e6df6ce7485f7d128dd0f4ea-1375337f35#pub:_Rd0UJmiP0RWq
can always add a few more values to be escaped if we're worried about it, would probably be better to just use https://www.npmjs.com/package/he if we want the full solution though as the current escaping is broken -- e.g. it's causing crashes due to malformed encodings here microsoft/pxt-arcade#2774
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iiirc we had bugs around that before. can't remember if it was us or blockly
i wonder if we could do a mapping to the unicode names instead? Might look a bit nicer (but would definitely be more complicated). |
For using the unicode names, I believe the values should be the same in general -- that is, it's currently encoding |
@jwunderl I meant the name of the emoji. e.g. "grinningFace" and "personInSuitLevitating" |
…CompileDecompile
…CompileDecompile
Okay finally brought this up to date, I'm gonna try and get this in for next arcade release :) here's a test build, everything appears to still be working but I'll poke around a bit more if we wanted to use the english name for the emoji we'd need to do something like fetch it from https://github.com/unicode-org/cldr which is ... a bit much. We could also try base-36'ing it, which benefit to that is that it appears to be pretty consistently 3-4 chars per id instead of 4-6 then~ |
Is this PR still valid? |
I like using emojis for examples sometimes but it gets bad when you compile (just some combination of
_
and numbers), so this would let allow emojis to persist between compile -> decompile:Obviously the generated names aren't great (they're just "E" + emoji's codepoint in base 16 + "X", maybe could use $ to surround instead as that's also valid / uncommonly used anyways) but I don't know it's any worse than just
_
/_2
/ etc; starting as a draft though as I don't know if we'd want to take it necessarily for this reasontest build with a sample project: https://arcade.makecode.com/app/0610dda95754bed7e6df6ce7485f7d128dd0f4ea-1375337f35#pub:_XLqH792iMbks
Most of the diff is from https://github.com/microsoft/pxt/pull/7861/files#diff-b3b92c2632ab9b3508e5ea8cd2f72f4d5829fbfb8ef35ab61303cc85d61b93aeR30, which I needed to change here as it produces invalid encodings for most emojis (unicode 'astral plane' codepoints) / breaks anything above
0xffff
; as far as I can tell html / xml with utf8 encoding should only ever need to encode<>"'&
special anyway though so this seems fine / better anyways? e.g. https://www.w3.org/International/questions/qa-escapes#use & https://hsivonen.fi/producing-xml/#noescapingI messed up switching between branches after my local version of the branch broke and accidentally pulled when I should have checkout'd, so I accidentally merged this into master / reverted it already D:. The commits from when I wrote it are in 8e3181e...bb4281a, and I reverted that in 596e90c, so this pr would... revert that.
edit added support for debugger view as well, and made that show unescaped names (e.g. a variable
text list
currently shows astext_list
in arcade/live, but with this shows without the space escaped):also fixes microsoft/pxt-arcade#2774