use fully qualified / unified names in svg and png paths #422

BrianHung · 2020-07-05T05:41:12Z

This PR attempts to fix the issues mentioned in #405 and #419 by using Unicode fully qualified / unified names in the paths for SVGs and PNGs.

To get the fully qualified name for each emoji, I used the emoji.json provided by https://github.com/iamcal/emoji-data and the following Node script:

const fs = require('fs');
let emojiList = JSON.parse(fs.readFileSync("./emoji.json"))

// Flatten skin variations and append them as separate emojis to emojiList.
emojiList.filter(e => e.skin_variations)
  .forEach(e => emojiList = emojiList.concat(Object.values(e.skin_variations)))

function unifiedToNative(unified) {
  const codePoints = unified.split('-').map(u => `0x${u}`);
  return String.fromCodePoint.apply(String, codePoints);
}

// Convert unified code points to the native representation.
emojiList.forEach(e => e.native = unifiedToNative(e.unified))

// Parse each native representation into a twemoji entity.
const { parse } = require('twemoji-parser');
emojiList.forEach(e => e.entity = parse(e.native)[0])

function getTwemojiUnicode(url) {
  return url.match(/([^\/]+)(?=\.\w+$)/)[0]
}

// Get the twemoji unicode representation from entity url.
emojiList.forEach(e => e.twemojiUnicode = getTwemojiUnicode(e.entity.url))

// Collect the emojis whose twemoji file name differs from the lowercased unified name.
let diff = emojiList.filter(e => e.twemojiUnicode !== e.unified.toLowerCase())
  .filter(d => d.twemojiUnicode !== "1f441") // BUG: see https://github.com/twitter/twemoji/issues/419

diff.forEach(e => { 
  fs.renameSync(`./assets/72x72/${e.twemojiUnicode}.png`, `./assets/72x72/${e.unified.toLowerCase()}.png`); 
  fs.renameSync(`./assets/svg/${e.twemojiUnicode}.svg`, `./assets/svg/${e.unified.toLowerCase()}.svg`); 
})

// To-do: manually handle 1f441.

The only exception to this was the eye emoji mentioned in #405, because both 👁️ and 👁️‍🗨️ resolve to "1f441" with the twemoji-parser. For the eye emoji, I had to manually rename two files.
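A minimal sketch of how such collisions could be detected before renaming, rather than hard-coding "1f441". It assumes entries shaped like the script's emojiList items, with `unified` and `twemojiUnicode` fields. The `findCollisions` helper and the sample data are hypothetical and are not part of the PR.

```javascript
// Hypothetical helper: group entries by the file name the parser derives and
// return only the names that more than one unified sequence maps to.
function findCollisions(entries) {
  const byName = new Map();
  for (const { unified, twemojiUnicode } of entries) {
    const list = byName.get(twemojiUnicode) || [];
    list.push(unified);
    byName.set(twemojiUnicode, list);
  }
  return [...byName].filter(([, unifieds]) => unifieds.length > 1);
}

// Illustrative sample (assumed values), mirroring the eye case described above.
const sample = [
  { unified: '1F441-FE0F', twemojiUnicode: '1f441' },
  { unified: '1F441-FE0F-200D-1F5E8-FE0F', twemojiUnicode: '1f441' },
  { unified: '1F600', twemojiUnicode: '1f600' },
];

console.log(findCollisions(sample));
```

Any name reported here would need the same manual handling as the eye emoji, since a single source file cannot be renamed to two targets.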

CLAassistant · 2020-07-05T05:41:18Z

All committers have signed the CLA.

jdecked · 2020-10-13T21:52:33Z

Thanks for giving it a shot! Twemoji and twemoji-parser are intended to be interoperable as part of how we use them at Twitter, so we're working on a more complete solution internally to #405, hopefully by the end of the year. Since this breaks interoperability and would cause a pretty substantial divergence in our internal vs open sourced version of this package, I'm leaving it open for now.

JoshyPHP · 2020-11-15T01:18:25Z

In my opinion, it should be the other way around: instead of using a fully qualified sequence, remove all modifiers and variant selectors. For instance, U+FE0F (VS-16) exists to indicate that a character should be rendered as a colourful image rather than monochrome text. Since those files are already images, it's not needed. Same for U+200D (ZWJ) which is used to join several characters as one. It's already a single file so the joiner isn't meaningful.

In addition to being shorter, it's more robust against possible changes in future Unicode versions if some sequences are retooled to make some of those characters optional.
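As a rough sketch of that alternative, assuming hyphen-separated hex code points like the `unified` field in emoji.json: the helper name `stripSelectorsAndJoiners` is hypothetical, and whether the stripped names stay unique across the full emoji set is not checked here.

```javascript
// Hypothetical helper: drop VS-16 (U+FE0F) and ZWJ (U+200D) from a
// hyphen-separated code point sequence and lowercase the result.
function stripSelectorsAndJoiners(unified) {
  return unified
    .toLowerCase()
    .split('-')
    .filter(cp => cp !== 'fe0f' && cp !== '200d')
    .join('-');
}

console.log(stripSelectorsAndJoiners('1F441-FE0F'));                 // 1f441
console.log(stripSelectorsAndJoiners('1F441-FE0F-200D-1F5E8-FE0F')); // 1f441-1f5e8
```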

use fully qualified / unified names in svg png paths

c564081

BrianHung added 2 commits November 9, 2020 19:10

Merge branch 'master' of https://github.com/twitter/twemoji

bdb9710

🚧 run rename script on new emojis

d79306b

novacrazy mentioned this pull request Aug 28, 2022

Normalization and Fully-Qualified Names Lantern-chat/emoji#1

Open
