Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport adraffy/ens-normalize@1.10.1 to ethers@5 #4577

Open
wants to merge 6 commits into
base: v5.7
Choose a base branch
from

Conversation

adraffy
Copy link

@adraffy adraffy commented Feb 4, 2024

  • Replaced ens-normalize/ directory with almost 1:1 port w/types
    • replaced Set with {[x]: x} and membership with set[x] or x in set
    • replaced Map with {}
    • replaced class Emoji with extra wrapper
  • validate.ts runs ens_normalize() through tests.json → no errors

  • Simplified ens-normalize()-related calls in namehash.ts, functionality unchanged
  • Added max argument to dnsEncode()

@ricmoo ricmoo added enhancement New feature or improvement. on-deck This Enhancement or Bug is currently being worked on. minor-bump Planned for the next minor version bump. v5 Issues regarding legacy-v5 next-patch Issues scheduled for the next arch release. labels Feb 28, 2024

// given a list of codepoints
// returns a list of lists, where emoji are a fully-qualified (as Array subclass)
// eg. explode_cp("abc💩d") => [[61, 62, 63], Emoji[1F4A9, FE0F], [64]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No non-ASCII7 characters are allowed in the source code (applies to v6 as well as v5) as they can cause issues in some editors, over various transports, etc. It also helps prevent "trojan" code.

@ricmoo
Copy link
Member

ricmoo commented Mar 19, 2024

A few quick notes I've noticed in a few places:

  • all imports that reference local files must include the extension, otherwise lots of things break; e.g. "import foo from "./foo.js" instead of "import foo from "./foo".
  • I avoid non-ASCII7 characters in source as this eliminates issues in many console-based editors and terminal emulators, as well as prevents any concerns regarding trojan code exploits, including in comments and jsdocs

There are a lot of changes that spook me every time I read over this PR. Is it possible to not include the nf component for this v5 change? Is there a way to mostly just update the base64-encoded data and add an extra function or so where necessary? Just to minimize the changes?

I also need to analyze the size with this change a bit more... Do you have any info on the size difference in the /packages/ethers/dist/ files?

@adraffy
Copy link
Author

adraffy commented Mar 24, 2024

  • all imports that reference local files must include the extension, otherwise lots of things break; e.g. "import foo from "./foo.js" instead of "import foo from "./foo".

I can make this change and remove the emoji from the comments, I was just trying to satisfy whatever TSC was saying.

I 100% agree that the extension-less includes are insane. I have a hard time reasoning about TSC builds as everyone does something different.


There were significant changes from your early port of normalize to the final, eg. script (group) and confusable logic.

In general, I can't do much about the file size, as it's the only the solution to:
"you have a JS engine with unknown versions of Unicode, what do?"

With async import, which I assume is not allowed in v5, I could dynamically import significantly smaller norm libraries using runtime capability checks, but I'm unsure how to package code like that, when this is just a small library in a framework in someone else's project.

You don't need the NF part but considering this is a legacy build that's going to deployed on aging platforms, it's literally the use-case for a system-independent version of Unicode NF. The issue is that the % of names impacted by NF divergences is small. My goal was to make it so ens_normalize() works without any outside feature assumptions.

I can replace this with the system shim: https://github.com/adraffy/ens-normalize.js/blob/main/src/nf-native.js


The other changes just involve removing duplicated logic.

  • Names that are normalized can be split on "."
  • Names that are normalized round-trip picky UTF-8 coders (eg. Text{En|De}coder)
  • dnsEncode() should use 255 char limit as discussed previously

The rest of the code is exactly the same as my currently library except the arguments are typed to satisfy TSC.

@adraffy
Copy link
Author

adraffy commented Mar 24, 2024

  • files should be ASCII7
  • fixed mangled decoder.ts --- I'm still unclear how my PR got screwed up
  • changed extensions to explicit
  • removed validate/tests.json
  • changed nf to system

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or improvement. minor-bump Planned for the next minor version bump. next-patch Issues scheduled for the next arch release. on-deck This Enhancement or Bug is currently being worked on. v5 Issues regarding legacy-v5
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants