Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spec should be very clear how to handle duplicate keys #259

Open
Gozala opened this issue Dec 19, 2022 · 5 comments
Open

Spec should be very clear how to handle duplicate keys #259

Gozala opened this issue Dec 19, 2022 · 5 comments
Assignees

Comments

@Gozala
Copy link

Gozala commented Dec 19, 2022

We end up having a lengthy discussion about duplicate key exploits in signed data https://github.com/ChainAgnostic/varsig/pull/6/files#r1038482364 where it become clear that current IPLD specification is not very clear about this, in fact case could be made that following section suggests such exploits are possible

Codec implementations **MUST** do the following when encoding data in order to ensure hashes consistently match for the same block data.
- Sort object keys by their (UTF-8) encoded representation, i.e. with byte comparisons
- Strip whitespace
This produces the most compact and consistent representation which will ensure that two codecs
producing the same data end up with matching block hashes.
Codec implementers should not enforce this strictness when decoding data in order to support historical data, and data produced by non-strict encoders. However, they may provide an opt-in for systems where round-trip determinism is a desireable feature and backward compatibility with old, non-strict data is unnecessary.

It was also unclear how individual implementations would behave in the face of blocks with duplicate keys, which again suggests that this subtle detail shouldn't be subtle, but rather should be well articulated and implementations should communicate how they'd behave in such scenario.

@rvagg
Copy link
Member

rvagg commented Jan 2, 2023

I think we should change dag-json and dag-pb to have a **SHOULD** reject blocks with duplicate keys in maps. I know we're going to get some grief for potential slow-downs in decoding.

I believe Go should already be rejecting dupes: https://github.com/ipld/go-ipld-prime/blob/33475f04486e763616962f4dd2bda67b31758e27/node/basicnode/map.go#L249-L253

We probably ought to do an key in obj check and reject with the same thing in JS, so I'd consider this a bug: https://github.com/rvagg/cborg/blob/f05397b8df2d7a063f0d35cc756d74ce5cb2f4f1/lib/decode.js#L117

@Gozala would you care to draft some language for the specs to get this started?

@rvagg
Copy link
Member

rvagg commented Jan 2, 2023

Another case for negative fixtures .. we have no way of asserting this kind of condition across our codecs.

FWIW dag-pb no longer allows this even though it's entirely valid PB and the old codecs used to allow it. As of the latest generation, Go and JS will reject blocks where there are duplicates of anything.

@BigLep
Copy link
Contributor

BigLep commented Jan 4, 2023

2023-01-03 IPLD triage conversation:

  1. We're hoping @Gozala can draft spec language per Spec should be very clear how to handle duplicate keys #259 (comment)
  2. IPLD maintainers will do the JS fix (@rvagg is creating an issue now)
  3. @rvagg will create a tracking issue for negative test fixtures

@rvagg
Copy link
Member

rvagg commented Jan 4, 2023

Some issues as follow-up to get them on the board:

This issue can serve as a placeholder for fixing the spec.

rvagg added a commit to ipld/codec-fixtures that referenced this issue Jan 6, 2023
rvagg added a commit to ipld/codec-fixtures that referenced this issue Jan 6, 2023
rvagg added a commit to ipld/js-dag-cbor that referenced this issue Jan 6, 2023
rvagg added a commit to ipld/js-dag-cbor that referenced this issue Jan 6, 2023
rvagg added a commit to ipld/js-dag-cbor that referenced this issue Jan 6, 2023
rvagg added a commit to ipld/js-dag-cbor that referenced this issue Jan 6, 2023
rvagg added a commit to ipld/js-dag-cbor that referenced this issue Jan 6, 2023
rvagg added a commit to ipld/js-dag-json that referenced this issue Jan 6, 2023
rvagg added a commit to ipld/js-dag-json that referenced this issue Jan 6, 2023
rvagg added a commit to ipld/codec-fixtures that referenced this issue Jan 6, 2023
rvagg added a commit to ipld/codec-fixtures that referenced this issue Jan 6, 2023
@DavidBuchanan314
Copy link
Contributor

By the way, you can detect duplicate map keys during parsing for almost-free in most cases, by keeping track of how many keys you've added, and then checking if the size of the map matches what you expect once you reach the end of it - as opposed to doing a key in obj check (or similar) before each new addition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🥞 Todo
Development

No branches or pull requests

4 participants