Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reader.read_uint32 causes reader head misalignment when consuming oversized VARINTs in some cases #1948

Open
SimplyLinn opened this issue Nov 23, 2023 · 0 comments

Comments

@SimplyLinn
Copy link

protobuf.js version: latest master

I'm looking into reading a uint32 value directly from a without instantiating a reader, and I was looking at this official implementation for reference, and this section stumped me a little. I ended up digging through some documentation and playing with the implementation a bit, and I think what it does is that, if the last byte of the uint32 has the "has more data"-bit set, we assume it's a uint64, and consume those bytes as well.

The potential issue I've picked up on is if it's not a full 10-byte VARINT, but, say a 6, 7, 8 or 9 byte VARINT.

The actual numerical data in that case only takes between 6 and 9 bytes, but the reader unconditionally consumes the full 10, leaving the reader position-head in a misaligned position.

import protobuf from 'protobufjs';

function example1() {
  const writer = protobuf.Writer.create();
  writer.uint64((2n ** 53n - 1n).toString());
  writer.uint32(42069);
  writer.uint32(42069);
  const arr = writer.finish();
  console.log(
    '~~Example 1: uint64 write is 8 bytes long~~\nUint8Array Length:',
    arr.length,
  );
  const reader = protobuf.Reader.create(arr);
  try {
    console.log('First uint32 read:', reader.uint32());
  } catch (err) {
    console.error('First uint32 read error:', err.message);
  }
  try {
    console.log('Second uint32 read:', reader.uint32());
  } catch (err) {
    console.log('Second uint32 read error:', err.message);
  }
  try {
    console.log('Third uint32 read:', reader.uint32());
  } catch (err) {
    console.log('Third uint32 read error:', err.message);
  }
}

function example2() {
  const writer = protobuf.Writer.create();
  writer.uint64((2n ** 64n - 1n).toString());
  writer.uint32(42069);
  writer.uint32(42069);
  const arr = writer.finish();
  console.log(
    '\n~~Example 2: uint64 write is 10 bytes long~~\nUint8Array Length',
    arr.length,
  );
  const reader = protobuf.Reader.create(arr);
  try {
    console.log('First uint32 read:', reader.uint32());
  } catch (err) {
    console.error('First uint32 read error:', err.message);
  }
  try {
    console.log('Second uint32 read:', reader.uint32());
  } catch (err) {
    console.log('Second uint32 read error:', err.message);
  }
  try {
    console.log('Third uint32 read:', reader.uint32());
  } catch (err) {
    console.log('Third uint32 read error:', err.message);
  }
}

example1();
example2();
~~Example 1: uint64 write is 8 bytes long~~
Uint8Array Length: 14
First uint32 read: 4294967295
Second uint32 read: 2
Third uint32 read: 42069

~~Example 2: uint64 write is 10 bytes long~~
Uint8Array Length 16
First uint32 read: 4294967295
Second uint32 read: 42069
Third uint32 read: 42069

I created an example to demonstrate the inconsistency, notice how the second uint32 read of example 1 is wildly wrong because parts of it was already naively consumed by the code I highlighted in my description. In example 2, the written uint64 takes up the assumed 10 bytes and the head ends up correctly aligned for the second uint32 read. In this trivial case, the read-head rediscovers the value boundaries and the value of the next read was just slightly wrong, but the misalignment could cause a lot more weirdness should the next data be something else.

The decision of whether the reader should throw an error if the VARINT is oversized, or if a more robust "end of VARINT" check should be performed to find the actual boundary of the value I leave to the maintainers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant