Reader.read_uint32 causes reader head misalignment when consuming oversized VARINTs in some cases #1948

SimplyLinn · 2023-11-23T21:59:28Z

protobuf.js version: latest master

I'm looking into reading a uint32 value directly from a without instantiating a reader, and I was looking at this official implementation for reference, and this section stumped me a little. I ended up digging through some documentation and playing with the implementation a bit, and I think what it does is that, if the last byte of the uint32 has the "has more data"-bit set, we assume it's a uint64, and consume those bytes as well.

The potential issue I've picked up on is if it's not a full 10-byte VARINT, but, say a 6, 7, 8 or 9 byte VARINT.

The actual numerical data in that case only takes between 6 and 9 bytes, but the reader unconditionally consumes the full 10, leaving the reader position-head in a misaligned position.

import protobuf from 'protobufjs';

function example1() {
  const writer = protobuf.Writer.create();
  writer.uint64((2n ** 53n - 1n).toString());
  writer.uint32(42069);
  writer.uint32(42069);
  const arr = writer.finish();
  console.log(
    '~~Example 1: uint64 write is 8 bytes long~~\nUint8Array Length:',
    arr.length,
  );
  const reader = protobuf.Reader.create(arr);
  try {
    console.log('First uint32 read:', reader.uint32());
  } catch (err) {
    console.error('First uint32 read error:', err.message);
  }
  try {
    console.log('Second uint32 read:', reader.uint32());
  } catch (err) {
    console.log('Second uint32 read error:', err.message);
  }
  try {
    console.log('Third uint32 read:', reader.uint32());
  } catch (err) {
    console.log('Third uint32 read error:', err.message);
  }
}

function example2() {
  const writer = protobuf.Writer.create();
  writer.uint64((2n ** 64n - 1n).toString());
  writer.uint32(42069);
  writer.uint32(42069);
  const arr = writer.finish();
  console.log(
    '\n~~Example 2: uint64 write is 10 bytes long~~\nUint8Array Length',
    arr.length,
  );
  const reader = protobuf.Reader.create(arr);
  try {
    console.log('First uint32 read:', reader.uint32());
  } catch (err) {
    console.error('First uint32 read error:', err.message);
  }
  try {
    console.log('Second uint32 read:', reader.uint32());
  } catch (err) {
    console.log('Second uint32 read error:', err.message);
  }
  try {
    console.log('Third uint32 read:', reader.uint32());
  } catch (err) {
    console.log('Third uint32 read error:', err.message);
  }
}

example1();
example2();

~~Example 1: uint64 write is 8 bytes long~~
Uint8Array Length: 14
First uint32 read: 4294967295
Second uint32 read: 2
Third uint32 read: 42069

~~Example 2: uint64 write is 10 bytes long~~
Uint8Array Length 16
First uint32 read: 4294967295
Second uint32 read: 42069
Third uint32 read: 42069

I created an example to demonstrate the inconsistency, notice how the second uint32 read of example 1 is wildly wrong because parts of it was already naively consumed by the code I highlighted in my description. In example 2, the written uint64 takes up the assumed 10 bytes and the head ends up correctly aligned for the second uint32 read. In this trivial case, the read-head rediscovers the value boundaries and the value of the next read was just slightly wrong, but the misalignment could cause a lot more weirdness should the next data be something else.

The decision of whether the reader should throw an error if the VARINT is oversized, or if a more robust "end of VARINT" check should be performed to find the actual boundary of the value I leave to the maintainers.

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reader.read_uint32 causes reader head misalignment when consuming oversized VARINTs in some cases #1948

Reader.read_uint32 causes reader head misalignment when consuming oversized VARINTs in some cases #1948

SimplyLinn commented Nov 23, 2023

Reader.read_uint32 causes reader head misalignment when consuming oversized VARINTs in some cases #1948

Reader.read_uint32 causes reader head misalignment when consuming oversized VARINTs in some cases #1948

Comments

SimplyLinn commented Nov 23, 2023