Right bit-shift on signed integer #12764

mhs4670go · 2024-03-18T09:57:15Z

How are packed int4 (s4) values converted to int16 (s16)?

Let's assume two s4 values (AXXX, BYYY) are packed into one s8 value as below (A and B are the sign bits of each s4 value).

AXXX BYYY

LSB: << 4 then >> 4, giving BBBB BYYY
MSB: >> 4, giving AAAA AXXX

Originally posted by @jinevening in #12743 (comment)
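The scheme quoted above can be sketched as follows. This is only an illustration: `S4Pair` and `UnpackS4PairWithShifts` are hypothetical names, and the result depends on `>>` being an arithmetic shift for negative values, which is guaranteed since C++20 and implementation-defined in earlier standards.

```cpp
#include <cstdint>

// Hypothetical names, used only for this illustration.
struct S4Pair {
  int16_t lsb;  // value taken from the low nibble (BYYY)
  int16_t msb;  // value taken from the high nibble (AXXX)
};

// Unpacks two s4 values from one s8 byte using the shift scheme above.
// Assumption: >> on a negative value is an arithmetic shift and narrowing to
// int8_t wraps modulo 2^8 (guaranteed since C++20, implementation-defined before).
constexpr S4Pair UnpackS4PairWithShifts(int8_t packed) {
  // LSB: << 4 moves BYYY into the top nibble. The left shift is done on the
  // unsigned bit pattern so that no negative value is shifted left.
  const int8_t low_on_top =
      static_cast<int8_t>(static_cast<uint8_t>(packed) << 4);
  // >> 4 then copies the sign bit B into the upper bits: BBBB BYYY.
  const int16_t lsb = static_cast<int16_t>(low_on_top >> 4);
  // MSB: >> 4 alone copies the sign bit A into the upper bits: AAAA AXXX.
  const int16_t msb = static_cast<int16_t>(packed >> 4);
  return {lsb, msb};
}
```

For the byte 0xA7 (1010 0111), a compiler with arithmetic right shift gives msb = -6 and lsb = 7.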

I'm not sure whether this is safe. Maybe we can check it in more detail.

https://stackoverflow.com/questions/7522346/right-shift-and-signed-integer

INT34-C. Do not shift an expression by a negative number of bits or by greater than or equal to the number of bits that exist in the operand

chunseoklee · 2024-03-19T01:33:45Z

@mhs4670go Is your concern that it (sign extension) is platform/implementation dependent?

mhs4670go · 2024-03-19T01:45:40Z

> @mhs4670go Is your concern that it (sign extension) is platform/implementation dependent?

Right. I think most platforms are safe, but technically speaking it does not seem to be guaranteed.

chunseoklee · 2024-03-19T02:06:07Z

Anyway, both ggml and tflite use the shift you mentioned for pack and unpack :)

hseok-oh · 2024-03-19T10:08:15Z

> Anyway, both ggml and tflite use the shift you mentioned for pack and unpack :)

ggml uses an unsigned 8-bit value to hold two unsigned 4-bit values, so it is not a problem there.

TFLite uses a signed 8-bit value to hold two signed 4-bit values. As @chunseoklee commented, it uses the shift operation:

void UnpackDenseInt4IntoInt8(const int8_t* src_buffer, int num_elements,
                             int8_t* dst_buffer) {
  for (int i = 0; i < num_elements / 2; i++) {
    int8_t byte = src_buffer[i];
    // Shift left first so that sign is properly extended when shifted right
    int8_t lower = static_cast<int8_t>(byte << 4) >> 4;
    int8_t higher = byte >> 4;
    dst_buffer[2 * i] = lower;
    dst_buffer[2 * i + 1] = higher;
  }

  // If the buffer size is odd, extract the final lower nibble.
  if (num_elements % 2 != 0) {
    dst_buffer[num_elements - 1] =
        static_cast<int8_t>(src_buffer[num_elements / 2] << 4) >> 4;
  }
}

I feel it is not safe...

SlavikMIPT · 2024-04-02T11:54:15Z

> I feel it is not safe...

Agreed. As far as I know the standard, it is not UB but implementation-defined, so we can get unwanted 1s or 0s in the higher bits. Of course we can add wrappers like byte & 0x0F to be sure that the upper bits are zeros, but I think it is simpler to use uint8_t from the start to minimize future errors.
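A minimal sketch of the uint8_t approach suggested here, assuming the packed buffer is read as unsigned bytes. `Int4Pair`, `SignExtendInt4`, and `UnpackInt4Pair` are hypothetical names and not part of TFLite or this repository. The sign extension is done with arithmetic on non-negative values, so no signed shift is involved.

```cpp
#include <cstdint>

// Hypothetical names, used only for this illustration.
struct Int4Pair {
  int8_t lower;   // value from bits 0..3
  int8_t higher;  // value from bits 4..7
};

// Sign-extends a 4-bit two's-complement value stored in the low nibble.
constexpr int8_t SignExtendInt4(uint8_t nibble) {
  // XOR with 0x08 flips the 4-bit sign bit; subtracting 0x08 then maps the
  // masked range 0..15 onto -8..7, which always fits in int8_t.
  return static_cast<int8_t>(((nibble & 0x0F) ^ 0x08) - 0x08);
}

// Unpacks two signed 4-bit values from one unsigned byte.
constexpr Int4Pair UnpackInt4Pair(uint8_t packed) {
  return {SignExtendInt4(static_cast<uint8_t>(packed & 0x0F)),
          SignExtendInt4(static_cast<uint8_t>(packed >> 4))};
}
```

With this shape, `UnpackInt4Pair(0xA7)` gives lower = 7 and higher = -6 regardless of how the compiler implements right shifts on negative values.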
