String endianness is incorrect: big-endian strings cannot be read on little-endian machines #5595

Open
BloCamLimb opened this issue Feb 26, 2024 · 0 comments

According to the specification:

The character set is Unicode in the UTF-8 encoding scheme. The UTF-8 octets (8-bit bytes) are packed four per word, following the little-endian convention (i.e., the first octet is in the lowest-order 8 bits of the word).
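As a minimal sketch of the packing rule quoted above (not SPIRV-Tools code; `pack_string` is a hypothetical helper introduced only for illustration), each octet `i` goes into word `i / 4` at bit offset `8 * (i % 4)`. The values of the resulting words do not depend on the machine; only the byte order used when writing them to a file does.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: packs a UTF-8 string into 32-bit words so that the
// first octet of each group of four lands in the lowest-order 8 bits.
// One extra slot is always reserved for the terminating NUL, so a string
// whose length is a multiple of 4 ends with an all-zero word, matching the
// 16-byte, 4-word layout of "GLSL.std.450" described in this issue.
std::vector<uint32_t> pack_string(const std::string& s) {
  std::vector<uint32_t> words(s.size() / 4 + 1, 0u);
  for (std::size_t i = 0; i < s.size(); ++i) {
    const uint32_t octet = static_cast<unsigned char>(s[i]);
    words[i / 4] |= octet << (8 * (i % 4));
  }
  return words;
}
```

For "GLSL.std.450" this yields four words, the first being 0x4C534C47 ('G' = 0x47 in the lowest-order byte). A big-endian writer stores that word as the bytes 4C 53 4C 47, i.e. 'L' 'S' 'L' 'G'.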

This means that the bytes of each word must be swapped before reading a big-endian spv file on a little-endian machine, and vice versa. For example, glslang outputs big-endian encoded spv binary files on big-endian machines. But when such a file is disassembled with spirv-dis on a little-endian machine, only words and operands are handled properly via spvFixWord; strings are read in host endianness, which is not correct.

More specifically:
Take "GLSL.std.450" in big-endian encoding (or on big-endian machines): the first octet 'G' must be in the lowest-order byte, which is the fourth byte of a word. So in the file (or in memory), reading from the first byte to the last, the bytes are 'L''S''L''G' 'd''t''s''.' '0''5''4''.' '\0''\0''\0''\0', 16 bytes and 4 words in total.
When reading this big-endian encoded file:
On big-endian machines, reinterpret each consecutive 4 bytes as a uint32_t and use a bit shift to obtain the first octet, like (word >> 0) & 0xFF. This yields the fourth byte, which is 'G', and this is correct.
On little-endian machines, the same bit operation yields the first byte, which is 'L'. This is not correct, and it happens because spvFixWord is not called on the words that hold the string.
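The two cases above can be reproduced with a small sketch. It is not SPIRV-Tools code and does not call spvFixWord; `swap_bytes`, `load_as_little_endian_host`, `decode_string` and `example_big_endian_bytes` are hypothetical names, and the little-endian load is emulated with shifts so the result is the same on any host.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reverses the four bytes of a word (the kind of fix-up a reader has to
// apply when file endianness differs from host endianness).
uint32_t swap_bytes(uint32_t w) {
  return (w >> 24) | ((w >> 8) & 0x0000FF00u) |
         ((w << 8) & 0x00FF0000u) | (w << 24);
}

// Emulates a little-endian host loading four consecutive file bytes as a
// uint32_t: the first file byte becomes the lowest-order byte.
uint32_t load_as_little_endian_host(const unsigned char* p) {
  return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
         (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// Extracts a NUL-terminated string, lowest-order octet first. With
// fix_endianness == false the words are used exactly as loaded.
std::string decode_string(const std::vector<unsigned char>& file_bytes,
                          bool fix_endianness) {
  std::string out;
  for (std::size_t i = 0; i + 4 <= file_bytes.size(); i += 4) {
    uint32_t word = load_as_little_endian_host(&file_bytes[i]);
    if (fix_endianness) word = swap_bytes(word);
    for (int shift = 0; shift < 32; shift += 8) {
      const char c = static_cast<char>((word >> shift) & 0xFFu);
      if (c == '\0') return out;
      out.push_back(c);
    }
  }
  return out;
}

// The 16 file bytes of the big-endian encoded "GLSL.std.450".
std::vector<unsigned char> example_big_endian_bytes() {
  return {'L', 'S', 'L', 'G', 'd', 't', 's', '.',
          '0', '5', '4', '.', 0,   0,   0,   0};
}
```

Here `decode_string(example_big_endian_bytes(), false)` gives "LSLGdts.054.", the same garbled name that appears in the spirv-dis error, while passing `true` gives "GLSL.std.450".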

I'm writing my own compiler in Java, which can output spv binary files in either little-endian or big-endian (the default is host endianness). I encountered this issue when running spirv-dis:

; SPIR-V
; Version: 1.5
; Generator: Khronos; 0
; Bound: 25
; Schema: 0
               OpCapability Shader
error: 2: Invalid extended instruction import 'LSLGdts.054.'

Issue #149 and PR #4622 are related, but they do not fix this issue.

My spirv-dis version: SPIRV-Tools v2023.6 v2023.6.rc1-50-gdc667644
Here is my spv binary file for testing purposes. git describes this file as "Khronos SPIR-V binary, big-endian, version 0x010500, generator 00000000"; my CPU is little-endian.
test_shader.zip
