You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
As of recent commits, qemu_arc_em is failing in the tests/kernel/mem_protect/syscalls test with:
START - test_syscall_torture
Running syscall torture test with 4 threads on 1 cpu(s)
E: ***** Exception vector: 0x2, cause code: 0x1, parameter 0x0
E: Address 0x800014a4
E: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
E: Current thread: 0x804009f0 (unknown)
E: Halting system
The proximate cause was (hilariously) that the patch count since the last release candidate had reached 100. This caused the version string printed by the boot banner to be one byte longer and exposed the bug. (I actually got a test rig created where two Zephyr binaries that differed ONLY in whether the last byte of a fake banner string was a "x" or a newline would differ in crash behavior).
But as it turns out that's all just timing interaction. The real problem happens in the heap code, and is a compiler bug. @ruuddw pointed out that the fault (at 0x800014a4) is actually flagging an illegal instruction in a branch delay slot. And indeed, the disassembly below shows that the faulting instruction is a ENTER_S (one of the forbidden instructions), and that the instruction immediately preceding is indeed an unconditional branch!
This is a clear optimizer bug. The generated code can't be allowed to emit a linker section that ends on an instruction with a branch delay slot, because it can't control the instruction the linker will place next. And indeed, stuffing a NOP instruction at the end of the function fixes the symptom
Note that as pointed out be @abrodkin the branch instruction at 0x800014a0 is in fact the non-delay-slot variant that does not execute the next instruction on taken branches (the other variant would disassemble as "b.d").
And if that's the case, then this bug is actually in qemu and not gcc. Though I guess that's still against the toolchain so it should stay open.
[1] Apparently ARC has... both? Not sure how that provides value since the whole idea behind delay slots is that they mean you don't need interlocks in the pipeline. Now ARC needs the pipeline handling and has extra instructions to decode?
[1] best of both worlds: performance and code density in case the cycle for the branch cannot be used with a branch delay slot instruction. Saves a nop instruction.
(Cut/pasted from zephyrproject-rtos/zephyr#54720)
As of recent commits, qemu_arc_em is failing in the tests/kernel/mem_protect/syscalls test with:
The proximate cause was (hilariously) that the patch count since the last release candidate had reached 100. This caused the version string printed by the boot banner to be one byte longer and exposed the bug. (I actually got a test rig created where two Zephyr binaries that differed ONLY in whether the last byte of a fake banner string was a "x" or a newline would differ in crash behavior).
But as it turns out that's all just timing interaction. The real problem happens in the heap code, and is a compiler bug. @ruuddw pointed out that the fault (at 0x800014a4) is actually flagging an illegal instruction in a branch delay slot. And indeed, the disassembly below shows that the faulting instruction is a ENTER_S (one of the forbidden instructions), and that the instruction immediately preceding is indeed an unconditional branch!
This is a clear optimizer bug. The generated code can't be allowed to emit a linker section that ends on an instruction with a branch delay slot, because it can't control the instruction the linker will place next. And indeed, stuffing a NOP instruction at the end of the function fixes the symptom
The text was updated successfully, but these errors were encountered: