WAMR throw OOB exception in LLVM-JIT mode while not in Fast-JIT mode #3343

Open
hungryzzz opened this issue Apr 22, 2024 · 3 comments

hungryzzz commented Apr 22, 2024

Subject of the issue

I ran the following wasm code in WAMR and got "Exception: out of bounds memory access" in LLVM-JIT mode, but it runs successfully in Fast-JIT mode and AOT mode.

Test case

(module
  (type (;0;) (func))
  (type (;1;) (func (param i32)))
  (type (;2;) (func (param i32 i32 i32 i32) (result i32)))
  (import "wasi_snapshot_preview1" "proc_exit" (func (;0;) (type 1)))
  (import "wasi_snapshot_preview1" "fd_write" (func (;1;) (type 2)))
  (func (;2;) (type 0)
    i32.const 0
    i32.const 255
    i32.store8
    f64.const nan (;=nan;)
    i32.const 0
    f64.load
    f64.const 0x0p+0 (;=0;)
    f64.mul
    f64.mul
    global.set 0
    i32.const 0
    global.get 0
    f64.store
    i32.const 27
    global.get 0
    f64.store)
  (func (;3;) (type 0)
    call 2
    call 2
    i32.const 0
    i32.const 16
    i32.const 2
    i32.const 0
    call 1
    drop
    i32.const 0
    call 0
    unreachable)
  (memory (;0;) 8192 8192)
  (global (;0;) (mut f64) (f64.const 0x0p+0 (;=0;)))
  (export "memory" (memory 0))
  (export "_start" (func 3)))

Your environment

  • Host OS: Linux ringzzz-OptiPlex-7070 5.15.0-97-generic
  • WAMR version: 7bdea3c
  • cpu architecture: Intel(R) Core(TM) i5-9500T

Expected & Actual behavior

[Screenshot 2024-04-22 22:39:51, image not available]

Extra info

I found that after replacing f64.const nan with f64.const 0 in function 2, LLVM-JIT mode produces the correct result, so I suspected the bug was related to NaN. However, if function 2 is called only once (i.e., one of the two call 2 instructions in function 3 is deleted), the bug also disappears, which suggests that NaN alone does not explain it.
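To illustrate why the second call matters, here is a minimal sketch that models only the first bytes of linear memory across repeated calls of function 2. It is not WAMR code: the helper names are hypothetical, and it assumes that the first call's multiplication chain stores the canonical quiet NaN (0x7ff8000000000000). Under that assumption, the store8 of 255 in the second call overwrites the low byte of the NaN left at address 0, so the following f64.load reads a NaN with a non-zero payload.

```python
import struct

# Assumption: the value function 2 stores on each call is the canonical quiet NaN.
CANONICAL_NAN = 0x7FF8000000000000


def load_u64(mem, addr):
    """Read 8 little-endian bytes, i.e. the raw bits seen by f64.load."""
    return struct.unpack_from("<Q", mem, addr)[0]


def operand_seen_by_call(n_calls):
    """Raw bits returned by f64.load at address 0 during call number n_calls."""
    mem = bytearray(64)  # only the first few bytes of linear memory matter here
    loaded = None
    for _ in range(n_calls):
        mem[0] = 255                                    # i32.store8 at address 0
        loaded = load_u64(mem, 0)                       # f64.load at address 0
        struct.pack_into("<Q", mem, 0, CANONICAL_NAN)   # f64.store at address 0
        struct.pack_into("<Q", mem, 27, CANONICAL_NAN)  # f64.store at address 27
    return loaded


print(hex(operand_seen_by_call(1)))  # 0xff (a subnormal number, not a NaN)
print(hex(operand_seen_by_call(2)))  # 0x7ff80000000000ff (a NaN with payload 0xff)
```

With a single call the loaded value is an ordinary subnormal, so no NaN payload is involved; only from the second call onward does a NaN with payload 0xff enter the multiplication.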

@hungryzzz changed the title from "WAMR throw OOB exception in LLVM-JIT mode while doesn't in Fast-JIT mode" to "WAMR throw OOB exception in LLVM-JIT mode while not in Fast-JIT mode" on Apr 22, 2024
@wenyongh
Contributor

Hi, thanks for reporting the issue. I did some experiments: it is caused by the metadata passed to the LLVM fmul intrinsic. If I change it from "fpexcept.strict" to "fpexcept.ignore", the result of llvm-jit/aot is the same as the result of fast-jit:

diff --git a/core/iwasm/compilation/aot_llvm.c b/core/iwasm/compilation/aot_llvm.c
index 3af56e8b..ead8a203 100644
--- a/core/iwasm/compilation/aot_llvm.c
+++ b/core/iwasm/compilation/aot_llvm.c
@@ -2504,7 +2504,7 @@ aot_create_comp_context(const AOTCompData *comp_data, aot_comp_option_t option)
     char *cpu = NULL, *features, buf[128];
     char *triple_norm_new = NULL, *cpu_new = NULL;
     char *err = NULL, *fp_round = "round.tonearest",
-         *fp_exce = "fpexcept.strict";
+         *fp_exce = "fpexcept.ignore";
     char triple_buf[128] = { 0 }, features_buf[128] = { 0 };
     uint32 opt_level, size_level, i;
     LLVMCodeModel code_model;

It affects the result of the second f64.mul during the second call to func 2:

(func (;2;) (type 0)
    i32.const 0
    i32.const 255
    i32.store8

    f64.const nan (;=nan;)

    i32.const 0
    f64.load
    f64.const 0x0p+0 (;=0;)
    f64.mul

    f64.mul   => The two inputs are: 7ff8000000000000 and 7ff80000000000ff,
                         when using fpexcept.strict, the mul result is: 7ff80000000000ff
                         when using fpexcept.ignore, the mul result is: 7ff8000000000000
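A possible reading of why the one-byte payload difference ends in an out-of-bounds exception, sketched under assumptions: the function stores the result at address 27, and fd_write is then called with iovs=16 and iovs_len=2, so the second iovec (a 32-bit buf pointer followed by a 32-bit length, per the WASI preview1 layout) occupies bytes 24..31 and overlaps the low bytes of that store. The helper name below is hypothetical, and whether WAMR rejects a zero-length iovec whose base is out of range is an assumption that merely fits the reported behavior.

```python
import struct

WASM_PAGE = 65536
MEM_BYTES = 8192 * WASM_PAGE  # (memory 8192 8192) in the test case


def second_iovec(result_bits):
    """Return (buf, len) of the second iovec after f64.store of result_bits at 27."""
    mem = bytearray(64)
    struct.pack_into("<Q", mem, 27, result_bits)  # f64.store at address 27
    # iovs = 16 and each iovec is 8 bytes, so the second one starts at 16 + 8 = 24.
    buf, length = struct.unpack_from("<II", mem, 24)
    return buf, length


for bits in (0x7FF80000000000FF, 0x7FF8000000000000):
    buf, length = second_iovec(bits)
    print(hex(bits), "-> buf", hex(buf), "len", length, "out of range:", buf >= MEM_BYTES)
```

With the payload preserved (7ff80000000000ff), the byte at address 27 is 0xff and becomes the top byte of the buffer pointer, giving buf = 0xff000000, which lies beyond the 0x20000000-byte memory. With the canonical NaN (7ff8000000000000), the pointer stays 0.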

@hungryzzz
Author

Thank you for your reply! But I am still confused about what makes the different JIT modes generate different instruction sequences that produce different multiplication results. In addition, could you please explain how to pinpoint the buggy instructions, if it is convenient? Thanks again!

@wenyongh
Contributor

LLVM-JIT leverages the LLVM framework, while Fast-JIT's framework is self-implemented in WAMR. Their pipelines and code generators are different, so the results may differ; sometimes we have to check the LLVM IR and related attributes to find out why.

To pinpoint the buggy instructions, I normally first compare the execution results of each wasm opcode between the two running modes. To do that, you can modify the wasm code, e.g. comment out the opcodes after the one to check and change the function result type if needed, then let iwasm print the result of that opcode and check whether the results of the two running modes differ.
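Once per-opcode results have been collected for both modes, the comparison itself can be automated. The sketch below is only an illustration: the trace format (a list of (opcode, raw bits) pairs) and the function name are hypothetical, not something iwasm emits. Results are compared as raw bit patterns rather than as floats, because NaN never compares equal to itself and two NaNs with different payloads would otherwise be indistinguishable.

```python
from itertools import zip_longest


def first_divergence(trace_a, trace_b):
    """Return (index, entry_a, entry_b) for the first differing entry, else None.

    Each trace is a list of (opcode, raw_bits) tuples; raw_bits is an int so
    that NaN payloads are compared exactly.
    """
    for index, (a, b) in enumerate(zip_longest(trace_a, trace_b)):
        if a != b:
            return index, a, b
    return None


# Hypothetical traces using the values reported in this thread.
llvm_jit = [("f64.load", 0x7FF80000000000FF), ("f64.mul", 0x7FF80000000000FF)]
fast_jit = [("f64.load", 0x7FF80000000000FF), ("f64.mul", 0x7FF8000000000000)]
print(first_divergence(llvm_jit, fast_jit))
```

A trace that ends early also counts as a divergence, since zip_longest pads the shorter one with None.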

Another possible method is to use wasm-interp from wabt to trace the execution result, e.g. /opt/wabt/bin/wasm-interp -t -r <func> <wasm file>, and then use the AOT trace feature in this PR: #2647. But it is at an experimental stage and we have no bandwidth to finish it yet.

You can also dump the LLVM IR, e.g. wamrc --format=llvmir-unopt -o test.ll test.wasm, or dump the object file with wamrc --format=object -o test.o test.wasm and then disassemble the machine code with objdump -d test.o.
