Segfaulting the JVM #58

Open
helins opened this issue Mar 2, 2021 · 4 comments

helins commented Mar 2, 2021

Describe the bug

I am currently developing WASM tooling and using wasmer-java interactively from Clojure. Sometimes, after a while, the JVM suddenly segfaults because of Wasmer:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fe1df8c4f91, pid=27960, tid=27980
#
# JRE version: OpenJDK Runtime Environment (Zulu11.39+15-CA) (11.0.7+10) (build 11.0.7+10-LTS)
# Java VM: OpenJDK 64-Bit Server VM (11.0.7+10-LTS, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [wasmer_jni2309101336629026624.lib+0x5ef91]  _$LT$hashbrown..raw..RawTable$LT$T$GT$$u20$as$u20$core..ops..drop..Drop$GT$::drop::hfb03333fcdd0f7eb+0x31
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to /home/adam/projects/clj/helins/wasmeta/core.27960)
#
# An error report file with more information is saved as:
# /home/adam/projects/clj/helins/wasmeta/hs_err_pid27960.log
[thread 28012 also had an error]
#
# If you would like to submit a bug report, please visit:
#   http://www.azulsystems.com/support/
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
rlwrap: warning: clojure crashed, killed by SIGABRT (core dumped).
rlwrap itself has not crashed, but for transparency,
it will now kill itself with the same signal

And here is the full report: https://gist.github.com/helins-io/ffbb4e46eaf5b4adbfc960952ac577e9

Steps to reproduce

In an interactive session (using a Clojure REPL), I often load a WASM file, create an instance, execute a function, and close that instance right away. At first nothing happens. However, after a while, I get this SIGSEGV.

Additional context

"After a while" probably means when the GC kicks in. I guess this could be a double-free error, where finalizing the instance object tries to free pointer(s) that were already freed manually when I closed the instance myself. That sounds plausible after skimming the report.

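To make the suspected failure mode concrete, here is a minimal Java sketch of a handle that can be released both explicitly and by a finalizer-like hook. It is not wasmer-java code: `GuardedHandle`, `onFinalize`, and `FREE_CALLS` are hypothetical names, and the native free is simulated with a counter. Without the `released` flag, both paths would run the free, which is the double free suspected above.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an object that owns a native pointer. */
final class GuardedHandle implements AutoCloseable {
    // Counts how many times the simulated native free actually ran.
    static final AtomicInteger FREE_CALLS = new AtomicInteger();

    // Flips to true the first time the resource is released.
    private final AtomicBoolean released = new AtomicBoolean(false);

    /** Explicit release requested by the user. */
    @Override
    public void close() {
        releaseOnce();
    }

    /** Simulates what a finalizer would do once the GC collects the object. */
    void onFinalize() {
        releaseOnce();
    }

    private void releaseOnce() {
        // compareAndSet lets only the first caller through, so the free
        // cannot run twice for the same handle.
        if (released.compareAndSet(false, true)) {
            FREE_CALLS.incrementAndGet(); // stand-in for the real native free
        }
    }
}
```

Calling `close()` and then `onFinalize()` on the same handle increments the counter only once. Whether the binding's native side has an equivalent guard is something the crash report alone does not show.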
helins added the 🐞 bug label Mar 2, 2021
Hywan added this to 📬 Backlog in Kanban via automation Mar 2, 2021
Hywan self-assigned this Mar 2, 2021
Hywan moved this from 📬 Backlog to 🏁 Ready in Kanban Mar 2, 2021

Contributor

Hywan commented Mar 2, 2021

Thank you for the detailed report! Indeed, it looks like a double-free. That's curious. Can you provide a minimal working example so that I can try to reproduce it, please?

Author

helins commented Mar 2, 2021

It's part of a bigger library I am currently writing. It's still messy, but I guess I could invite you to my private repo? Are you comfortable at all with Clojure?

But anyways, there really isn't much to it. The parts that leverage Wasmer are a very direct wrapper over the Java API. Essentially, it's exactly the same as using the Java API straight away.

Maybe the problem is due to this debug-like behavior: creating an instance, calling a function right away, and closing it right away. Could this fast cycle be problematic? In a real application you would probably hold on to the instance for a bit, which is maybe why no one has had this issue before.

Author

helins commented Mar 2, 2021

I believe the best thing to do is to simply remove .finalize. I remember reading that it is discouraged in newer Java versions because it is unreliable, especially regarding native resources (e.g. you never know when the finalization happens). Either the user should release native resources explicitly (as in .close), or there are better ways than .finalize to manage those resources automatically. I am not really familiar with those, but you might want to check out phantom references.
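
One alternative to `.finalize` in the standard library is `java.lang.ref.Cleaner` (Java 9 and later), which is built on phantom references. The sketch below assumes a hypothetical `CleanedHandle` class and simulates the native free with a counter; it is not how wasmer-java is implemented.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical owner of a native pointer, released via Cleaner instead of finalize. */
final class CleanedHandle implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // Counts how many times the simulated native free actually ran.
    static final AtomicInteger FREE_CALLS = new AtomicInteger();

    // The cleaning action must not reference the CleanedHandle itself,
    // otherwise the handle could never become unreachable.
    private static final class Release implements Runnable {
        private final long pointer;

        Release(long pointer) {
            this.pointer = pointer;
        }

        @Override
        public void run() {
            FREE_CALLS.incrementAndGet(); // stand-in for freeing `pointer` natively
        }
    }

    private final Cleaner.Cleanable cleanable;

    CleanedHandle(long pointer) {
        // Registers the action to run once this object becomes phantom reachable.
        this.cleanable = CLEANER.register(this, new Release(pointer));
    }

    @Override
    public void close() {
        // Runs the action now and unregisters it; the action is invoked
        // at most once, however many times clean() is called.
        cleanable.clean();
    }
}
```

With this shape, an explicit `close()` and a later garbage collection cannot both release the pointer, because the `Cleanable` runs its action at most once.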

Contributor

Hywan commented Mar 2, 2021

I agree with you that close() must be called manually rather than relying on the GC. I will try to reproduce it myself :-). Thanks!
