ZLUDA suddenly crashes in Blender 3.6 Cycles on RX580 #221
Comments
Hi, I need additional information:
All from within the container |
I can't give output for the AMDGPU version because it's a container on distrobox. rocminfo output:
ROCm version: 5.7.1.50701-98~22.04 |
Ahhh, I though
|
cat: /sys/module/amdgpu/version: No such file or directory. It's probably because I passed the no-dkms parameter during the amdgpu installation, since it's in a container |
That's strange, I don't think you need amdgpu-dkms inside the container, just on host. At least on my machine it works without (ubuntu 22.04 on ubuntu 22.04). |
That's why I containerized: I'd rather not install amdgpu-dkms on the host, so I can preserve it in my container and not have to worry about version upgrades, because I don't think ROCm 5.7.1 will be updated for future kernels. But could you share your ROCm install process so I can test it out? |
On the host I just followed the official instructions (https://rocm.docs.amd.com/en/docs-5.7.1/deploy/linux/os-native/install.html).
|
Can you use the latest amdgpu driver on the host and 5.7.1 in the container? Do they have to match? |
AFAIK the version reported by amdgpu is the kernel version it has been backported from. In my case (on host):
|
I installed a more recent version of amdgpu-dkms on the host, but in the container, cat/sys/module/amdgpu said: Is a directory. In the container, this is the output of apt show amdgpu-dkms:
On the host:
On the host, cat /sys/module/amdgpu/version outputs: 6.7.0 |
This is probably fine, tho I'd remove amdgpu-dkms in the container. The "Is a directory" error is probably because you skipped the space between "cat" and "/sys/module/amdgpu/version"? |
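As an illustration of the version check discussed above, here is a minimal shell sketch. The function name `amdgpu_version` and the "unknown" fallback are hypothetical and not part of ZLUDA or ROCm; only the path /sys/module/amdgpu/version comes from the thread. It prints the file's contents when the file exists and a placeholder otherwise, which avoids the "No such file or directory" error seen in the container.

```shell
# Hypothetical helper (not part of ZLUDA or ROCm): print the amdgpu module
# version, or "unknown" when the sysfs version file is missing or unreadable.
amdgpu_version() {
    # Default to the path quoted in the thread; an optional argument overrides it.
    path="${1:-/sys/module/amdgpu/version}"
    if [ -f "$path" ] && [ -r "$path" ]; then
        # Note the space between "cat" and the path.
        cat "$path"
    else
        echo "unknown"
    fi
}
```

Calling `amdgpu_version` with no argument reads the default path; passing another path is only useful for testing the helper itself.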
still crashes |
Does it happen with every scene? Can you try Blender test scenes from here: http://download.blender.org/demo/test/cycles_benchmark_20160228.zip and rendering e.g. benchmark/bmw27/bmw27_gpu.blend using command line like this with HIP debugging:
|
It still crashed, but this is the terminal log:
|
Previously ZLUDA worked fine for Cycles rendering, but now I get this error for some reason:
Memory access fault by GPU node-1 (Agent handle: 0x79e41932a000) on address (nil). Reason: Page not present or supervisor privilege.
I'm on Ubuntu 22.04 (Linux), using ZLUDA in an Ubuntu container.