-O0 flag broken #195

Open
michel2323 opened this issue Feb 11, 2022 · 4 comments
Labels: bug (Something isn't working), intrinsics, upstream

Comments

@michel2323
Collaborator

using AMDGPU
kernel() = (AMDGPU.trap(); nothing)
AMDGPU.@device_code_llvm wait(@roc kernel())

This runs fine with the default Julia flags but segfaults when optimization is disabled with -O0. The @device_code_llvm output is the same in both cases:

# default flags and -O0
; CompilerJob of kernel kernel() for GPUCompiler.GCNCompilerTarget
;  @ /gpfs/alpine/csc359/scratch/mschanen/julia_depot_crusher/dev/AMDGPU/mwe.jl:4 within `kernel`
define amdgpu_kernel void @_Z16julia_kernel_968() local_unnamed_addr #1 {
entry:
; ┌ @ /gpfs/alpine/csc359/scratch/mschanen/julia_depot_crusher/dev/AMDGPU/src/device/llvm.jl:3 within `trap`
   call void @llvm.trap()
; └
  unreachable
}

This is on branch ms/julia-1.8 with the following environment:

Julia Version 1.8.0-DEV.1490
Commit 94bbfd45f9 (2022-02-10 14:02 UTC)
Platform Info:
  OS: Linux (x86_64-suse-linux)
  CPU: AMD EPYC 7A53 64-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.0 (ORCJIT, znver3)
Environment:
  LD_LIBRARY_PATH = /opt/rocm-4.5.0/lib64:/opt/rocm-4.5.0/lib:/opt/rocm-4.5.0/hsa/lib:/opt/rocm-4.5.0/llvm/lib:/opt/cray/pe/papi/6.0.0.12/lib64:/opt/cray/libfabric/1.13.0.0/lib64
  JULIA_FOLDER = /gpfs/alpine/scratch/mschanen/csc359/julia
  JULIA_AMDGPU_DISABLE_ARTIFACTS = 1
  JULIA_DEPOT_PATH = /gpfs/alpine/scratch/mschanen/csc359/julia_depot_crusher
  JULIA_MPI_PATH = /opt/cray/pe/mpich/8.1.12/ofi/amd/4.4
Project AMDGPU v0.2.17
Status `/gpfs/alpine/csc359/scratch/mschanen/julia_depot_crusher/dev/AMDGPU/Project.toml`
  [621f4979] AbstractFFTs v1.1.0
  [79e6a3ab] Adapt v3.3.3
  [b99e7846] BinaryProvider v0.5.10
  [fa961155] CEnum v0.4.1
  [e2ba6199] ExprTools v0.1.8
  [0c68f7d7] GPUArrays v8.2.1
  [61eb1bfa] GPUCompiler v0.13.11
  [929cbde3] LLVM v4.7.1
  [1914dd2f] MacroTools v0.5.9
  [ae029012] Requires v1.3.0
⌅ [efcf1570] Setfield v0.7.1
  [276daf66] SpecialFunctions v2.1.2
  [2696aab5] HIP_jll v4.2.0+0
  [873c0968] ROCmDeviceLibs_jll v4.2.0+0
  [dd59ff1a] hsa_rocr_jll v4.2.0+1
  [a6151927] rocRAND_jll v4.2.0+0
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [44cfe95a] Pkg v1.8.0
  [de0858da] Printf
  [9a3f8284] Random
  [10745b16] Statistics
Info Packages marked with ⌅ have new versions available but cannot be upgraded. To see why use `status --outdated`
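
When reproducing this, it can help to confirm which optimization level the session is actually running at, since the flag is easy to lose in wrapper scripts. The snippet below is a minimal sketch, not part of the original report; it assumes the internal, unexported Base.JLOptions() struct and its opt_level field, which may change between Julia versions.

```julia
# Sketch: report the optimization level of the running Julia session.
# Base.JLOptions() is internal to Julia (an assumption here, not a stable API),
# so treat this as a debugging aid only.
opt_level = Int(Base.JLOptions().opt_level)
println("Running with -O", opt_level)

if opt_level == 0
    # -O0 is the configuration in which the segfault is reported above.
    println("Optimizations are disabled (-O0).")
end
```

Running the MWE once with the default flags and once with -O0, with this check at the top, makes it clear which configuration produced a given crash.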
jpsamaroo added the bug and intrinsics labels on Feb 11, 2022
@jpsamaroo
Member

I can reproduce this with Julia 1.7.2 and jps/julia-1.7; the generated GCN code seems fine, and running with --check-bounds=yes --inline=no doesn't produce any more useful information.

@jpsamaroo
Member

This is due to a TLS access in AMDGPU.queue_error_handler, which is eliminated at any optimization level other than -O0. We can possibly prevent this function from reaching @cfunction with JuliaLang/julia#43747, but that won't change the fact that we need to eliminate the TLS access.

@jpsamaroo
Member

@vchuravy the emission and non-elision of the "error in type bounds" check appears to be causing this.

@jpsamaroo
Member

Nope, sorry, it's a typeassert.
