cuDNN fMHA failures return unhelpful error messages #11129

Open
thomasjoerg opened this issue Apr 2, 2024 · 0 comments
Labels
NVIDIA-GPU XLA on Nvidia GPU

Comments

XLA GPU calls into cuDNN for fused multi-headed attention (fMHA) for specific HLO patterns. When an error occurs during the construction of a cuDNN graph, only a RuntimeError with minimal information is surfaced to the XLA GPU user. For example, using bfloat16 types can produce the following error message:

RuntimeError: CUDNN_BACKEND_OPERATION: cudnnFinalize Failed cudnn_status: CUDNN_STATUS_BAD_PARAM

To be more helpful, the error message should state that the error happened during cuDNN graph construction/finalization and that it can be worked around with --xla_gpu_enable_cudnn_fmha=false. Ideally, the error message would also include a serialized cuDNN graph to help debug the issue.
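As a sketch of the suggested workaround: XLA flags such as --xla_gpu_enable_cudnn_fmha=false are typically passed through the XLA_FLAGS environment variable, which must be set before XLA is initialized (e.g. before the first JAX computation runs). The snippet below only illustrates setting the flag; whether it resolves a given failure depends on the HLO pattern involved.

```python
import os

# Append the workaround flag to any XLA_FLAGS already set, so existing
# flags are preserved. This must happen before XLA initializes.
existing = os.environ.get("XLA_FLAGS", "")
os.environ["XLA_FLAGS"] = (existing + " --xla_gpu_enable_cudnn_fmha=false").strip()
```

With the flag set, XLA GPU should fall back to the non-fused attention path instead of lowering the pattern to a cuDNN fMHA call.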

@jprabhas jprabhas added the NVIDIA-GPU XLA on Nvidia GPU label Apr 15, 2024