XLA GPU calls into cuDNN for fused multi-headed attention (for specific HLO patterns). When an error occurs during the construction of a cuDNN graph, a `RuntimeError` with minimal information is surfaced to the XLA GPU user. For example, using `bfloat16` types can produce the following error message.
To be more helpful, the error message should state that the error happened during cuDNN graph construction / finalization and that it can be worked around with `--xla_gpu_enable_cudnn_fmha=false`. Ideally, the error message would also include a serialized cuDNN graph to help debug the issue.
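Until the message improves, here is a minimal sketch of applying the workaround from Python. It assumes the flag is passed through the `XLA_FLAGS` environment variable and that the variable is set before XLA is initialized (for example, before importing `jax`). The helper name `add_xla_flag` is hypothetical and only for illustration.

```python
import os


def add_xla_flag(flag: str, current: str = "") -> str:
    """Return `current` with `flag` appended, unless it is already present."""
    parts = current.split()
    if flag not in parts:
        parts.append(flag)
    return " ".join(parts)


# Assumption: XLA reads XLA_FLAGS at initialization, so set it first
# (e.g. before `import jax`). Existing flags are preserved.
os.environ["XLA_FLAGS"] = add_xla_flag(
    "--xla_gpu_enable_cudnn_fmha=false",
    os.environ.get("XLA_FLAGS", ""),
)
```

Appending rather than overwriting keeps any flags the user has already set, which matters when the workaround is applied in a shared launch script.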