xpu: provide a way to debug explicit CPU fallback #126488
Comments
Also filed intel/torch-xpu-ops#262.
I still want to comment on that. Personally, I am fine with fallback logs being muted by default, because I know that a number of operations are not yet implemented in XPU. However, for people who have just discovered the XPU backend, want to try it, and have limited knowledge of it, such muted behavior might be a problem. They will immediately notice that the XPU backend significantly underperforms, sometimes even compared to CPU, with no obvious reason at hand. The log warnings that a CPU fallback was happening were quite handy here: they set the correct expectation that the XPU backend might currently underperform. My recommendation is to always print a debug message when a CPU fallback happens, regardless of whether it is explicit (whatever that means) or implicit.
Got your requirement. In my understanding, the log is not informative for DL workload customers; it is a debugging requirement. For the release build, we would keep the existing implementation. I think we could add the feature in the debug build.
@EikanWang Please comment.
Can you please have this feature controlled by an environment variable, let's say the same as before -
I opened intel/torch-xpu-ops#318 with the implementation I propose (which is: always warn on CPU fallback :) ). Let's continue the discussion in the PR.
We will close the issue once the PR lands.
@fengyuan14 - The commit intel/torch-xpu-ops@5bf9e0c muted the debug logs of "explicit" CPU fallbacks. This complicates debugging for third-party contributors trying to evaluate XPU backend capabilities: I now have to revert that commit to see which operations are not yet implemented by XPU. Please provide a way to debug this, for example an environment variable such as:
PYTORCH_DEBUG_XPU_FALLBACK=1
to track any CPU fallback happening in the XPU backend. Note: I am fine if "explicit" fallback is muted by default, but I really need a way to track it.
CC: @jgong5 @mingfeima @XiaobingSuper @ashokei @jingxu10 @gujinghui @EikanWang @fengyuan14 @guangyey
cc @gujinghui @EikanWang @fengyuan14 @guangyey
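To make the request concrete, here is a minimal Python sketch of the gating behavior being asked for. It is not the torch-xpu-ops implementation (which lives in the backend itself); the helper names `fallback_debug_enabled` and `report_cpu_fallback` are hypothetical, and the policy shown (implicit fallbacks always warn, explicit ones only when the variable is set to 1) is an assumption based on the discussion in this issue.

```python
import os
import warnings

# Variable name taken from the issue text; the helpers below are hypothetical.
ENV_VAR = "PYTORCH_DEBUG_XPU_FALLBACK"


def fallback_debug_enabled(environ=None):
    """Return True when the debug variable is set to "1"."""
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR, "0") == "1"


def report_cpu_fallback(op_name, explicit, environ=None):
    """Warn about a CPU fallback; return True if a warning was emitted.

    Assumed policy: "explicit" fallbacks are muted unless the debug
    variable is set, while implicit fallbacks always warn.
    """
    if explicit and not fallback_debug_enabled(environ):
        return False
    kind = "explicit" if explicit else "implicit"
    warnings.warn(
        f"XPU: {kind} CPU fallback for operator '{op_name}'",
        RuntimeWarning,
        stacklevel=2,
    )
    return True
```

Under this sketch, explicit fallbacks stay silent by default, and running with PYTORCH_DEBUG_XPU_FALLBACK=1 in the environment makes every fallback visible, which is the tracking ability requested above.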