Intermittent bifrost.linalg test failures #187

Open
jaycedowell opened this issue Oct 17, 2022 · 2 comments

@jaycedowell
Collaborator

Occasionally we see failures in the self-hosted bifrost.linalg test suite. Now that I'm looking for one to point to, I cannot find one.

@jaycedowell changed the title from "Intermitten bifrost.linalg test failures" to "Intermittent bifrost.linalg test failures" on Oct 17, 2022
@jaycedowell
Collaborator Author

Here's one: #167 (comment)

@jaycedowell
Collaborator Author

jaycedowell commented Jun 16, 2023

I wonder if this is somehow related to #210. The only places where BF_STATUS_UNSUPPORTED_SHAPE can be thrown from a LinAlg call are in linalg_kernels.cu:

  • bf_cherk_N
  • bf_cgemm_TN_smallM_staticN_v2
  • bf_cgemm_TN_smallM

These checks are all fairly trivial, though: mostly validation of the matrix shape. A couple of comparisons of the batch size against the texture memory size can also throw this. It would be nice to know exactly which BF_STATUS_UNSUPPORTED_SHAPE check we are hitting.
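
One way to find out which check fires would be to record the file, line, and failed expression at the point where the status is returned. The following is only a rough sketch of that idea, not Bifrost's actual code: `SketchStatus`, `SKETCH_CHECK_SHAPE`, `g_last_failure`, and `sketch_check_gemm_shape` are all hypothetical names, and the shape conditions are invented stand-ins for whatever the real kernels in linalg_kernels.cu test.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for the real status codes, which live in Bifrost's headers.
enum SketchStatus {
    SKETCH_STATUS_SUCCESS = 0,
    SKETCH_STATUS_UNSUPPORTED_SHAPE = 1
};

// Remembers where the most recent shape check failed.
struct SketchFailure {
    const char* file;
    int         line;
    const char* expr;
};
static SketchFailure g_last_failure = {nullptr, 0, nullptr};

// If the predicate is false, record and print the location and the failed
// expression, then return the unsupported-shape status from the enclosing function.
#define SKETCH_CHECK_SHAPE(pred)                                      \
    do {                                                              \
        if (!(pred)) {                                                \
            g_last_failure = {__FILE__, __LINE__, #pred};             \
            std::fprintf(stderr, "%s:%d: unsupported shape: %s\n",    \
                         __FILE__, __LINE__, #pred);                  \
            return SKETCH_STATUS_UNSUPPORTED_SHAPE;                   \
        }                                                             \
    } while (0)

// Invented checks for illustration only; the real conditions are assumptions here.
SketchStatus sketch_check_gemm_shape(int m, int n, int nbatch, int max_batch) {
    SKETCH_CHECK_SHAPE(m > 0 && m <= 4);      // a "small M" style limit
    SKETCH_CHECK_SHAPE(n % 2 == 0);           // an alignment style limit
    SKETCH_CHECK_SHAPE(nbatch <= max_batch);  // a batch size vs. texture size limit
    return SKETCH_STATUS_SUCCESS;
}
```

With something like this in place, an intermittent failure in the CI log would at least carry the line number and the exact condition that rejected the shape, which should narrow down which of the three functions is responsible.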
