WINO=1 python examples/beautiful_mnist.py has lower test_accuracy #3857

Open
chenyuxyz opened this issue Mar 21, 2024 · 4 comments

Comments

@chenyuxyz
Collaborator

I can hit `RuntimeError: Error Domain=MTLCommandBufferErrorDomain Code=1 "Discarded (victim of GPU error/recovery)"` on M1 Max pretty consistently with `WINO=1 PYTHONPATH=. python examples/beautiful_mnist.py`. It seems fine with JIT=2.
Also, when it does run with WINO=1, it only trains to 95%.

@geohot
Collaborator

geohot commented Mar 21, 2024

I don't see the crash on M3, but I do see the low accuracy.

@chenyuxyz changed the title from "WINO=1 python examples/beautiful_mnist.py hangs on METAL" to "WINO=1 python examples/beautiful_mnist.py has lower test_accuracy" on Mar 21, 2024
@chenyuxyz
Collaborator Author

Weird, I rebooted and it no longer crashes. The GPU was likely in a bad state.

Updated the issue to track the low accuracy instead.

@michbogos

I've also seen lower accuracy with IMAGE=1.

@chenyuxyz
Collaborator Author

WINO and IMAGE are both affected by LAZYCACHE when JIT is enabled, and the eval step is not properly captured by the JIT.

As a workaround, either comment out the TinyJit decorator on get_test_acc or set LAZYCACHE=0.
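For anyone who wants to script the LAZYCACHE=0 workaround, here is a minimal sketch using only the Python standard library. The helper names `workaround_env` and `run_mnist` are hypothetical (they are not part of tinygrad), and the sketch assumes it is run from the repository root, as in the original command.

```python
import os
import subprocess
import sys


def workaround_env(base=None):
    """Return a copy of the environment with WINO=1 and LAZYCACHE=0 set.

    Hypothetical helper for illustration; not part of tinygrad.
    """
    env = dict(os.environ if base is None else base)
    env["WINO"] = "1"        # flag under test in this issue
    env["LAZYCACHE"] = "0"   # workaround suggested in this thread
    env.setdefault("PYTHONPATH", ".")  # mirrors the original command line
    return env


def run_mnist(script="examples/beautiful_mnist.py"):
    """Run the example with the workaround applied and return its exit code.

    Assumes the current working directory is the repository root.
    """
    result = subprocess.run([sys.executable, script], env=workaround_env(), check=False)
    return result.returncode
```

Calling `run_mnist()` should be equivalent to running `WINO=1 LAZYCACHE=0 PYTHONPATH=. python examples/beautiful_mnist.py` by hand. The other workaround from this thread (removing TinyJit from get_test_acc) needs a source edit and is not covered by this sketch.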


3 participants