tinygrad 0.8.0

@geohot released this 09 Jan 18:16
2c6f2e8

At 4981 lines, this release is close to the new limit of 5000 lines.

Release Highlights

  • Real dtype support within kernels!
  • New .schedule() API to separate concerns of scheduling and running
  • New lazy.py implementation doesn't reorder at build time. GRAPH=1 is usable to debug issues
  • 95 TFLOP FP16->FP32 matmuls on 7900XTX
  • GPT2 runs (jitted) in 2 ms on NVIDIA 3090
  • Powerful and fast kernel beam search with BEAM=2
  • GPU/CUDA/HIP backends switched to gpuctypes
  • New (alpha) multigpu sharding API with .shard
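
For readers unfamiliar with the idea behind the scheduling/running split, here is a toy sketch in plain Python. It is not tinygrad code and does not use the tinygrad API: `LazyValue`, `schedule`, and `run` are hypothetical names invented for illustration. It only shows the general pattern of first producing an ordered list of work items from a lazy graph, then executing that list as a separate step.

```python
from typing import Callable, List, Optional, Tuple


class LazyValue:
    """Hypothetical lazy node: records an op and its inputs, computes nothing."""

    def __init__(self, op: Callable[..., float], srcs: Tuple["LazyValue", ...] = ()):
        self.op = op
        self.srcs = srcs
        self.result: Optional[float] = None


def schedule(out: LazyValue) -> List[LazyValue]:
    """Return the nodes needed for `out` in dependency order (no computation)."""
    order: List[LazyValue] = []
    seen = set()

    def visit(node: LazyValue) -> None:
        if id(node) in seen or node.result is not None:
            return  # already scheduled or already computed
        seen.add(id(node))
        for src in node.srcs:
            visit(src)
        order.append(node)  # post-order: inputs come before their consumer

    visit(out)
    return order


def run(sched: List[LazyValue]) -> None:
    """Execute a schedule produced by `schedule`, filling in each result."""
    for node in sched:
        node.result = node.op(*(src.result for src in node.srcs))


# Build a small lazy graph: d = (a * b) + a
a = LazyValue(lambda: 2.0)
b = LazyValue(lambda: 3.0)
c = LazyValue(lambda x, y: x * y, (a, b))
d = LazyValue(lambda x, y: x + y, (c, a))

sched = schedule(d)   # step 1: decide what to run and in what order
pending = d.result    # still None, nothing has executed yet
run(sched)            # step 2: actually execute
```

Because the schedule is an ordinary list, it can be inspected, printed, or optimized before anything runs, which is the kind of separation the bullet above describes.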

See the full changelog: v0.7.0...v0.8.0

Join the Discord!