tinygrad 0.6.0
2516 lines now. Some day, I promise, a release will make it smaller.
- float16 support (needed for LLaMA)
- Fixed a critical bug in BatchNorm during training
- Limited support for multiple GPUs
- ConvNeXt + several MLPerf models in models/
- More torch-like methods in tensor.py
- Big refactor of the codegen into the Linearizer and CStyle
- Removed CompiledBuffer; use the LazyBuffer ShapeTracker instead
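
The float16 support above matters for LLaMA mainly because half precision halves the memory per weight. A minimal stdlib sketch of the trade-off (this is illustrative only, not tinygrad's API; `struct`'s `'e'` format is IEEE 754 half precision):

```python
import struct

# float16 ("e") occupies 2 bytes per value vs. 4 for float32 ("f"),
# so casting weights to half precision halves their memory footprint.
half = struct.pack('e', 1.5)
single = struct.pack('f', 1.5)
assert len(half) == 2 and len(single) == 4

# The cost: float16 has only a 10-bit mantissa, so values round to
# the nearest representable half-precision number.
(rounded,) = struct.unpack('e', struct.pack('e', 0.1))
assert rounded != 0.1 and abs(rounded - 0.1) < 1e-3
```

For a model with billions of parameters, that 2x reduction is the difference between fitting in GPU memory and not.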