Accessibility for LLM/SD(XL) LoRA Training Purposes? #3914
BuildBackBuehler
started this conversation in
General
Replies: 1 comment
-
Hm, so as I dive deeper and deeper into this (and I mightttt be totally off here), it seems like autogen plus some of the extras (like the movement_ops, assembly, and backend files) means these conversions are a lot less manual than I'm imagining? Even then, it still seems like this is all a bit above my pay grade.
-
So I'm a huge dolt compared to those who have helped with Tinygrad -- I can manage to run scripts and programs, and troubleshoot/debug 'em when things go awry. Basically, I survive by spotting patterns and making inferences from the relationships between functions/code.
Which is to say, I don't intend to (or have the time to) wrap my head around the math/inner mechanisms of Tinygrad/Torch/Transformers/etc. With that said, would I still be able to manage to convert scripts, particularly for the needs mentioned in the title, or am I SOL? I figure SOL, because I'm sure it won't be a matter of just changing some imports and their usage -- a ton of the intricacies live inside the functions themselves.
I think I'd seen the Flash Metal Attention dev mention that y'all might use, uh, "bindings" (IIRC) to semi-automate the conversion process, but I didn't read anything about that in the docs/examples, so I imagine it was a poor late-night interpretation on my part.
If it wouldn't be too challenging, I do think a thorough tinygrad guide for dummies such as myself would be 3000x over useful!
Cheers
AA
https://github.com/abacaj/fine-tune-mistral/blob/main/train.py
https://github.com/huggingface/diffusers/blob/main/examples/advanced_diffusion_training/train_dreambooth_lora_sdxl_advanced.py
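For anyone skimming, here's the core idea behind the LoRA training scripts linked above, as a minimal sketch: the pretrained weight `W` stays frozen, and only a low-rank update `B @ A` is trained. All names and shapes here are illustrative; this is not the tinygrad or diffusers API.

```python
import numpy as np

# Minimal LoRA sketch (illustrative, not a real training script):
# the frozen weight W is augmented with a low-rank update B @ A,
# so only the small matrices A and B would need gradients.
rng = np.random.default_rng(0)
d_in, d_out, rank = 16, 8, 4

W = rng.standard_normal((d_in, d_out))        # frozen pretrained weight
A = rng.standard_normal((d_in, rank)) * 0.01  # trainable down-projection
B = np.zeros((rank, d_out))                   # trainable up-projection, init to zero

def lora_linear(x, scale=1.0):
    # Base path plus low-rank adapter path; with B = 0 the adapter is a no-op.
    return x @ W + scale * (x @ A @ B)

x = rng.standard_normal((2, d_in))
y = lora_linear(x)
# Because B starts at zero, the adapted layer initially matches the base layer.
assert np.allclose(y, x @ W)
```

The punchline for conversion purposes: the adapter math is just a couple of matmuls, so the hard part of porting these scripts is usually everything around it (data loading, optimizers, mixed precision), not the LoRA layer itself.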
I guess another wonder of mine is whether one can still use TG just for its multi-architecture backends. What I mean is: if I weren't injecting TG into this SDXL script, would I still be able to set "CUDA=1" and have TG do the legwork/magical CUDA-based work within the PyTorch framework? Otherwise I'd be stuck with unaccelerated, CPU-based calculations via PyTorch (if it even lets me -- I get errors thanks to autocast or whatever Apple-unfriendly module).
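For context on the `CUDA=1` bit above: tinygrad picks its backend from environment variables at runtime (e.g. `CUDA=1 python script.py`). The standalone helper below mimics that flag-checking pattern; it's illustrative only and doesn't call into tinygrad or PyTorch.

```python
import os

# Sketch of env-var-driven backend selection, mimicking how you'd launch a
# tinygrad script with e.g. `CUDA=1 python train.py`. tinygrad does this
# internally; this helper only illustrates the pattern.
def pick_backend(env=None):
    env = os.environ if env is None else env
    # Checked in a fixed priority order; the first flag set to "1" wins.
    for name in ("CUDA", "METAL", "GPU", "CLANG"):
        if env.get(name) == "1":
            return name
    return "CPU"  # fallback when no accelerator flag is set

print(pick_backend({"CUDA": "1"}))  # CUDA
```

The key caveat (and probably the answer-shaped part of the question): flags like this only affect code that runs *through* tinygrad's own ops; they can't reach inside an unmodified PyTorch script.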