Accessibility for LLM/SD(XL) LoRA Training Purposes? #3914
BuildBackBuehler
started this conversation in
General
Replies: 1 comment
-
Hm, so as I dive deeper and deeper into this (and I mightttt be totally off here), it seems like autogen plus some of the extras (like the movement_ops, assembly, and backend files) means these conversions are a lot less manual than I'm imagining? Even then, it still seems like this is all a bit above my pay grade.
-
So I'm a huge dolt compared to those who have helped with Tinygrad -- I can manage to run scripts and programs, and troubleshoot/debug 'em when things go awry. Basically, I survive by spotting patterns and making inferences from the relationships between functions/code.
Which is to say, I don't intend to (or have the time to) wrap my head around the math/inner mechanisms of Tinygrad/Torch/Transformers/etc. With that said, would I still be able to manage to convert scripts, particularly for the needs mentioned in the title, or am I SOL? I figure SOL, because I'm sure it won't be a matter of just changing some imports and their usage -- a ton of the intricacies live inside the functions themselves.
I think I'd seen the Flash Metal Attention dev mention that y'all might use, uh, "bindings" (IIRC) to semi-automate the conversion process, but I didn't read anything about that in the docs/examples, so I imagine it was a poor late-night interpretation on my part.
If it wouldn't be too challenging, I do think a thorough tinygrad guide for dummies such as myself would be 3000x over useful!
Cheers
AA
https://github.com/abacaj/fine-tune-mistral/blob/main/train.py
https://github.com/huggingface/diffusers/blob/main/examples/advanced_diffusion_training/train_dreambooth_lora_sdxl_advanced.py
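For anyone skimming, here's the core idea behind the LoRA training scripts linked above, as a minimal sketch: the pretrained weight `W` stays frozen, and only a low-rank update `B @ A` is trained. All names and shapes here are illustrative; this is not the tinygrad or diffusers API.

```python
import numpy as np

# Minimal LoRA sketch (illustrative, not a real training script):
# the frozen weight W is augmented with a low-rank update B @ A,
# so only the small matrices A and B would need gradients.
rng = np.random.default_rng(0)
d_in, d_out, rank = 16, 8, 4

W = rng.standard_normal((d_in, d_out))        # frozen pretrained weight
A = rng.standard_normal((d_in, rank)) * 0.01  # trainable down-projection
B = np.zeros((rank, d_out))                   # trainable up-projection, init to zero

def lora_linear(x, scale=1.0):
    # Base path plus low-rank adapter path; with B = 0 the adapter is a no-op.
    return x @ W + scale * (x @ A @ B)

x = rng.standard_normal((2, d_in))
y = lora_linear(x)
# Because B starts at zero, the adapted layer initially matches the base layer.
assert np.allclose(y, x @ W)
```

The punchline for conversion purposes: the adapter math is just a couple of matmuls, so the hard part of porting these scripts is usually everything around it (data loading, optimizers, mixed precision), not the LoRA layer itself.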
I guess another wonder of mine is whether one can still use TG just for its multi-architecture backends. What I mean is: if I weren't injecting TG into this SDXL script, would I still be able to set "CUDA=1" and have TG do the legwork/magical CUDA-based work within the PyTorch framework? Otherwise I'd be stuck with unaccelerated, CPU-based calculations via PyTorch (if it even lets me -- I get errors thanks to autocast or whatever Apple-unfriendly module).
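For context on the `CUDA=1` bit above: tinygrad picks its backend from environment variables at runtime (e.g. `CUDA=1 python script.py`). The standalone helper below mimics that flag-checking pattern; it's illustrative only and doesn't call into tinygrad or PyTorch.

```python
import os

# Sketch of env-var-driven backend selection, mimicking how you'd launch a
# tinygrad script with e.g. `CUDA=1 python train.py`. tinygrad does this
# internally; this helper only illustrates the pattern.
def pick_backend(env=None):
    env = os.environ if env is None else env
    # Checked in a fixed priority order; the first flag set to "1" wins.
    for name in ("CUDA", "METAL", "GPU", "CLANG"):
        if env.get(name) == "1":
            return name
    return "CPU"  # fallback when no accelerator flag is set

print(pick_backend({"CUDA": "1"}))  # CUDA
```

The key caveat (and probably the answer-shaped part of the question): flags like this only affect code that runs *through* tinygrad's own ops; they can't reach inside an unmodified PyTorch script.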