Optimal Qlora settings #316

Open
KnutJaegersberg opened this issue Sep 2, 2023 · 1 comment
Comments

@KnutJaegersberg
In HF transformers, the default QLoRA settings do not replicate the QLoRA setup of the original paper, so practitioners who rely on the library defaults leave valuable performance on the table.
LoRA has to be applied to specific parts of the network; please see this tweet by Tim Dettmers:

https://twitter.com/Tim_Dettmers/status/1695377756232589459

I guess this has to be customized for each model architecture, which sounds like a feature for curated-transformers to me.
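
For illustration, a minimal sketch using the `peft` library (not part of the original comment; the module names and hyperparameters below assume a Llama-style architecture and the paper's reported settings, and will differ for other models):

```python
from peft import LoraConfig

# Sketch: attach LoRA adapters to *all* linear layers, as in the QLoRA
# paper, rather than only the attention projections that narrower
# defaults target. Module names assume a Llama-style architecture.
lora_config = LoraConfig(
    r=64,                  # adapter rank reported in the paper
    lora_alpha=16,
    lora_dropout=0.1,      # paper uses 0.1 for smaller models
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention
        "gate_proj", "up_proj", "down_proj",      # feed-forward
    ],
)
```

A curated-transformers feature could presumably resolve the right `target_modules` list per supported architecture automatically, which is exactly the per-model customization mentioned above.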

@danieldk
Collaborator

danieldk commented Sep 5, 2023

Thanks for the suggestion! We hope to look more into training in the coming period and will definitely take this into account.

@danieldk added the type/feature and feat/training labels on Sep 5, 2023
@shadeMe added this to the Undecided milestone on Oct 19, 2023