How to conduct full training and switch to 13B for training? #6

Mryangkaitong opened this issue Apr 8, 2024 · 1 comment

Excellent work!!!
If I want to conduct full-parameter training (non-LoRA) on LLaMA-2 13B, where should I modify the code in stages 1-3 to achieve the following two things:

(1) Change the base model to 13B

(2) Use full-parameter training

Thanks

@KerolosAtef (Collaborator) commented Apr 25, 2024

For LLaMA-2 13B: change the LLaMA model path in the training config file for each stage.

To turn off LoRA and use full-parameter training, comment out the LoRA setup in minigpt4/models/mini_gpt4_llama_v2.py:
```python
loraconfig = LoraConfig(
    r=lora_r,
    lora_alpha=lora_alpha,
    target_modules=lora_target_modules,
    lora_dropout=lora_dropout,
    bias="none",
    task_type="CAUSAL_LM"
)
self.llama_model = get_peft_model(self.llama_model, loraconfig)

self.llama_model.print_trainable_parameters()
```
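With that block commented out, the base weights are trained directly. A minimal sketch of how the edited section might look (hedged: the `requires_grad` loop below is an assumption for illustration and is only needed if the surrounding init code freezes the LLaMA weights, which may differ in your copy of mini_gpt4_llama_v2.py):

```python
# LoRA setup commented out to enable full-parameter training:
# loraconfig = LoraConfig(
#     r=lora_r,
#     lora_alpha=lora_alpha,
#     target_modules=lora_target_modules,
#     lora_dropout=lora_dropout,
#     bias="none",
#     task_type="CAUSAL_LM"
# )
# self.llama_model = get_peft_model(self.llama_model, loraconfig)

# Assumption: if the init code freezes the base LLaMA weights elsewhere,
# re-enable gradients so every parameter is actually updated.
for param in self.llama_model.parameters():
    param.requires_grad = True
```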
        
To fit in CUDA memory while training on an A100 (maximum batch_size=4), the model is also prepared for int8 training:

```python
self.llama_model = prepare_model_for_int8_training(self.llama_model)
```

You can comment this line out as well if needed, but keep an eye on CUDA memory; without it, training did not fit for me even with batch_size=1.
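For context on why a full 13B fine-tune can fail to fit even at batch_size=1, here is a rough back-of-the-envelope estimate (assuming standard mixed-precision Adam: fp16 weights + fp16 grads + fp32 master weights + two fp32 moments, i.e. about 16 bytes of persistent state per parameter, activations not counted):

```python
# Rough estimate of persistent training state for a 13B-parameter model.
params = 13e9
bytes_per_param = 2 + 2 + 4 + 4 + 4  # fp16 weights, fp16 grads, fp32 master copy, 2x fp32 Adam moments
print(f"{params * bytes_per_param / 1e9:.0f} GB")  # ~208 GB, well beyond a single 80 GB A100
```

So without sharding the optimizer state across GPUs (e.g. DeepSpeed ZeRO or FSDP) or using quantized/parameter-efficient training, a single A100 is unlikely to hold a full-parameter 13B fine-tune.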
