Hi, I am trying to fine-tune LLaMA on commonsense_170k. However, I find that once the loss reaches about 0.6, it barely decreases any further. Is this normal?
The loss differs from dataset to dataset, so the absolute value is not very informative on its own. This loss looks normal. To check whether training worked well, evaluate the performance of the trained model.
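As a rough illustration of that kind of check, here is a minimal sketch that scores generated answers against gold labels by accuracy. It is not the repository's evaluation script: the helper names `extract_choice` and `accuracy`, the example outputs, and the `true`/`false` label set are all hypothetical, and the answer-matching rule (earliest whole-word match) is an assumption you would adapt to your task's answer format.

```python
import re


def extract_choice(text, choices):
    """Return the choice that appears earliest in `text` as a whole word, else None."""
    lowered = text.lower()
    hits = []
    for choice in choices:
        match = re.search(r"\b" + re.escape(choice.lower()) + r"\b", lowered)
        if match:
            hits.append((match.start(), choice))
    return min(hits)[1] if hits else None


def accuracy(outputs, labels, choices):
    """Fraction of model outputs whose extracted choice equals the gold label."""
    if not labels or len(outputs) != len(labels):
        raise ValueError("outputs and labels must be non-empty and the same length")
    correct = sum(
        extract_choice(out, choices) == gold for out, gold in zip(outputs, labels)
    )
    return correct / len(labels)


# Hypothetical model generations and gold labels, for illustration only.
outputs = ["the correct answer is true", "I think the answer is false.", "not sure"]
labels = ["true", "true", "false"]
print(accuracy(outputs, labels, choices=["true", "false"]))
```

If accuracy on a held-out set keeps improving across checkpoints (or is already reasonable), a training loss that has flattened around 0.6 is probably not a problem in itself.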
WORLD_SIZE=2 CUDA_VISIBLE_DEVICES=1,2,3,4 torchrun --nproc_per_node=4 finetune.py --base_model 'yahma/llama-7b-hf' --data_path './LLM-Adapters/ft-training_set/commonsense_170k.json' --output_dir './trained_models/llama-sparselora-commonsense_new' --batch_size 16 --micro_batch_size 4 --num_epochs 3 --learning_rate 3e-4 --cutoff_len 256 --val_set_size 0 --adapter_name lora --lora_r=32 --lora_target_modules=["k_proj","q_proj","v_proj","down_proj","up_proj"] --lora_alpha=64