
about loss #65

Open
haoyuwangwhy opened this issue Apr 29, 2024 · 1 comment

Comments

@haoyuwangwhy

Hi, I am trying to fine-tune LLaMA on commonsense_170k. However, I find that once the loss value reaches around 0.6, it almost stops decreasing. Is this normal? This is the command I ran:

WORLD_SIZE=2 CUDA_VISIBLE_DEVICES=1,2,3,4 torchrun --nproc_per_node=4 finetune.py \
  --base_model 'yahma/llama-7b-hf' \
  --data_path './LLM-Adapters/ft-training_set/commonsense_170k.json' \
  --output_dir './trained_models/llama-sparselora-commonsense_new' \
  --batch_size 16 --micro_batch_size 4 --num_epochs 3 \
  --learning_rate 3e-4 --cutoff_len 256 --val_set_size 0 \
  --adapter_name lora --lora_r=32 \
  --lora_target_modules=["k_proj","q_proj","v_proj","down_proj","up_proj"] \
  --lora_alpha=64
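For reference, this is how I inspect the loss curve: a minimal sketch, assuming finetune.py uses the Hugging Face Trainer, which writes a trainer_state.json with a log_history list into the output directory (the exact path below is from my --output_dir; yours may include a checkpoint-* subdirectory):

```python
import json
import matplotlib.pyplot as plt

# Assumed path: the --output_dir passed to finetune.py (or a checkpoint-* subdir inside it).
state_path = "./trained_models/llama-sparselora-commonsense_new/trainer_state.json"

with open(state_path) as f:
    log_history = json.load(f)["log_history"]

# Keep only the entries that record a training loss.
steps = [e["step"] for e in log_history if "loss" in e]
losses = [e["loss"] for e in log_history if "loss" in e]

plt.plot(steps, losses)
plt.xlabel("step")
plt.ylabel("training loss")
plt.title("commonsense_170k fine-tuning loss")
plt.savefig("loss_curve.png")
```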

@HZQ950419
Collaborator

Hi @haoyuwangwhy ,

The loss can differ across datasets, and this value looks normal. You can evaluate the performance of the trained model to check whether the training worked well, for example with a quick generation smoke test like the sketch below.
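A minimal sketch of such a smoke test using transformers + peft; the adapter path matches your --output_dir, and the prompt is a toy placeholder (real evaluation should use the repo's evaluation scripts and datasets):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "yahma/llama-7b-hf"
# Assumed path: the --output_dir from your finetune.py run.
adapter = "./trained_models/llama-sparselora-commonsense_new"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.float16, device_map="auto"
)
# Load the trained LoRA weights on top of the base model.
model = PeftModel.from_pretrained(model, adapter)
model.eval()

# Toy commonsense-style prompt, just to see whether the adapter changed the model's behavior.
prompt = "Question: If you drop a glass on a hard floor, what will most likely happen?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```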
