Commit

fix model_path and batch_size for sparse case
Shubhra Pandit committed Apr 11, 2024
1 parent 965b31a commit c3f7600
Showing 2 changed files with 4 additions and 4 deletions.
4 changes: 2 additions & 2 deletions docs/llms/guides/sparse-finetuning-llm-gsm8k-with-sparseml.md
@@ -224,7 +224,7 @@ accelerate launch \
  --learning_rate 0.00005 \
  --lr_scheduler_type "linear" \
  --max_seq_length 1024 \
- --per_device_train_batch_size 32 \
+ --per_device_train_batch_size 16 \
  --max_grad_norm None \
  --warmup_steps 20 \
  --distill_teacher PATH_TO_TEACHER \
@@ -331,7 +331,7 @@ MODEL_PATH=<MODEL_PATH>
  TASK=gsm8k
  python main.py \
  --model sparseml \
- --model_args pretrained=MODEL_PATH,trust_remote_code=True \
+ --model_args pretrained=${MODEL_PATH},trust_remote_code=True \
  --tasks $TASK \
  --batch_size 48 \
  --no_cache \
@@ -224,7 +224,7 @@ accelerate launch \
  --learning_rate 0.00005 \
  --lr_scheduler_type "linear" \
  --max_seq_length 1024 \
- --per_device_train_batch_size 32 \
+ --per_device_train_batch_size 16 \
  --max_grad_norm None \
  --warmup_steps 20 \
  --distill_teacher PATH_TO_TEACHER \
@@ -331,7 +331,7 @@ MODEL_PATH=<MODEL_PATH>
  TASK=gsm8k
  python main.py \
  --model sparseml \
- --model_args pretrained=MODEL_PATH,trust_remote_code=True \
+ --model_args pretrained=${MODEL_PATH},trust_remote_code=True \
  --tasks $TASK \
  --batch_size 48 \
  --no_cache \
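
The `--model_args` change depends on shell parameter expansion: without the `$`, the literal text `MODEL_PATH` is passed as the `pretrained` value instead of the path assigned on the `MODEL_PATH=<MODEL_PATH>` line. A minimal sketch of the difference follows; the path `/tmp/example-model` is a hypothetical placeholder and does not come from the commit.

```shell
#!/bin/sh
# Hypothetical placeholder path, used only for illustration.
MODEL_PATH=/tmp/example-model

# Before the fix: no "$", so the shell passes the literal word MODEL_PATH.
before="pretrained=MODEL_PATH,trust_remote_code=True"

# After the fix: ${MODEL_PATH} expands to the value assigned above.
after="pretrained=${MODEL_PATH},trust_remote_code=True"

echo "$before"
echo "$after"
```

The braces in `${MODEL_PATH}` are optional here because the following character is a comma, but they make the end of the variable name explicit.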
