
Reproducing sentiment finetuning train_lora extremely slow #146

Open
vikigenius opened this issue Jan 5, 2024 · 1 comment
Labels
help wanted Extra attention is needed

Comments

@vikigenius

I am trying to reproduce the fine-tuning for fingpt-sentiment_llama2-13b_lora.

The table claims this can be done on a single RTX 3090 within a day. I am using an L4 GPU instead.

I downloaded the models to base_models and the dataset to data correctly.

I invoked the script like this:

deepspeed -i train_lora.py \
--run_name sentiment-llama2-13b-20epoch-64batch \
--base_model llama2-13b-nr \
--dataset sentiment-train \
--max_length 512 \
--batch_size 16 \
--learning_rate 1e-4 \
--num_epochs 20
I got an OOM.

So I set load_in_8bit=True.

But now fine-tuning is extremely slow: a single epoch is estimated to take 2 days.
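A quick back-of-envelope calculation suggests why this happens (a sketch; the 13B parameter count is the nominal model size, and 24 GB is the VRAM of both the L4 and the RTX 3090):

```python
# Rough memory footprint of the model weights alone (optimizer state and
# activations come on top of this).
# Assumptions: 13e9 parameters, fp16 = 2 bytes/param, int8 = 1 byte/param.
PARAMS = 13e9
GB = 1024**3

fp16_gb = PARAMS * 2 / GB  # ~24.2 GB: weights alone overflow a 24 GB card -> OOM
int8_gb = PARAMS * 1 / GB  # ~12.1 GB: fits, leaving room for LoRA grads + activations

print(f"fp16 weights: {fp16_gb:.1f} GB, int8 weights: {int8_gb:.1f} GB")
```

So 8-bit loading is what makes 13B fit on a 24 GB card at all; the slowdown is the price of the extra dequantization work during each forward pass.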

@zhumingpassional added the help wanted (Extra attention is needed) label and removed the bug (Something isn't working) label on Jan 30, 2024
@ynjiun commented Feb 13, 2024

Two things you might want to consider to speed things up:

  1. --base_model llama2-13b-nr => --base_model llama2-7b-nr
  2. use an RTX 3090, which beats the L4 in memory bandwidth and has more cores
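A very rough estimate of the combined effect of both suggestions (a sketch; the bandwidth figures are approximate published specs, ~936 GB/s for the RTX 3090 and ~300 GB/s for the L4, and step time is assumed roughly proportional to parameter count and inversely proportional to memory bandwidth):

```python
# Hedged back-of-envelope speedup estimate: 13B -> 7B model plus L4 -> RTX 3090.
# Spec numbers are approximate; real speedup depends on the actual training regime.
params_13b, params_7b = 13e9, 7e9
bw_3090, bw_l4 = 936, 300  # GB/s memory bandwidth, approximate

model_speedup = params_13b / params_7b  # ~1.9x from the smaller model
gpu_speedup = bw_3090 / bw_l4           # ~3.1x from the faster card
combined = model_speedup * gpu_speedup  # ~5.8x, very roughly

print(f"~{combined:.1f}x faster; a 2-day epoch would drop to roughly "
      f"{48 / combined:.0f} hours")
```

That would bring the reported 2-day epoch down into the same ballpark as the "single RTX 3090 within a day" claim in the table.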

3 participants