Phi-2 gives me a CUDA Out of Memory error but Mistral-7B works fine.

#12
opened by dpasch01

I don't understand why the following command throws a CUDA OOM error, while the same setup with ehartford/dolphin-2.2.1-mistral-7b works just fine.

!autotrain llm --train \
    --project-name "phi-2-test" \
    --model "cognitivecomputations/dolphin-2_6-phi-2" \
    --data-path ./ \
    --train-split "train" \
    --valid-split "test" \
    --text_column "prompt" \
    --lr 2e-5 \
    --batch-size 2 \
    --gradient-accumulation 5 \
    --epochs 2 \
    --merge_adapter \
    --model_max_length 512 \
    --trainer sft \
    --use-peft \
    --quantization int4 \
    --mixed-precision fp16 \
    --optimizer paged_adamw_32bit \
    --add_eos_token \
    --lora_r 16 \
    --lora_alpha 32 \
    --lora_dropout 0.05 \
    --target_modules "q_proj,k_proj,v_proj,dense,fc1,fc2"
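For context, a quick back-of-envelope estimate of the quantized weight footprint (parameter counts are my assumptions, roughly 2.7B for phi-2 and 7.2B for Mistral-7B; int4 is about 0.5 bytes per parameter) shows the phi-2 weights should be much smaller than Mistral-7B's, which is what makes the OOM surprising:

```python
# Rough weight-memory estimate; deliberately ignores activations, LoRA
# adapters, optimizer state, and CUDA overhead, which dominate during
# training. Parameter counts below are assumptions, not from the thread.
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory footprint of the model weights in GiB."""
    return n_params * bytes_per_param / 1024**3

phi2_int4 = weight_memory_gb(2.7e9, 0.5)      # ~1.3 GiB
mistral_int4 = weight_memory_gb(7.2e9, 0.5)   # ~3.4 GiB

print(f"phi-2 int4 weights:   ~{phi2_int4:.1f} GiB")
print(f"mistral int4 weights: ~{mistral_int4:.1f} GiB")
```

Since the quantized weights alone are small, the OOM presumably comes from somewhere else in the pipeline (activation memory, the `--merge_adapter` step, or architecture-specific handling), which is worth raising with the autotrain maintainers.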
Cognitive Computations org

You will need to ask the "autotrain" team about this; I don't know anything about that project.

ehartford changed discussion status to closed
