metadata
license: other
base_model: HuggingFaceM4/idefics-9b
tags:
  - generated_from_trainer
model-index:
  - name: idefics-9b-dresses-gpt4
    results: []
library_name: peft

idefics-9b-dresses-gpt4

This model is a PEFT adapter fine-tuned from HuggingFaceM4/idefics-9b on an unspecified dataset. It reports the following results on the evaluation set:

  • Loss: nan

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

The following bitsandbytes quantization config was used during training:

  • quant_method: bitsandbytes
  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: ['lm_head', 'embed_tokens']
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: True
  • bnb_4bit_compute_dtype: float16
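The settings above map onto the keyword arguments of `BitsAndBytesConfig` in Transformers. The following is a minimal sketch of that mapping, not the author's actual training script; the names `BNB_KWARGS` and `build_quant_config` are hypothetical and introduced here for illustration only.

```python
# Keyword arguments mirroring the quantization settings listed in this card.
# quant_method is left out: it is not a constructor argument, the config
# class records it on its own.
BNB_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": ["lm_head", "embed_tokens"],
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
}


def build_quant_config():
    """Build the config object (hypothetical helper).

    Imports are local so the dict above can be inspected on a machine
    without torch or transformers installed.
    """
    import torch
    from transformers import BitsAndBytesConfig

    # float16 is the compute dtype listed in the card.
    return BitsAndBytesConfig(bnb_4bit_compute_dtype=torch.float16, **BNB_KWARGS)
```

The resulting object would typically be passed as `quantization_config=` to the base model's `from_pretrained` call; how the author actually loaded the model is not stated in this card.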

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 1000
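Two of these values follow from the others. The total train batch size is the per-device batch size times the gradient accumulation steps times the number of devices (1 × 8 × 1 = 8, which is consistent with a single device), and a linear scheduler decays the learning rate from its base value to zero over the training steps. The sketch below illustrates both; the function names are hypothetical, and the warmup of 0 steps is an assumption, since the card lists no warmup setting.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples contributing to one optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, base_lr=2e-4, total_steps=1000, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay.

    warmup_steps=0 is an assumption: the card does not list a warmup value.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(1, 8))
    for s in (0, 250, 500, 1000):
        print(s, linear_lr(s))
```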

Training results

Training Loss | Epoch | Step | Validation Loss
0.0 | 0.03 | 20 | nan
0.0 | 0.06 | 40 | nan
0.0 | 0.09 | 60 | nan
0.0 | 0.12 | 80 | nan
0.0 | 0.14 | 100 | nan
0.0 | 0.17 | 120 | nan
0.0 | 0.2 | 140 | nan
0.0 | 0.23 | 160 | nan
0.0 | 0.26 | 180 | nan
0.0 | 0.29 | 200 | nan
0.0 | 0.32 | 220 | nan
0.0 | 0.35 | 240 | nan
0.0 | 0.38 | 260 | nan
0.0 | 0.41 | 280 | nan
0.0 | 0.43 | 300 | nan
0.0 | 0.46 | 320 | nan
0.0 | 0.49 | 340 | nan
0.0 | 0.52 | 360 | nan
0.0 | 0.55 | 380 | nan
0.0 | 0.58 | 400 | nan
0.0 | 0.61 | 420 | nan
0.0 | 0.64 | 440 | nan
0.0 | 0.67 | 460 | nan
0.0 | 0.7 | 480 | nan
0.0 | 0.72 | 500 | nan
0.0 | 0.75 | 520 | nan
0.0 | 0.78 | 540 | nan
0.0 | 0.81 | 560 | nan
0.0 | 0.84 | 580 | nan
0.0 | 0.87 | 600 | nan
0.0 | 0.9 | 620 | nan
0.0 | 0.93 | 640 | nan
0.0 | 0.96 | 660 | nan
0.0 | 0.99 | 680 | nan
0.0 | 1.01 | 700 | nan
0.0 | 1.04 | 720 | nan
0.0 | 1.07 | 740 | nan
0.0 | 1.1 | 760 | nan
0.0 | 1.13 | 780 | nan
0.0 | 1.16 | 800 | nan
0.0 | 1.19 | 820 | nan
0.0 | 1.22 | 840 | nan
0.0 | 1.25 | 860 | nan
0.0 | 1.28 | 880 | nan
0.0 | 1.3 | 900 | nan
0.0 | 1.33 | 920 | nan
0.0 | 1.36 | 940 | nan
0.0 | 1.39 | 960 | nan
0.0 | 1.42 | 980 | nan
0.0 | 1.45 | 1000 | nan
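Every row in this log reports a training loss of 0.0 and a validation loss of nan. That pattern is often associated with numerical problems during training (for example, overflow under a float16 compute dtype), although this card does not state the cause. The sketch below shows one way to flag such rows automatically; `summarize_log` is a hypothetical helper, and the three sample rows are copied from the first two and the last evaluation of the log.

```python
import math


def summarize_log(rows):
    """Summarize (train_loss, epoch, step, val_loss) tuples.

    Returns the number of evaluations, the steps whose validation loss is
    not finite (nan or inf), and whether every training loss is exactly 0.
    """
    rows = list(rows)
    bad_steps = [step for _, _, step, val in rows if not math.isfinite(val)]
    all_zero_train = all(train == 0.0 for train, _, _, _ in rows)
    return {
        "evaluations": len(rows),
        "non_finite_val_steps": bad_steps,
        "all_train_loss_zero": all_zero_train,
    }


# Three rows taken from the training log in this card.
log = [
    (0.0, 0.03, 20, float("nan")),
    (0.0, 0.06, 40, float("nan")),
    (0.0, 1.45, 1000, float("nan")),
]
report = summarize_log(log)
print(report)
```

A check like this could run after each evaluation so that a run is stopped at the first non-finite loss instead of continuing to step 1000.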

Framework versions

  • PEFT 0.6.0.dev0
  • Transformers 4.32.1
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
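To compare a local environment against the versions listed above, the standard-library `importlib.metadata` module can be used. This is a sketch only: the helper name `version_mismatches` is hypothetical, and the distribution names (`peft`, `transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names rather than something this card states. PEFT 0.6.0.dev0 is a development version, so an exact match is unlikely with a released package.

```python
from importlib import metadata

# Versions as listed in this card, keyed by the assumed PyPI distribution name.
EXPECTED = {
    "peft": "0.6.0.dev0",
    "transformers": "4.32.1",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def version_mismatches(expected=EXPECTED):
    """Return {package: (wanted, installed)} for every package that differs.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in version_mismatches().items():
        print(f"{name}: card lists {wanted}, found {installed}")
```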