---
library_name: peft
license: llama3.2
base_model: meta-llama/Llama-3.2-11B-Vision-Instruct
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: fine-tuned-visionllama_6
    results: []
---


# fine-tuned-visionllama_6

This model is a fine-tuned version of meta-llama/Llama-3.2-11B-Vision-Instruct on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 2.0262
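Since `library_name` is `peft`, this repository holds a PEFT adapter rather than full model weights. A minimal loading sketch, assuming the adapter is published under a placeholder repo id `your-username/fine-tuned-visionllama_6` (replace with the actual id) and that you have access to the gated base model:

```python
# Sketch: load the base vision model and attach this PEFT adapter.
# "your-username/fine-tuned-visionllama_6" is a placeholder, not the real repo id.
import torch
from transformers import MllamaForConditionalGeneration, AutoProcessor
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
adapter_id = "your-username/fine-tuned-visionllama_6"  # placeholder

processor = AutoProcessor.from_pretrained(base_id)
base = MllamaForConditionalGeneration.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```

`model.merge_and_unload()` can then be used if you want a standalone merged checkpoint instead of keeping the adapter separate.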

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2
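The total train batch size above follows from the per-device batch size times the gradient accumulation steps (a single device is assumed here; the card does not state the device count). A quick sanity check:

```python
# Effective (total) train batch size =
#   per-device batch size x gradient accumulation steps x number of devices.
train_batch_size = 2
gradient_accumulation_steps = 32
num_devices = 1  # assumption; not stated in the card

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 64
```

With accumulation, the optimizer only steps every 32 forward/backward passes, which is why 540 optimizer steps cover 2 epochs despite the small per-device batch.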

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 6.2195        | 0.0184 | 5    | 2.9157          |
| 2.5279        | 0.0368 | 10   | 2.3537          |
| 2.3247        | 0.0552 | 15   | 2.2259          |
| 2.237         | 0.0736 | 20   | 2.1789          |
| 2.1879        | 0.0920 | 25   | 2.1520          |
| 2.1381        | 0.1104 | 30   | 2.1353          |
| 2.1285        | 0.1288 | 35   | 2.1204          |
| 2.1076        | 0.1472 | 40   | 2.1085          |
| 2.0869        | 0.1657 | 45   | 2.1010          |
| 2.0694        | 0.1841 | 50   | 2.0966          |
| 2.0917        | 0.2025 | 55   | 2.0899          |
| 2.0597        | 0.2209 | 60   | 2.0858          |
| 2.05          | 0.2393 | 65   | 2.0810          |
| 2.0764        | 0.2577 | 70   | 2.0784          |
| 2.0872        | 0.2761 | 75   | 2.0763          |
| 2.0388        | 0.2945 | 80   | 2.0734          |
| 2.057         | 0.3129 | 85   | 2.0704          |
| 2.0423        | 0.3313 | 90   | 2.0667          |
| 2.022         | 0.3497 | 95   | 2.0647          |
| 2.0281        | 0.3681 | 100  | 2.0631          |
| 2.0407        | 0.3865 | 105  | 2.0638          |
| 2.0284        | 0.4049 | 110  | 2.0617          |
| 2.0311        | 0.4233 | 115  | 2.0597          |
| 2.0093        | 0.4417 | 120  | 2.0578          |
| 2.0191        | 0.4601 | 125  | 2.0543          |
| 2.0316        | 0.4785 | 130  | 2.0539          |
| 2.0243        | 0.4970 | 135  | 2.0526          |
| 1.9983        | 0.5154 | 140  | 2.0520          |
| 2.0298        | 0.5338 | 145  | 2.0530          |
| 2.0217        | 0.5522 | 150  | 2.0511          |
| 2.0115        | 0.5706 | 155  | 2.0488          |
| 1.9883        | 0.5890 | 160  | 2.0481          |
| 2.0207        | 0.6074 | 165  | 2.0462          |
| 2.0069        | 0.6258 | 170  | 2.0453          |
| 2.0045        | 0.6442 | 175  | 2.0432          |
| 2.0034        | 0.6626 | 180  | 2.0435          |
| 1.9921        | 0.6810 | 185  | 2.0426          |
| 1.9912        | 0.6994 | 190  | 2.0419          |
| 1.9969        | 0.7178 | 195  | 2.0403          |
| 2.0093        | 0.7362 | 200  | 2.0391          |
| 2.0154        | 0.7546 | 205  | 2.0389          |
| 1.9934        | 0.7730 | 210  | 2.0380          |
| 1.9926        | 0.7914 | 215  | 2.0354          |
| 1.9771        | 0.8098 | 220  | 2.0352          |
| 1.9819        | 0.8283 | 225  | 2.0330          |
| 1.9779        | 0.8467 | 230  | 2.0333          |
| 1.9846        | 0.8651 | 235  | 2.0340          |
| 1.9913        | 0.8835 | 240  | 2.0335          |
| 1.9834        | 0.9019 | 245  | 2.0319          |
| 1.9786        | 0.9203 | 250  | 2.0312          |
| 1.9726        | 0.9387 | 255  | 2.0306          |
| 1.9793        | 0.9571 | 260  | 2.0293          |
| 1.971         | 0.9755 | 265  | 2.0298          |
| 1.973         | 0.9939 | 270  | 2.0298          |
| 1.9651        | 1.0123 | 275  | 2.0307          |
| 1.9619        | 1.0307 | 280  | 2.0308          |
| 1.9536        | 1.0491 | 285  | 2.0320          |
| 1.9618        | 1.0675 | 290  | 2.0327          |
| 1.9555        | 1.0859 | 295  | 2.0307          |
| 1.9704        | 1.1043 | 300  | 2.0294          |
| 1.9609        | 1.1227 | 305  | 2.0290          |
| 1.9745        | 1.1411 | 310  | 2.0302          |
| 1.9707        | 1.1596 | 315  | 2.0268          |
| 1.9651        | 1.1780 | 320  | 2.0279          |
| 1.9745        | 1.1964 | 325  | 2.0276          |
| 1.9618        | 1.2148 | 330  | 2.0267          |
| 1.932         | 1.2332 | 335  | 2.0248          |
| 1.9495        | 1.2516 | 340  | 2.0258          |
| 1.9396        | 1.2700 | 345  | 2.0262          |
| 1.9277        | 1.2884 | 350  | 2.0264          |
| 1.9355        | 1.3068 | 355  | 2.0273          |
| 1.9502        | 1.3252 | 360  | 2.0273          |
| 1.9491        | 1.3436 | 365  | 2.0281          |
| 1.9489        | 1.3620 | 370  | 2.0274          |
| 1.9194        | 1.3804 | 375  | 2.0271          |
| 1.9179        | 1.3988 | 380  | 2.0258          |
| 1.9418        | 1.4172 | 385  | 2.0261          |
| 1.9618        | 1.4356 | 390  | 2.0269          |
| 1.9283        | 1.4540 | 395  | 2.0256          |
| 1.912         | 1.4724 | 400  | 2.0225          |
| 1.9284        | 1.4909 | 405  | 2.0230          |
| 1.9418        | 1.5093 | 410  | 2.0223          |
| 1.9245        | 1.5277 | 415  | 2.0241          |
| 1.9292        | 1.5461 | 420  | 2.0237          |
| 1.9442        | 1.5645 | 425  | 2.0241          |
| 1.9366        | 1.5829 | 430  | 2.0225          |
| 1.9318        | 1.6013 | 435  | 2.0233          |
| 1.9266        | 1.6197 | 440  | 2.0234          |
| 1.9211        | 1.6381 | 445  | 2.0218          |
| 1.9248        | 1.6565 | 450  | 2.0230          |
| 1.9476        | 1.6749 | 455  | 2.0227          |
| 1.9333        | 1.6933 | 460  | 2.0206          |
| 1.9193        | 1.7117 | 465  | 2.0196          |
| 1.9291        | 1.7301 | 470  | 2.0231          |
| 1.9009        | 1.7485 | 475  | 2.0223          |
| 1.9134        | 1.7669 | 480  | 2.0225          |
| 1.9337        | 1.7853 | 485  | 2.0200          |
| 1.9077        | 1.8038 | 490  | 2.0227          |
| 1.8962        | 1.8222 | 495  | 2.0227          |
| 1.9343        | 1.8406 | 500  | 2.0221          |
| 1.9307        | 1.8590 | 505  | 2.0237          |
| 1.9339        | 1.8774 | 510  | 2.0220          |
| 1.922         | 1.8958 | 515  | 2.0220          |
| 1.9289        | 1.9142 | 520  | 2.0220          |
| 1.9269        | 1.9326 | 525  | 2.0231          |
| 1.9149        | 1.9510 | 530  | 2.0216          |
| 1.8962        | 1.9694 | 535  | 2.0252          |
| 1.9568        | 1.9878 | 540  | 2.0262          |

### Framework versions

- PEFT 0.13.0
- Transformers 4.45.1
- Pytorch 2.2.2+cu121
- Datasets 3.0.1
- Tokenizers 0.20.3
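To reproduce this environment, the versions above can be pinned directly. A sketch, assuming the usual PyPI package names and the official PyTorch wheel index for the CUDA 12.1 build:

```shell
# Pin the framework versions listed in this card.
pip install "peft==0.13.0" "transformers==4.45.1" "datasets==3.0.1" "tokenizers==0.20.3"
# PyTorch 2.2.2 built against CUDA 12.1:
pip install "torch==2.2.2" --index-url https://download.pytorch.org/whl/cu121
```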