paligemma_racer_longer_wu_larger_bs

This model is a fine-tuned version of google/paligemma-3b-pt-224 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7550

Model description

More information needed

Intended uses & limitations

More information needed
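
The card does not document intended uses, but as a minimal sketch (assuming the standard PaliGemma usage in Transformers applies to this checkpoint), the model can be loaded and queried as below. The image path and `caption en` task prefix are placeholders, not part of the original card:

```python
# Minimal inference sketch; assumes standard PaliGemma usage applies to this checkpoint.
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "mateoguaman/paligemma_racer_longer_wu_larger_bs"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)

image = Image.open("example.jpg")  # placeholder image path
prompt = "caption en"              # placeholder task prefix; adjust to the fine-tuning task
inputs = processor(text=prompt, images=image, return_tensors="pt")

with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=32)

# Strip the prompt tokens and decode only the newly generated text.
new_tokens = generated[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(new_tokens, skip_special_tokens=True))
```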

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: adamw_hf with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 2
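
The training script itself is not part of this card. As a sketch only, the values above would map onto transformers.TrainingArguments roughly as follows; output_dir is a placeholder:

```python
# Sketch only: mapping the hyperparameters above onto transformers.TrainingArguments.
# The actual training script is not included in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="paligemma_racer_longer_wu_larger_bs",  # placeholder
    learning_rate=2e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=4,  # effective train batch size: 8 * 4 = 32
    optim="adamw_hf",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=2,
)
```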

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 16.4798       | 0.0419 | 50   | 16.1631         |
| 14.3451       | 0.0837 | 100  | 11.3560         |
| 7.8988        | 0.1256 | 150  | 6.2607          |
| 5.8302        | 0.1674 | 200  | 5.3570          |
| 5.1658        | 0.2093 | 250  | 4.8183          |
| 4.7276        | 0.2512 | 300  | 4.4561          |
| 4.4283        | 0.2930 | 350  | 4.2387          |
| 4.2557        | 0.3349 | 400  | 4.0384          |
| 4.0452        | 0.3767 | 450  | 3.8548          |
| 3.8255        | 0.4186 | 500  | 3.7227          |
| 3.7081        | 0.4604 | 550  | 3.5824          |
| 3.5892        | 0.5023 | 600  | 3.4951          |
| 3.4664        | 0.5442 | 650  | 3.3667          |
| 3.4128        | 0.5860 | 700  | 3.2978          |
| 3.3171        | 0.6279 | 750  | 3.2273          |
| 3.253         | 0.6697 | 800  | 3.2083          |
| 3.1882        | 0.7116 | 850  | 3.1011          |
| 3.1445        | 0.7535 | 900  | 3.0567          |
| 3.1211        | 0.7953 | 950  | 3.0514          |
| 3.0509        | 0.8372 | 1000 | 3.0533          |
| 3.024         | 0.8790 | 1050 | 2.9892          |
| 2.9578        | 0.9209 | 1100 | 2.9652          |
| 2.9466        | 0.9627 | 1150 | 2.9208          |
| 2.8977        | 1.0046 | 1200 | 2.9276          |
| 2.8674        | 1.0465 | 1250 | 2.8737          |
| 2.838         | 1.0883 | 1300 | 2.8679          |
| 2.8106        | 1.1302 | 1350 | 2.8425          |
| 2.7897        | 1.1720 | 1400 | 2.8235          |
| 2.7793        | 1.2139 | 1450 | 2.8163          |
| 2.7553        | 1.2558 | 1500 | 2.8196          |
| 2.7579        | 1.2976 | 1550 | 2.8118          |
| 2.7189        | 1.3395 | 1600 | 2.7977          |
| 2.7381        | 1.3813 | 1650 | 2.8012          |
| 2.738         | 1.4232 | 1700 | 2.7779          |
| 2.7029        | 1.4650 | 1750 | 2.7757          |
| 2.7094        | 1.5069 | 1800 | 2.7749          |
| 2.6883        | 1.5488 | 1850 | 2.7701          |
| 2.6682        | 1.5906 | 1900 | 2.7634          |
| 2.7208        | 1.6325 | 1950 | 2.7659          |
| 2.6934        | 1.6743 | 2000 | 2.7587          |
| 2.6738        | 1.7162 | 2050 | 2.7607          |
| 2.6813        | 1.7581 | 2100 | 2.7605          |
| 2.6845        | 1.7999 | 2150 | 2.7589          |
| 2.6511        | 1.8418 | 2200 | 2.7560          |
| 2.6599        | 1.8836 | 2250 | 2.7565          |
| 2.6527        | 1.9255 | 2300 | 2.7541          |
| 2.6451        | 1.9674 | 2350 | 2.7550          |

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.5.1
  • Datasets 3.1.0
  • Tokenizers 0.20.3
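
As an optional check (not from the original card), the installed versions can be compared against the list above:

```python
# Optional environment check; expected values are the versions listed above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # expected 4.46.3
print("PyTorch:", torch.__version__)              # expected 2.5.1
print("Datasets:", datasets.__version__)          # expected 3.1.0
print("Tokenizers:", tokenizers.__version__)      # expected 0.20.3
```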