paligemma_racer

This model is a fine-tuned version of google/paligemma-3b-pt-224 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.9411
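
The card does not include a usage example. Below is a minimal inference sketch assuming the standard PaliGemma interface in transformers; the image path and the "caption en" task prefix are placeholders, since the fine-tuning dataset and task are not documented.

```python
from PIL import Image
import torch
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "mateoguaman/paligemma_racer"
model = PaliGemmaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # placeholder: any RGB image
prompt = "caption en"              # placeholder: the actual task prefix is undocumented

inputs = processor(text=prompt, images=image, return_tensors="pt").to(
    model.device, dtype=torch.bfloat16
)
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=32)

# Decode only the generated continuation, skipping the prompt tokens.
input_len = inputs["input_ids"].shape[-1]
print(processor.decode(output[0][input_len:], skip_special_tokens=True))
```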

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_hf with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 2
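
These settings map one-to-one onto transformers TrainingArguments fields. A hypothetical reconstruction, since the actual training script is not part of this card (output_dir is a placeholder):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="paligemma_racer",   # placeholder output directory
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,  # per-device batch 4 x accumulation 4 = total batch 16
    seed=42,
    optim="adamw_hf",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=2,
    bf16=True,                      # assumption: the published weights are BF16
)
```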

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 13.5721       | 0.0209 | 50   | 7.2124          |
| 5.9601        | 0.0419 | 100  | 5.0726          |
| 4.731         | 0.0628 | 150  | 4.4045          |
| 4.2565        | 0.0837 | 200  | 4.0800          |
| 4.0418        | 0.1047 | 250  | 3.8702          |
| 3.866         | 0.1256 | 300  | 3.7370          |
| 3.6864        | 0.1465 | 350  | 3.5834          |
| 3.649         | 0.1675 | 400  | 3.5023          |
| 3.572         | 0.1884 | 450  | 3.4903          |
| 3.4765        | 0.2093 | 500  | 3.4307          |
| 3.406         | 0.2302 | 550  | 3.3801          |
| 3.3997        | 0.2512 | 600  | 3.3027          |
| 3.3602        | 0.2721 | 650  | 3.2871          |
| 3.2852        | 0.2930 | 700  | 3.2509          |
| 3.3183        | 0.3140 | 750  | 3.2354          |
| 3.3281        | 0.3349 | 800  | 3.2133          |
| 3.2545        | 0.3558 | 850  | 3.2098          |
| 3.3173        | 0.3768 | 900  | 3.1909          |
| 3.1993        | 0.3977 | 950  | 3.1646          |
| 3.1705        | 0.4186 | 1000 | 3.1401          |
| 3.1976        | 0.4396 | 1050 | 3.1217          |
| 3.1514        | 0.4605 | 1100 | 3.1340          |
| 3.1832        | 0.4814 | 1150 | 3.1282          |
| 3.1222        | 0.5024 | 1200 | 3.0997          |
| 3.1003        | 0.5233 | 1250 | 3.0788          |
| 3.0833        | 0.5442 | 1300 | 3.0735          |
| 3.099         | 0.5651 | 1350 | 3.0665          |
| 3.1295        | 0.5861 | 1400 | 3.0534          |
| 3.0962        | 0.6070 | 1450 | 3.0392          |
| 3.0589        | 0.6279 | 1500 | 3.0325          |
| 3.075         | 0.6489 | 1550 | 3.0311          |
| 3.034         | 0.6698 | 1600 | 3.0461          |
| 3.0333        | 0.6907 | 1650 | 3.0190          |
| 3.0494        | 0.7117 | 1700 | 3.0174          |
| 3.071         | 0.7326 | 1750 | 3.0123          |
| 3.0147        | 0.7535 | 1800 | 3.0020          |
| 3.0114        | 0.7745 | 1850 | 3.0074          |
| 3.0635        | 0.7954 | 1900 | 3.0224          |
| 2.9939        | 0.8163 | 1950 | 2.9942          |
| 3.0373        | 0.8373 | 2000 | 2.9888          |
| 2.998         | 0.8582 | 2050 | 2.9905          |
| 3.0004        | 0.8791 | 2100 | 2.9883          |
| 2.9477        | 0.9001 | 2150 | 2.9887          |
| 2.9837        | 0.9210 | 2200 | 2.9830          |
| 2.9501        | 0.9419 | 2250 | 2.9788          |
| 3.0235        | 0.9628 | 2300 | 2.9877          |
| 3.0083        | 0.9838 | 2350 | 2.9723          |
| 2.9368        | 1.0047 | 2400 | 2.9775          |
| 2.9975        | 1.0256 | 2450 | 2.9712          |
| 2.9089        | 1.0466 | 2500 | 2.9616          |
| 2.9285        | 1.0675 | 2550 | 2.9669          |
| 2.9627        | 1.0884 | 2600 | 2.9668          |
| 2.9195        | 1.1094 | 2650 | 2.9683          |
| 2.9319        | 1.1303 | 2700 | 2.9607          |
| 2.9009        | 1.1512 | 2750 | 2.9592          |
| 2.9486        | 1.1722 | 2800 | 2.9525          |
| 2.9416        | 1.1931 | 2850 | 2.9532          |
| 2.9223        | 1.2140 | 2900 | 2.9547          |
| 2.9257        | 1.2350 | 2950 | 2.9520          |
| 2.9182        | 1.2559 | 3000 | 2.9516          |
| 2.9255        | 1.2768 | 3050 | 2.9502          |
| 2.9113        | 1.2977 | 3100 | 2.9579          |
| 2.9165        | 1.3187 | 3150 | 2.9584          |
| 2.8901        | 1.3396 | 3200 | 2.9528          |
| 2.921         | 1.3605 | 3250 | 2.9470          |
| 2.9299        | 1.3815 | 3300 | 2.9481          |
| 2.9728        | 1.4024 | 3350 | 2.9458          |
| 2.919         | 1.4233 | 3400 | 2.9446          |
| 2.9132        | 1.4443 | 3450 | 2.9446          |
| 2.9178        | 1.4652 | 3500 | 2.9486          |
| 2.9293        | 1.4861 | 3550 | 2.9450          |
| 2.9514        | 1.5071 | 3600 | 2.9431          |
| 2.9099        | 1.5280 | 3650 | 2.9444          |
| 2.9292        | 1.5489 | 3700 | 2.9449          |
| 2.9336        | 1.5699 | 3750 | 2.9445          |
| 2.8772        | 1.5908 | 3800 | 2.9446          |
| 2.9389        | 1.6117 | 3850 | 2.9444          |
| 2.9618        | 1.6327 | 3900 | 2.9448          |
| 2.9721        | 1.6536 | 3950 | 2.9425          |
| 2.9052        | 1.6745 | 4000 | 2.9406          |
| 2.9245        | 1.6954 | 4050 | 2.9448          |
| 2.9196        | 1.7164 | 4100 | 2.9429          |
| 2.9622        | 1.7373 | 4150 | 2.9408          |
| 2.9199        | 1.7582 | 4200 | 2.9394          |
| 2.9114        | 1.7792 | 4250 | 2.9385          |
| 2.9548        | 1.8001 | 4300 | 2.9402          |
| 2.9263        | 1.8210 | 4350 | 2.9405          |
| 2.9079        | 1.8420 | 4400 | 2.9414          |
| 2.9144        | 1.8629 | 4450 | 2.9367          |
| 2.8985        | 1.8838 | 4500 | 2.9412          |
| 2.8942        | 1.9048 | 4550 | 2.9446          |
| 2.91          | 1.9257 | 4600 | 2.9424          |
| 2.8951        | 1.9466 | 4650 | 2.9414          |
| 2.9054        | 1.9676 | 4700 | 2.9411          |
| 2.8909        | 1.9885 | 4750 | 2.9411          |

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.5.1
  • Datasets 3.1.0
  • Tokenizers 0.20.3