Nayana-IR-colpali_v1_3-combined-15k-4bit-LoRA

This model is a fine-tuned version of vidore/colpaligemma-3b-pt-448-base on the Nayana-cognitivelab/Nayana-IR-DescVQA-finetune-hi-47k and Nayana-cognitivelab/Nayana-IR-DescVQA-finetune-kn-47k datasets. It achieves the following results on the evaluation set:

Loss: 0.2067
Model Preparation Time: 0.0054

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 1.5
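
To make the relationship between these values concrete, here is a minimal sketch of how the total train batch size and the linear warmup schedule can be computed. It is an illustration, not the training script used for this model. NUM_DEVICES = 1 and ASSUMED_TOTAL_STEPS = 1400 are assumptions: the first is inferred from 4 x 4 = 16, and the second is a round number near the last logged step in the results table. The helper names are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 100

# Assumptions (not stated in the card): one device, ~1400 optimizer steps.
NUM_DEVICES = 1
ASSUMED_TOTAL_STEPS = 1400


def effective_batch_size(per_device, accumulation, devices):
    """Samples consumed per optimizer step."""
    return per_device * accumulation * devices


def linear_lr(step, total_steps, peak=LEARNING_RATE, warmup=WARMUP_STEPS):
    """Linear warmup from 0 to peak, then linear decay to 0 at total_steps."""
    if step < warmup:
        return peak * step / max(1, warmup)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup))
    return peak * remaining


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES))
    for s in (0, 50, 100, 750, 1400):
        print(s, linear_lr(s, ASSUMED_TOTAL_STEPS))
```

With these assumptions the learning rate peaks at 5e-05 at step 100 and then falls linearly for the rest of the run.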

Training results

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time
No log	0.0011	1	0.7311	0.0054
0.3759	0.1067	100	0.3940	0.0054
0.3167	0.2133	200	0.3363	0.0054
0.2865	0.32	300	0.2893	0.0054
0.2177	0.4267	400	0.2825	0.0054
0.2268	0.5333	500	0.2437	0.0054
0.2296	0.64	600	0.2280	0.0054
0.1723	0.7467	700	0.2354	0.0054
0.1138	0.8533	800	0.2218	0.0054
0.1929	0.96	900	0.2086	0.0054
0.1176	1.0661	1000	0.2076	0.0054
0.1426	1.1728	1100	0.2061	0.0054
0.1247	1.2795	1200	0.2101	0.0054
0.0976	1.3861	1300	0.2087	0.0054
0.1236	1.4928	1400	0.2066	0.0054
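
The table can be summarized with a short script. The sketch below copies the step, epoch, and validation loss columns from the rows above, finds the row with the lowest validation loss, and estimates the number of samples per epoch from the step-to-epoch ratio. The function names are hypothetical, and the samples-per-epoch figure is an estimate that assumes every optimizer step consumed the full total_train_batch_size of 16.

```python
# (step, epoch, validation_loss) copied from the training results table.
ROWS = [
    (1, 0.0011, 0.7311),
    (100, 0.1067, 0.3940),
    (200, 0.2133, 0.3363),
    (300, 0.32, 0.2893),
    (400, 0.4267, 0.2825),
    (500, 0.5333, 0.2437),
    (600, 0.64, 0.2280),
    (700, 0.7467, 0.2354),
    (800, 0.8533, 0.2218),
    (900, 0.96, 0.2086),
    (1000, 1.0661, 0.2076),
    (1100, 1.1728, 0.2061),
    (1200, 1.2795, 0.2101),
    (1300, 1.3861, 0.2087),
    (1400, 1.4928, 0.2066),
]

TOTAL_TRAIN_BATCH_SIZE = 16  # from the hyperparameter list


def best_row(rows):
    """Return the (step, epoch, loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def samples_per_epoch(step, epoch, batch_size=TOTAL_TRAIN_BATCH_SIZE):
    """Estimate samples per epoch, assuming full batches at every step."""
    return step / epoch * batch_size


if __name__ == "__main__":
    step, epoch, loss = best_row(ROWS)
    print(f"lowest validation loss {loss} at step {step} (epoch {epoch})")
    print(f"estimated samples per epoch: {samples_per_epoch(300, 0.32):.0f}")
```

The estimate comes out near 15,000 samples per epoch, which is consistent with the "15k" in the model name, although the card does not state the training set size explicitly.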

Framework versions

Transformers 4.47.1
Pytorch 2.6.0+cu124
Datasets 3.3.2
Tokenizers 0.21.0
