metadata

license: other
base_model: Qwen/Qwen1.5-4B
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: squad_qa_baseline_v5_full_Qwen_Qwen1.5-4B_3e-5_lora
    results: []
library_name: peft

squad_qa_baseline_v5_full_Qwen_Qwen1.5-4B_3e-5_lora

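This model is a LoRA adapter, trained with PEFT, for Qwen/Qwen1.5-4B. The training dataset is not recorded in the card metadata (the model name suggests SQuAD-style question answering, but the card does not confirm this). At the final checkpoint (step 3700, epoch 49.58) it achieves the following results on the evaluation set: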

Loss: 3.8632
Accuracy: 0.5660
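
Since the card lists `library_name: peft`, the weights are presumably an adapter that has to be attached to the base model at load time. The following is a minimal sketch of how that might look, not a documented usage recipe from the author. `ADAPTER_ID` is a hypothetical placeholder (the hosting namespace is not given in this card), and treating the base model as a causal language model is an assumption.

```python
BASE_MODEL_ID = "Qwen/Qwen1.5-4B"  # base_model from the card metadata
# Hypothetical placeholder: replace with the real repository ID or a local path.
ADAPTER_ID = "your-namespace/squad_qa_baseline_v5_full_Qwen_Qwen1.5-4B_3e-5_lora"


def load_adapter(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Load the base model and attach the LoRA adapter on top of it."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # Assumption: the adapter was trained on the causal-LM head of the base model.
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_adapter()` downloads the 4B-parameter base model, so it needs network access and enough memory for the full weights.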

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 1
eval_batch_size: 2
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 8
total_train_batch_size: 32
total_eval_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant
lr_scheduler_warmup_ratio: 0.05
num_epochs: 50.0
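
The total batch sizes follow from the per-device values, the device count, and the gradient accumulation steps. The sketch below reproduces that arithmetic and maps the listed values onto the keyword names used by `transformers.TrainingArguments`. The training-set size is only an estimate, derived from the results table reporting epoch 8.0 at step 597; it is not stated anywhere in the card.

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 1
per_device_eval_batch_size = 2
num_devices = 4
gradient_accumulation_steps = 8

# Examples consumed per optimizer step: 1 * 4 * 8 = 32.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
# Evaluation does not accumulate gradients: 2 * 4 = 8.
total_eval_batch_size = per_device_eval_batch_size * num_devices

# Estimate only: the results table shows epoch 8.0 at optimizer step 597.
approx_train_examples = 597 * total_train_batch_size / 8.0

# The same settings as TrainingArguments keyword arguments (a plain dict here,
# so the sketch runs without transformers installed).
training_kwargs = {
    "learning_rate": 3e-5,
    "per_device_train_batch_size": per_device_train_batch_size,
    "per_device_eval_batch_size": per_device_eval_batch_size,
    "gradient_accumulation_steps": gradient_accumulation_steps,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    # With the plain "constant" schedule, the warmup ratio has no effect;
    # "constant_with_warmup" is the variant that applies it.
    "lr_scheduler_type": "constant",
    "warmup_ratio": 0.05,
    "num_train_epochs": 50.0,
}

print(total_train_batch_size, total_eval_batch_size, approx_train_examples)
```

Under these assumptions the training set holds roughly 2,388 examples, which is small relative to 50 epochs of training.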

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	0.9916	74	2.0550	0.5952
2.3403	1.9966	149	2.0411	0.5933
2.0198	2.9883	223	2.0403	0.5932
2.0198	3.9933	298	2.0647	0.5922
1.9239	4.9983	373	2.0999	0.5921
1.7309	5.9899	447	2.1973	0.5879
1.5254	6.9950	522	2.2753	0.5861
1.5254	8.0	597	2.4079	0.5819
1.2937	8.9916	671	2.5096	0.5775
1.0409	9.9966	746	2.6079	0.5739
0.8766	10.9883	820	2.7579	0.5718
0.8766	11.9933	895	2.8722	0.5688
0.721	12.9983	970	2.9797	0.5672
0.6011	13.9899	1044	3.0708	0.5662
0.5455	14.9950	1119	3.1660	0.5648
0.5455	16.0	1194	3.2479	0.5650
0.5003	16.9916	1268	3.2445	0.5655
0.4683	17.9966	1343	3.2800	0.5638
0.457	18.9883	1417	3.4280	0.5640
0.457	19.9933	1492	3.4113	0.5662
0.4441	20.9983	1567	3.4731	0.5637
0.4327	21.9899	1641	3.5407	0.5639
0.4308	22.9950	1716	3.4811	0.5640
0.4308	24.0	1791	3.5854	0.5642
0.4245	24.9916	1865	3.5206	0.5640
0.416	25.9966	1940	3.6091	0.5638
0.4173	26.9883	2014	3.5707	0.5643
0.4173	27.9933	2089	3.6671	0.5648
0.4117	28.9983	2164	3.6267	0.5631
0.409	29.9899	2238	3.6658	0.5604
0.4085	30.9950	2313	3.6984	0.5621
0.4085	32.0	2388	3.6584	0.5660
0.403	32.9916	2462	3.5848	0.5626
0.404	33.9966	2537	3.6365	0.5631
0.4013	34.9883	2611	3.7047	0.5647
0.4013	35.9933	2686	3.7735	0.5643
0.3987	36.9983	2761	3.6867	0.5657
0.3951	37.9899	2835	3.7349	0.5662
0.3971	38.9950	2910	3.7173	0.5643
0.3971	40.0	2985	3.8004	0.5643
0.3939	40.9916	3059	3.8041	0.5636
0.3912	41.9966	3134	3.8263	0.5648
0.3941	42.9883	3208	3.7954	0.5646
0.3941	43.9933	3283	3.8001	0.5637
0.3878	44.9983	3358	3.8438	0.5634
0.3879	45.9899	3432	3.8626	0.5631
0.3907	46.9950	3507	3.7882	0.5645
0.3907	48.0	3582	3.8001	0.5622
0.3864	48.9916	3656	3.7201	0.5609
0.3871	49.5812	3700	3.8632	0.5660
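
In the table, validation loss reaches its minimum early and then climbs steadily while training loss keeps falling, which is the usual signature of overfitting. The reported final results therefore do not correspond to the best checkpoint. The sketch below picks the best row by validation loss from a subset of the table and converts losses to perplexity. It assumes the loss is a mean per-token cross-entropy in nats, which the card does not state.

```python
import math

# (epoch, step, validation_loss, accuracy), copied from the table above.
# Only a subset of rows is included to keep the sketch short.
ROWS = [
    (0.9916, 74, 2.0550, 0.5952),
    (1.9966, 149, 2.0411, 0.5933),
    (2.9883, 223, 2.0403, 0.5932),
    (3.9933, 298, 2.0647, 0.5922),
    (9.9966, 746, 2.6079, 0.5739),
    (24.9916, 1865, 3.5206, 0.5640),
    (49.5812, 3700, 3.8632, 0.5660),
]


def best_by_validation_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def perplexity(loss):
    """Perplexity, assuming `loss` is mean cross-entropy in nats."""
    return math.exp(loss)


best = best_by_validation_loss(ROWS)
final = ROWS[-1]
print(f"best:  step {best[1]}, loss {best[2]}, perplexity {perplexity(best[2]):.2f}")
print(f"final: step {final[1]}, loss {final[2]}, perplexity {perplexity(final[2]):.2f}")
```

On these numbers the lowest validation loss is at step 223 (epoch 2.99), so a checkpoint from around that point, or early stopping, would likely generalize better than the final one.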

Framework versions

PEFT 0.5.0
Transformers 4.40.2
Pytorch 2.3.0
Datasets 2.19.1
Tokenizers 0.19.1
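
To reproduce the environment, it can help to compare the installed packages against the versions listed above. The following is a small sketch using only the standard library. The PyPI distribution names (for example `torch` for Pytorch) are assumptions about how the packages were installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "peft": "0.5.0",
    "transformers": "4.40.2",
    "torch": "2.3.0",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def check_versions(expected=EXPECTED):
    """Map each package to (installed_version_or_None, matches_expected)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Drop local build tags such as "+cu121" before comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, matches)
    return report


for package, (installed, matches) in check_versions().items():
    status = "ok" if matches else "differs or missing"
    print(f"{package}: installed={installed} expected={EXPECTED[package]} ({status})")
```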