---
license: other
base_model: Qwen/Qwen1.5-4B
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: squad_qa_baseline_v5_full_Qwen_Qwen1.5-4B_3e-5_lora
  results: []
library_name: peft
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# squad_qa_baseline_v5_full_Qwen_Qwen1.5-4B_3e-5_lora

This model is a LoRA fine-tune (via PEFT) of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on an unknown dataset.
At the final checkpoint (step 3700, epoch 49.58), it achieves the following results on the evaluation set:
- Loss: 3.8632
- Accuracy: 0.5660
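
To put the reported loss on a more intuitive scale, it can be converted to perplexity. The sketch below assumes the evaluation loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training with the Trainer but is not stated in this card.

```python
import math

# Final evaluation loss as reported above.
final_eval_loss = 3.8632

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(final_eval_loss)

print(f"perplexity ~ {perplexity:.1f}")
```

If the loss were averaged differently (for example per sequence), this conversion would not apply.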

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 50.0
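
The total batch sizes listed above follow from the per-device values. The short sketch below reproduces that arithmetic and, as a rough estimate only, infers the training set size from the results table (step 597 is logged at epoch 8.0). The estimate assumes every optimizer step consumed a full effective batch.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 1          # per device
eval_batch_size = 2           # per device
num_devices = 4
gradient_accumulation_steps = 8

# Effective batch sizes across all devices.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# From the results table: step 597 corresponds to epoch 8.0.
steps_per_epoch = 597 / 8.0

# Assumption: each optimizer step used a full effective batch.
approx_train_examples = steps_per_epoch * total_train_batch_size

print(total_train_batch_size, total_eval_batch_size)
print(steps_per_epoch, approx_train_examples)
```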

### Training results

| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
|:-------------:|:-------:|:----:|:---------------:|:--------:|
| No log        | 0.9916  | 74   | 2.0550          | 0.5952   |
| 2.3403        | 1.9966  | 149  | 2.0411          | 0.5933   |
| 2.0198        | 2.9883  | 223  | 2.0403          | 0.5932   |
| 2.0198        | 3.9933  | 298  | 2.0647          | 0.5922   |
| 1.9239        | 4.9983  | 373  | 2.0999          | 0.5921   |
| 1.7309        | 5.9899  | 447  | 2.1973          | 0.5879   |
| 1.5254        | 6.9950  | 522  | 2.2753          | 0.5861   |
| 1.5254        | 8.0     | 597  | 2.4079          | 0.5819   |
| 1.2937        | 8.9916  | 671  | 2.5096          | 0.5775   |
| 1.0409        | 9.9966  | 746  | 2.6079          | 0.5739   |
| 0.8766        | 10.9883 | 820  | 2.7579          | 0.5718   |
| 0.8766        | 11.9933 | 895  | 2.8722          | 0.5688   |
| 0.721         | 12.9983 | 970  | 2.9797          | 0.5672   |
| 0.6011        | 13.9899 | 1044 | 3.0708          | 0.5662   |
| 0.5455        | 14.9950 | 1119 | 3.1660          | 0.5648   |
| 0.5455        | 16.0    | 1194 | 3.2479          | 0.5650   |
| 0.5003        | 16.9916 | 1268 | 3.2445          | 0.5655   |
| 0.4683        | 17.9966 | 1343 | 3.2800          | 0.5638   |
| 0.457         | 18.9883 | 1417 | 3.4280          | 0.5640   |
| 0.457         | 19.9933 | 1492 | 3.4113          | 0.5662   |
| 0.4441        | 20.9983 | 1567 | 3.4731          | 0.5637   |
| 0.4327        | 21.9899 | 1641 | 3.5407          | 0.5639   |
| 0.4308        | 22.9950 | 1716 | 3.4811          | 0.5640   |
| 0.4308        | 24.0    | 1791 | 3.5854          | 0.5642   |
| 0.4245        | 24.9916 | 1865 | 3.5206          | 0.5640   |
| 0.416         | 25.9966 | 1940 | 3.6091          | 0.5638   |
| 0.4173        | 26.9883 | 2014 | 3.5707          | 0.5643   |
| 0.4173        | 27.9933 | 2089 | 3.6671          | 0.5648   |
| 0.4117        | 28.9983 | 2164 | 3.6267          | 0.5631   |
| 0.409         | 29.9899 | 2238 | 3.6658          | 0.5604   |
| 0.4085        | 30.9950 | 2313 | 3.6984          | 0.5621   |
| 0.4085        | 32.0    | 2388 | 3.6584          | 0.5660   |
| 0.403         | 32.9916 | 2462 | 3.5848          | 0.5626   |
| 0.404         | 33.9966 | 2537 | 3.6365          | 0.5631   |
| 0.4013        | 34.9883 | 2611 | 3.7047          | 0.5647   |
| 0.4013        | 35.9933 | 2686 | 3.7735          | 0.5643   |
| 0.3987        | 36.9983 | 2761 | 3.6867          | 0.5657   |
| 0.3951        | 37.9899 | 2835 | 3.7349          | 0.5662   |
| 0.3971        | 38.9950 | 2910 | 3.7173          | 0.5643   |
| 0.3971        | 40.0    | 2985 | 3.8004          | 0.5643   |
| 0.3939        | 40.9916 | 3059 | 3.8041          | 0.5636   |
| 0.3912        | 41.9966 | 3134 | 3.8263          | 0.5648   |
| 0.3941        | 42.9883 | 3208 | 3.7954          | 0.5646   |
| 0.3941        | 43.9933 | 3283 | 3.8001          | 0.5637   |
| 0.3878        | 44.9983 | 3358 | 3.8438          | 0.5634   |
| 0.3879        | 45.9899 | 3432 | 3.8626          | 0.5631   |
| 0.3907        | 46.9950 | 3507 | 3.7882          | 0.5645   |
| 0.3907        | 48.0    | 3582 | 3.8001          | 0.5622   |
| 0.3864        | 48.9916 | 3656 | 3.7201          | 0.5609   |
| 0.3871        | 49.5812 | 3700 | 3.8632          | 0.5660   |
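
The table shows validation loss reaching its minimum early and then rising while training loss keeps falling, which is the usual signature of overfitting. As a minimal sketch, the snippet below picks the checkpoint with the lowest validation loss from a hand-copied subset of the rows above. The tuple layout and variable names are illustrative and not part of the training code.

```python
# (epoch, step, validation_loss, accuracy), copied from the table above.
# Only the first five rows and the final row are included for brevity.
rows = [
    (0.9916, 74, 2.0550, 0.5952),
    (1.9966, 149, 2.0411, 0.5933),
    (2.9883, 223, 2.0403, 0.5932),
    (3.9933, 298, 2.0647, 0.5922),
    (4.9983, 373, 2.0999, 0.5921),
    (49.5812, 3700, 3.8632, 0.5660),
]

# Select the row with the lowest validation loss.
best = min(rows, key=lambda r: r[2])
final = rows[-1]

print(f"best step: {best[1]} (validation loss {best[2]:.4f})")
print(f"final step: {final[1]} (validation loss {final[2]:.4f})")
```

On these rows the minimum falls at step 223, so the final checkpoint reported at the top of this card is not the one with the best validation loss.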


### Framework versions

- PEFT 0.5.0
- Transformers 4.40.2
- PyTorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1