---
library_name: peft
license: other
base_model: unsloth/Llama-3.2-3B-Instruct
tags:
  - llama-factory
  - lora
  - unsloth
  - generated_from_trainer
model-index:
  - name: llm3br256
    results: []
---

# llm3br256

This model is a LoRA fine-tuned version of [unsloth/Llama-3.2-3B-Instruct](https://huggingface.co/unsloth/Llama-3.2-3B-Instruct) (Unsloth's re-upload of meta-llama/Llama-3.2-3B-Instruct) on the dbischof_premise_aea dataset. It achieves the following result on the evaluation set (a minimal loading sketch follows below):

- Loss: 0.0136
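
Because this repository contains a PEFT LoRA adapter rather than full model weights, it is normally loaded on top of the base model. The snippet below is a minimal sketch: the adapter repo id `sizhkhy/llm3br256` is an assumption (replace it with the actual repository path), and the dtype/device settings are placeholders for a single-GPU setup.

```python
# Minimal loading sketch (adapter repo id is assumed; adjust to your environment).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/Llama-3.2-3B-Instruct"
adapter_id = "sizhkhy/llm3br256"  # hypothetical repo id for this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Generic smoke-test prompt; real inputs should follow the dbischof_premise_aea task format.
messages = [{"role": "user", "content": "Hello! Briefly introduce yourself."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=128)

print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```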

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent `TrainingArguments` sketch follows the list):

- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
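
For readers reproducing the setup with the Hugging Face `Trainer` directly, the listed values map roughly onto the `TrainingArguments` below. This is a sketch under the assumption of a single training device (so the effective batch size is 4 × 8 = 32), not the exact LLaMA-Factory invocation that produced this adapter.

```python
# Rough equivalent of the hyperparameters above (sketch only; the actual run used LLaMA-Factory).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llm3br256",            # hypothetical output directory
    learning_rate=1e-4,
    per_device_train_batch_size=4,     # train_batch_size: 4
    per_device_eval_batch_size=4,      # eval_batch_size: 4
    gradient_accumulation_steps=8,     # 4 * 8 = effective batch size 32 on one device
    num_train_epochs=5.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```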

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.0896 | 0.0387 | 5   | 0.0767 |
| 0.057  | 0.0774 | 10  | 0.0397 |
| 0.0361 | 0.1162 | 15  | 0.0325 |
| 0.0478 | 0.1549 | 20  | 0.0304 |
| 0.0293 | 0.1936 | 25  | 0.0270 |
| 0.0429 | 0.2323 | 30  | 0.0253 |
| 0.0368 | 0.2711 | 35  | 0.0244 |
| 0.0323 | 0.3098 | 40  | 0.0229 |
| 0.0223 | 0.3485 | 45  | 0.0225 |
| 0.0327 | 0.3872 | 50  | 0.0216 |
| 0.0237 | 0.4259 | 55  | 0.0209 |
| 0.0255 | 0.4647 | 60  | 0.0204 |
| 0.0237 | 0.5034 | 65  | 0.0197 |
| 0.0273 | 0.5421 | 70  | 0.0197 |
| 0.0192 | 0.5808 | 75  | 0.0192 |
| 0.0459 | 0.6196 | 80  | 0.0188 |
| 0.0203 | 0.6583 | 85  | 0.0185 |
| 0.032  | 0.6970 | 90  | 0.0183 |
| 0.0145 | 0.7357 | 95  | 0.0184 |
| 0.0299 | 0.7744 | 100 | 0.0181 |
| 0.0186 | 0.8132 | 105 | 0.0183 |
| 0.0255 | 0.8519 | 110 | 0.0178 |
| 0.0199 | 0.8906 | 115 | 0.0177 |
| 0.0216 | 0.9293 | 120 | 0.0173 |
| 0.024  | 0.9681 | 125 | 0.0176 |
| 0.0319 | 1.0068 | 130 | 0.0173 |
| 0.0202 | 1.0455 | 135 | 0.0176 |
| 0.0167 | 1.0842 | 140 | 0.0171 |
| 0.0205 | 1.1229 | 145 | 0.0168 |
| 0.0164 | 1.1617 | 150 | 0.0167 |
| 0.0303 | 1.2004 | 155 | 0.0168 |
| 0.0201 | 1.2391 | 160 | 0.0165 |
| 0.0183 | 1.2778 | 165 | 0.0164 |
| 0.0221 | 1.3166 | 170 | 0.0163 |
| 0.0132 | 1.3553 | 175 | 0.0162 |
| 0.0226 | 1.3940 | 180 | 0.0158 |
| 0.0173 | 1.4327 | 185 | 0.0159 |
| 0.0304 | 1.4714 | 190 | 0.0164 |
| 0.0177 | 1.5102 | 195 | 0.0161 |
| 0.0155 | 1.5489 | 200 | 0.0160 |
| 0.0258 | 1.5876 | 205 | 0.0159 |
| 0.0217 | 1.6263 | 210 | 0.0163 |
| 0.0197 | 1.6651 | 215 | 0.0161 |
| 0.0124 | 1.7038 | 220 | 0.0158 |
| 0.0248 | 1.7425 | 225 | 0.0156 |
| 0.017  | 1.7812 | 230 | 0.0159 |
| 0.0248 | 1.8199 | 235 | 0.0158 |
| 0.0189 | 1.8587 | 240 | 0.0155 |
| 0.0185 | 1.8974 | 245 | 0.0151 |
| 0.0154 | 1.9361 | 250 | 0.0151 |
| 0.0223 | 1.9748 | 255 | 0.0152 |
| 0.0161 | 2.0136 | 260 | 0.0152 |
| 0.0139 | 2.0523 | 265 | 0.0154 |
| 0.0173 | 2.0910 | 270 | 0.0153 |
| 0.0237 | 2.1297 | 275 | 0.0152 |
| 0.0167 | 2.1684 | 280 | 0.0151 |
| 0.0086 | 2.2072 | 285 | 0.0149 |
| 0.012  | 2.2459 | 290 | 0.0147 |
| 0.015  | 2.2846 | 295 | 0.0149 |
| 0.0165 | 2.3233 | 300 | 0.0151 |
| 0.0183 | 2.3621 | 305 | 0.0150 |
| 0.0233 | 2.4008 | 310 | 0.0151 |
| 0.0163 | 2.4395 | 315 | 0.0149 |
| 0.0121 | 2.4782 | 320 | 0.0147 |
| 0.0213 | 2.5169 | 325 | 0.0145 |
| 0.0253 | 2.5557 | 330 | 0.0145 |
| 0.023  | 2.5944 | 335 | 0.0149 |
| 0.014  | 2.6331 | 340 | 0.0144 |
| 0.0156 | 2.6718 | 345 | 0.0145 |
| 0.0164 | 2.7106 | 350 | 0.0143 |
| 0.0262 | 2.7493 | 355 | 0.0140 |
| 0.0134 | 2.7880 | 360 | 0.0142 |
| 0.018  | 2.8267 | 365 | 0.0144 |
| 0.0166 | 2.8654 | 370 | 0.0145 |
| 0.0204 | 2.9042 | 375 | 0.0141 |
| 0.0284 | 2.9429 | 380 | 0.0139 |
| 0.021  | 2.9816 | 385 | 0.0139 |
| 0.0125 | 3.0203 | 390 | 0.0145 |
| 0.0157 | 3.0591 | 395 | 0.0145 |
| 0.0136 | 3.0978 | 400 | 0.0142 |
| 0.0087 | 3.1365 | 405 | 0.0141 |
| 0.0217 | 3.1752 | 410 | 0.0139 |
| 0.0125 | 3.2139 | 415 | 0.0136 |
| 0.0115 | 3.2527 | 420 | 0.0138 |
| 0.0128 | 3.2914 | 425 | 0.0139 |
| 0.0278 | 3.3301 | 430 | 0.0138 |
| 0.0197 | 3.3688 | 435 | 0.0136 |
| 0.0095 | 3.4076 | 440 | 0.0133 |
| 0.0075 | 3.4463 | 445 | 0.0133 |
| 0.0112 | 3.4850 | 450 | 0.0136 |
| 0.0129 | 3.5237 | 455 | 0.0137 |
| 0.011  | 3.5624 | 460 | 0.0136 |
| 0.0233 | 3.6012 | 465 | 0.0136 |
| 0.0132 | 3.6399 | 470 | 0.0134 |
| 0.0147 | 3.6786 | 475 | 0.0136 |
| 0.0073 | 3.7173 | 480 | 0.0136 |
| 0.0143 | 3.7561 | 485 | 0.0136 |
| 0.0086 | 3.7948 | 490 | 0.0137 |
| 0.0055 | 3.8335 | 495 | 0.0138 |
| 0.0108 | 3.8722 | 500 | 0.0138 |
| 0.0079 | 3.9109 | 505 | 0.0136 |
| 0.0105 | 3.9497 | 510 | 0.0133 |
| 0.0117 | 3.9884 | 515 | 0.0133 |
| 0.008  | 4.0271 | 520 | 0.0135 |
| 0.0147 | 4.0658 | 525 | 0.0137 |
| 0.007  | 4.1045 | 530 | 0.0143 |
| 0.0059 | 4.1433 | 535 | 0.0146 |
| 0.015  | 4.1820 | 540 | 0.0144 |
| 0.0121 | 4.2207 | 545 | 0.0142 |
| 0.0113 | 4.2594 | 550 | 0.0140 |
| 0.0068 | 4.2982 | 555 | 0.0140 |
| 0.0095 | 4.3369 | 560 | 0.0140 |
| 0.0149 | 4.3756 | 565 | 0.0141 |
| 0.0063 | 4.4143 | 570 | 0.0141 |
| 0.0073 | 4.4530 | 575 | 0.0141 |
| 0.0114 | 4.4918 | 580 | 0.0142 |
| 0.0064 | 4.5305 | 585 | 0.0142 |
| 0.011  | 4.5692 | 590 | 0.0142 |
| 0.0088 | 4.6079 | 595 | 0.0142 |
| 0.0049 | 4.6467 | 600 | 0.0142 |
| 0.0079 | 4.6854 | 605 | 0.0142 |
| 0.0061 | 4.7241 | 610 | 0.0142 |
| 0.012  | 4.7628 | 615 | 0.0142 |
| 0.0107 | 4.8015 | 620 | 0.0142 |
| 0.0104 | 4.8403 | 625 | 0.0142 |
| 0.0117 | 4.8790 | 630 | 0.0142 |
| 0.013  | 4.9177 | 635 | 0.0142 |
| 0.0079 | 4.9564 | 640 | 0.0142 |
| 0.0082 | 4.9952 | 645 | 0.0142 |

### Framework versions

- PEFT 0.12.0
- Transformers 4.46.1
- Pytorch 2.4.0+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
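
To check that a local environment roughly matches these pins, something like the sketch below can be used. Note the torch entry carries the `+cu121` local build tag from the training machine, so installs on other hardware may legitimately differ.

```python
# Quick environment check against the versions listed above (sketch only).
from importlib.metadata import version, PackageNotFoundError

expected = {
    "peft": "0.12.0",
    "transformers": "4.46.1",
    "torch": "2.4.0+cu121",   # local CUDA tag may differ (e.g. plain 2.4.0)
    "datasets": "3.1.0",
    "tokenizers": "0.20.3",
}

for pkg, want in expected.items():
    try:
        have = version(pkg)
    except PackageNotFoundError:
        have = "not installed"
    status = "OK" if have == want else "differs"
    print(f"{pkg}: installed {have}, card lists {want} ({status})")
```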