---
library_name: transformers
tags:
- generated_from_trainer
model-index:
- name: train_3
  results: []
---


# train_3

This model was trained from scratch on an unspecified dataset (none was recorded by the Trainer).
It achieves the following results on the evaluation set:
- Loss: 0.1641

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0006047816549758072
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch_fused (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.964129172421366, 0.8471340191802936), epsilon=1.51279024695782e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2593
- num_epochs: 500 (configured value; the training results in this card end at epoch 21)
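
As a rough illustration, the hyperparameters listed here can be written as keyword arguments for `transformers.TrainingArguments`. This is a sketch, not the original training script: the mapping of card entries to argument names (for example `train_batch_size` to `per_device_train_batch_size`, which assumes a single device) is an assumption, and `build_training_args`, `linear_lr`, and the `output_dir` value are hypothetical names chosen for this example. The `linear_lr` helper mirrors the usual linear warmup and decay shape in plain Python so the schedule can be inspected without running a trainer.

```python
# Hyperparameters copied from the list in this card. The keyword names are an
# assumption: they follow transformers.TrainingArguments field names.
HPARAMS = {
    "learning_rate": 0.0006047816549758072,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.964129172421366,
    "adam_beta2": 0.8471340191802936,
    "adam_epsilon": 1.51279024695782e-08,
    "lr_scheduler_type": "linear",
    "warmup_steps": 2593,
    "num_train_epochs": 500,
}

# Optimizer steps per epoch, read from the step column of the results table.
STEPS_PER_EPOCH = 6351


def build_training_args(output_dir="train_3"):
    """Hypothetical helper: build TrainingArguments from HPARAMS.

    Requires transformers with its PyTorch dependencies installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HPARAMS)


def linear_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Total steps if the run had completed all configured epochs.
    total = HPARAMS["num_train_epochs"] * STEPS_PER_EPOCH
    for step in (0, HPARAMS["warmup_steps"], 21 * STEPS_PER_EPOCH, total):
        lr = linear_lr(step, HPARAMS["learning_rate"], HPARAMS["warmup_steps"], total)
        print(f"step {step:>8}: lr = {lr:.6e}")
```

Under these assumptions a full 500-epoch run would take 500 × 6351 = 3,175,500 optimizer steps, so the 2593 warmup steps end well inside the first epoch and the learning rate has decayed only slightly by epoch 21.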

### Training results

| Training Loss | Epoch | Step   | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| 0.2789        | 1.0   | 6351   | 0.1682          |
| 0.2849        | 2.0   | 12702  | 0.1695          |
| 0.2948        | 3.0   | 19053  | 0.1728          |
| 0.2806        | 4.0   | 25404  | 0.1641          |
| 0.2941        | 5.0   | 31755  | 0.1645          |
| 0.2926        | 6.0   | 38106  | 0.1533          |
| 0.2997        | 7.0   | 44457  | 0.1724          |
| 0.2867        | 8.0   | 50808  | 0.1663          |
| 0.2739        | 9.0   | 57159  | 0.1562          |
| 0.2884        | 10.0  | 63510  | 0.1708          |
| 0.3060        | 11.0  | 69861  | 0.1569          |
| 0.2888        | 12.0  | 76212  | 0.1854          |
| 0.2933        | 13.0  | 82563  | 0.1756          |
| 0.2953        | 14.0  | 88914  | 0.1599          |
| 0.2987        | 15.0  | 95265  | 0.1701          |
| 0.3047        | 16.0  | 101616 | 0.2214          |
| 0.3126        | 17.0  | 107967 | 0.1564          |
| 0.3066        | 18.0  | 114318 | 0.2439          |
| 0.2861        | 19.0  | 120669 | 0.1590          |
| 0.3045        | 20.0  | 127020 | 0.3101          |
| 0.3045        | 21.0  | 133371 | 0.1641          |
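
The table can be summarized with a few lines of Python. The sketch below copies the epoch, step, and validation loss columns by hand and reports the steps per epoch and the epoch with the lowest validation loss. The variable names are illustrative, and the estimate of examples per epoch assumes a single device with no gradient accumulation, which this card does not confirm.

```python
# (epoch, step, validation_loss) rows copied from the training results table.
ROWS = [
    (1, 6351, 0.1682), (2, 12702, 0.1695), (3, 19053, 0.1728),
    (4, 25404, 0.1641), (5, 31755, 0.1645), (6, 38106, 0.1533),
    (7, 44457, 0.1724), (8, 50808, 0.1663), (9, 57159, 0.1562),
    (10, 63510, 0.1708), (11, 69861, 0.1569), (12, 76212, 0.1854),
    (13, 82563, 0.1756), (14, 88914, 0.1599), (15, 95265, 0.1701),
    (16, 101616, 0.2214), (17, 107967, 0.1564), (18, 114318, 0.2439),
    (19, 120669, 0.1590), (20, 127020, 0.3101), (21, 133371, 0.1641),
]

# Differences between consecutive step counts; a single value means the
# number of optimizer steps per epoch was constant.
steps_per_epoch = {later[1] - earlier[1] for earlier, later in zip(ROWS, ROWS[1:])}

# Row with the lowest validation loss, and the last logged row.
best_epoch, best_step, best_loss = min(ROWS, key=lambda row: row[2])
final_epoch, final_step, final_loss = ROWS[-1]

# Upper bound on training examples per epoch, ASSUMING batch size 8 on one
# device with no gradient accumulation (the last batch may be smaller).
TRAIN_BATCH_SIZE = 8
max_examples_per_epoch = max(steps_per_epoch) * TRAIN_BATCH_SIZE

print("steps per epoch:", sorted(steps_per_epoch))
print("best validation loss:", best_loss, "at epoch", best_epoch, "step", best_step)
print("final validation loss:", final_loss, "at epoch", final_epoch)
print("examples per epoch (upper bound):", max_examples_per_epoch)
```

On these rows the lowest validation loss is 0.1533 at epoch 6 (step 38106), while the headline loss of 0.1641 matches the last logged epoch, so the reported figure reflects the final checkpoint rather than the best one in the table.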


### Framework versions

- Transformers 4.49.0
- PyTorch 2.6.0.dev20241217
- Datasets 2.20.0
- Tokenizers 0.21.0