---
license: mit
base_model: gpt2-medium
tags:
- generated_from_keras_callback
model-index:
- name: sksayril/sayril-fino-345M-llm-baseModel
  results: []
---

# sksayril/sayril-fino-345M-llm-baseModel

This model is a fine-tuned version of [gpt2-medium](https://huggingface.co/gpt2-medium) on an unknown dataset.
It achieves the following results on the evaluation set:
- Train Loss: 4.4500
- Validation Loss: 5.2187
- Epoch: 12

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 7.1469     | 6.6384          | 0     |
| 6.3490     | 6.2085          | 1     |
| 5.9895     | 5.9744          | 2     |
| 5.7461     | 5.8066          | 3     |
| 5.5499     | 5.6763          | 4     |
| 5.3800     | 5.5698          | 5     |
| 5.2279     | 5.4891          | 6     |
| 5.0860     | 5.4127          | 7     |
| 4.9514     | 5.3552          | 8     |
| 4.8221     | 5.3037          | 9     |
| 4.6962     | 5.2694          | 10    |
| 4.5722     | 5.2403          | 11    |
| 4.4500     | 5.2187          | 12    |

### Framework versions

- Transformers 4.34.0
- TensorFlow 2.13.0
- Datasets 2.14.5
- Tokenizers 0.14.1
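
The losses in the training results table may be easier to interpret as perplexities and as a train/validation gap. The snippet below is a minimal sketch, not part of the original training code. It assumes the reported losses are mean per-token cross-entropy values in nats, which this card does not state explicitly. The helper names `perplexity` and `generalization_gap` are hypothetical and introduced only for illustration.

```python
import math

# (train_loss, validation_loss) per epoch, copied from the training results table.
LOSSES = [
    (7.1469, 6.6384),
    (6.3490, 6.2085),
    (5.9895, 5.9744),
    (5.7461, 5.8066),
    (5.5499, 5.6763),
    (5.3800, 5.5698),
    (5.2279, 5.4891),
    (5.0860, 5.4127),
    (4.9514, 5.3552),
    (4.8221, 5.3037),
    (4.6962, 5.2694),
    (4.5722, 5.2403),
    (4.4500, 5.2187),
]


def perplexity(loss: float) -> float:
    """Convert a loss to perplexity, ASSUMING it is mean cross-entropy in nats."""
    return math.exp(loss)


def generalization_gap(train_loss: float, val_loss: float) -> float:
    """Validation loss minus train loss; a growing gap can hint at overfitting."""
    return val_loss - train_loss


if __name__ == "__main__":
    for epoch, (train_loss, val_loss) in enumerate(LOSSES):
        print(
            f"epoch {epoch:2d}  "
            f"val perplexity {perplexity(val_loss):8.1f}  "
            f"gap {generalization_gap(train_loss, val_loss):+.4f}"
        )
```

Under that assumption, the validation loss falls at every epoch while the gap to the train loss widens, so the table alone does not show whether further epochs would have helped.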