antphb/pretrain-gpt2-large

Tags: Text Generation, Generated from Trainer, text-generation-inference, Inference Endpoints


antphb committed on Sep 21, 2023

Commit ae856a2 · 1 parent: 5b32365

End of training

Files changed (2)

README.md +18 -6
pytorch_model.bin +1 -1

README.md CHANGED

@@ -13,6 +13,8 @@ should probably proofread and complete it, then remove this comment. -->
 # pretrain-gpt2-large
 This model is a fine-tuned version of [NlpHUST/gpt2-vietnamese](https://huggingface.co/NlpHUST/gpt2-vietnamese) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.4155
 ## Model description
@@ -31,18 +33,28 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 32
-- eval_batch_size: 2
+- learning_rate: 3e-05
+- train_batch_size: 16
+- eval_batch_size: 1
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 128
+- distributed_type: multi-GPU
+- num_devices: 2
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 512
+- total_eval_batch_size: 2
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 1
+- num_epochs: 70
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 3.0875        | 13.05 | 500  | 2.6828          |
+| 2.5739        | 26.1  | 1000 | 2.5363          |
+| 2.4573        | 39.15 | 1500 | 2.4643          |
+| 2.3962        | 52.2  | 2000 | 2.4294          |
+| 2.3662        | 65.25 | 2500 | 2.4155          |
 ### Framework versions
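
The batch-size figures in the README diff can be cross-checked with a little arithmetic. The sketch below assumes that `total_train_batch_size` is the per-device batch size multiplied by the device count and the gradient accumulation steps, and that the old run used a single device (the old side of the diff lists no `num_devices`). It also converts the final evaluation loss to a perplexity, assuming the loss is a mean per-token cross-entropy in nats; the card does not state this. The helper name `effective_batch_size` is hypothetical.

```python
import math

# Values copied from the new side of the README diff.
train_batch_size = 16             # per-device batch size
num_devices = 2
gradient_accumulation_steps = 16
final_eval_loss = 2.4155


def effective_batch_size(per_device, devices, accumulation_steps):
    """Number of samples that contribute to one optimizer step."""
    return per_device * devices * accumulation_steps


total_train_batch_size = effective_batch_size(
    train_batch_size, num_devices, gradient_accumulation_steps
)

# Old side of the diff: batch size 32, accumulation 4.
# Assumption: one device, since num_devices is not listed there.
old_total_train_batch_size = effective_batch_size(32, 1, 4)

# Assumption: the reported loss is mean cross-entropy in nats,
# so perplexity is its exponential.
perplexity = math.exp(final_eval_loss)

print(total_train_batch_size, old_total_train_batch_size)
print(round(perplexity, 1))
```

Under these assumptions the computed totals agree with the 512 and 128 reported in the diff, and the final validation loss corresponds to a perplexity of roughly 11.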

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:11724b533766e7cf889fe3fc5012d1c79fbf1d9b5673224566e750e1c0c66319
+oid sha256:c99acace5ad9fef6677e3dd7ecec39f0577442e47b5a70f14c2bd5471bb0c6fb
 size 497810269
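
Only the Git LFS pointer for `pytorch_model.bin` changes in this commit: the `oid` differs while the `size` stays the same. As a minimal sketch, a pointer like this can be parsed and used to check a downloaded file. The function names `parse_lfs_pointer` and `matches_pointer` and the constant `NEW_POINTER` are hypothetical, and the parser handles only the three keys shown in the diff.

```python
import hashlib
import os

# Pointer text copied from the new side of the diff.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c99acace5ad9fef6677e3dd7ecec39f0577442e47b5a70f14c2bd5471bb0c6fb
size 497810269
"""


def parse_lfs_pointer(text):
    """Split 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the pointer's size and hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["hash_algo"])
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["hash_algo"], pointer["size"])
```

Calling `matches_pointer("pytorch_model.bin", pointer)` on a local copy would then indicate whether the file corresponds to this commit rather than its parent; the local path is an assumption.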