Nbeau/GPT2-arithmetic-1-3digits

Text Generation

Generated from Trainer

text-generation-inference


Nbeau committed on May 23, 2024

Commit c98442d · verified · 1 Parent(s): 17e9c6a

End of training

Files changed (2)

README.md +8 -6
model.safetensors +1 -1

README.md CHANGED

@@ -42,18 +42,20 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant_with_warmup
-- num_epochs: 5
+- num_epochs: 7
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 0.1931        | 0.9984 | 156  | nan             |
-| 0.093         | 1.9968 | 312  | nan             |
-| 0.0822        | 2.9952 | 468  | nan             |
-| 0.0784        | 4.0    | 625  | nan             |
-| 0.0775        | 4.992  | 780  | nan             |
+| 0.1482        | 0.9984 | 156  | nan             |
+| 0.0902        | 1.9968 | 312  | nan             |
+| 0.0804        | 2.9952 | 468  | nan             |
+| 0.0789        | 4.0    | 625  | nan             |
+| 0.0798        | 4.9984 | 781  | nan             |
+| 0.0776        | 5.9968 | 937  | nan             |
+| 0.0773        | 6.9888 | 1092 | nan             |
 ### Framework versions
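
The fractional epoch values in the results table follow from the step counts. As a rough sketch, assuming the logged epoch is simply the optimizer step divided by the number of steps per epoch, the row at epoch 4.0 (step 625) gives 156.25 steps per epoch. With total_train_batch_size 64, that suggests a training set of about 10,000 examples. This figure is an inference from the table, not something the model card states, and the names below are illustrative only.

```python
import math

# Values copied from the README diff (new side).
TOTAL_TRAIN_BATCH_SIZE = 64
LOGGED_ROWS = [  # (step, epoch) pairs from the results table
    (156, 0.9984), (312, 1.9968), (468, 2.9952), (625, 4.0),
    (781, 4.9984), (937, 5.9968), (1092, 6.9888),
]

# Assumption: epoch = step / steps_per_epoch, anchored on the exact row (625, 4.0).
steps_per_epoch = 625 / 4.0
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE


def epoch_at(step: int) -> float:
    """Fractional epoch for an optimizer step under the assumption above."""
    return step / steps_per_epoch


if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"approx. training examples: {approx_train_examples:.0f}")
    for step, logged_epoch in LOGGED_ROWS:
        ok = math.isclose(epoch_at(step), logged_epoch, abs_tol=1e-4)
        print(step, round(epoch_at(step), 4), logged_epoch, "ok" if ok else "mismatch")
```

Every logged row on the new side is reproduced by this relation, and so is the old run's final row (step 780, epoch 4.992).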

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:af0295fce5c470f92fe25bcb804dae0dd49b4e104a4ba4c42cba156164090603
+oid sha256:82354fe265214c419b9a3b39f7386563e686183bde97eb5a1e2fc32af01f5096
 size 441691776
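
The model.safetensors entry is a Git LFS pointer, not the weights themselves: it records the SHA-256 digest and byte size of the real file. Only the digest changed in this commit, which is what you would expect when the weights are retrained but the architecture stays the same. A minimal sketch of checking a downloaded file against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is left to the caller.

```python
import hashlib

# Pointer text copied from the new side of the diff.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:82354fe265214c419b9a3b39f7386563e686183bde97eb5a1e2fc32af01f5096
size 441691776
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest the pointer records."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large weight files need not fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(NEW_POINTER)
    print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", parse_lfs_pointer(NEW_POINTER))` on a local copy would confirm that the download corresponds to this commit.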