kz-transformers committed
Update README.md
README.md CHANGED
@@ -49,7 +49,8 @@ with `<s>` and the end of one by `</s>`
 
 ### Pretraining
 
-The model was trained on 2 V100 GPUs for 500K steps with a batch size of 128 and a sequence length of 512.
+The model was trained on 2 V100 GPUs for 500K steps with a batch size of 128 and a sequence length of 512, with an MLM probability of 15%, num_attention_heads=12,
+and num_hidden_layers=6.
 
 
 ### Contributions
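The MLM probability of 15% added in this change refers to the standard BERT-style masking objective. As a rough illustration only (not this repository's actual training code), here is a minimal sketch of that masking step in plain Python; `MASK_ID` and `VOCAB_SIZE` are placeholder values, since the real ids come from the tokenizer.

```python
import random

# Placeholder values for illustration; real ids come from the tokenizer.
MASK_ID = 4
VOCAB_SIZE = 32000

def mask_tokens(token_ids, mlm_probability=0.15, seed=0):
    """BERT-style MLM masking: select ~15% of positions, then
    80% -> [MASK], 10% -> random token, 10% left unchanged.
    Returns (masked inputs, labels); labels are -100 where no prediction is made."""
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [-100] * len(inputs)  # -100 is conventionally ignored by the loss
    for i, tok in enumerate(token_ids):
        if rng.random() < mlm_probability:  # select ~15% of positions
            labels[i] = tok                 # predict the original token here
            r = rng.random()
            if r < 0.8:
                inputs[i] = MASK_ID         # 80%: replace with [MASK]
            elif r < 0.9:
                inputs[i] = rng.randrange(VOCAB_SIZE)  # 10%: random token
            # remaining 10%: keep the token unchanged
    return inputs, labels
```

In practice this step is usually handled by a data collator rather than written by hand; the sketch only makes the 15% figure concrete.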