juliekallini committed on
Commit 6ae7558 • 1 Parent(s): 5d04b9f
Update README.md
README.md CHANGED
@@ -12,7 +12,10 @@ This is one model in a collection of models trained on the impossible
 languages of [Kallini et al. 2024](https://arxiv.org/abs/2401.06416).
 
 This model is a GPT-2 Small model trained from scratch on the *NoShuffle*
-language.
+language. We include a total of 30 checkpoints over the course of
+model training, from step 100 to 3000 in increments of 100 steps.
+The main branch contains the final checkpoint (3000), and the other
+checkpoints are accessible as revisions.
 
 ![languages.png](https://cdn-uploads.huggingface.co/production/uploads/6268bc06adb1c6525b3d5157/pBt38YYQL1gj8DqjyorWS.png)
 
@@ -55,6 +58,15 @@ generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
 print(generated_text)
 ```
 
+By default, the `main` branch of this model repo loads the
+last model checkpoint (3000). To access the other checkpoints,
+use the `revision` argument:
+
+```
+model = GPT2LMHeadModel.from_pretrained(model_id, revision="checkpoint-500")
+```
+This loads the model at checkpoint 500.
+
 ## Training Details
 
 ### Training Data
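
Note on the checkpoint scheme added in this commit: the README text describes 30 checkpoints (steps 100 to 3000, every 100 steps), with step 3000 on `main` and the rest available as revisions. The following is a minimal sketch of how a step number might be mapped to a revision name. It assumes every intermediate revision follows the `checkpoint-<step>` pattern, which is extrapolated from the single `checkpoint-500` example in the diff and is not confirmed for the other steps. The helper names `revision_for_step` and `load_checkpoint` are hypothetical, and `model_id` stands for whatever repo id the README's earlier snippet defines.

```python
FIRST_STEP = 100   # first saved checkpoint, per the README text
FINAL_STEP = 3000  # final checkpoint, stored on the `main` branch
STRIDE = 100       # checkpoints are saved every 100 steps


def revision_for_step(step):
    """Map a training step to the revision name to pass to from_pretrained.

    Assumption: intermediate checkpoints are named "checkpoint-<step>",
    inferred from the README's "checkpoint-500" example.
    """
    if step < FIRST_STEP or step > FINAL_STEP or step % STRIDE != 0:
        raise ValueError(f"no checkpoint saved at step {step}")
    # The README states that the final checkpoint lives on `main`.
    return "main" if step == FINAL_STEP else f"checkpoint-{step}"


def load_checkpoint(model_id, step):
    """Load the model weights saved at the given training step."""
    # Imported lazily so revision_for_step works without transformers installed.
    from transformers import GPT2LMHeadModel

    return GPT2LMHeadModel.from_pretrained(
        model_id, revision=revision_for_step(step)
    )


# All 30 steps described in the README, in training order.
all_steps = list(range(FIRST_STEP, FINAL_STEP + STRIDE, STRIDE))
all_revisions = [revision_for_step(s) for s in all_steps]
print(len(all_revisions), all_revisions[0], all_revisions[-1])
```

Calling `load_checkpoint(model_id, 500)` would then be equivalent to the `revision="checkpoint-500"` call shown in the diff, provided the naming assumption holds.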