Update README.md
README.md
CHANGED
@@ -28,11 +28,11 @@ You can use this model directly with a pipeline for text generation.
 >>> set_seed(5)
 >>> generator("昨日私は京都で", max_length=30, do_sample=True, num_return_sequences=5)

-[{'generated_text': '
-{'generated_text': '
-{'generated_text': '
-{'generated_text': '
-{'generated_text': '
 ```

 You can also use this model to get the features of a given text.
@@ -67,7 +67,7 @@ The following hyperparameters were used during pre-training:
 - weight_decay: 0.01
 - lr_scheduler_type: linear
 - max_grad_norm: 1.0
-- max_steps: 500,000 (but terminated at
 - warmup_steps: 10,000

-The eval loss was 1.
@@ -28,11 +28,11 @@ You can use this model directly with a pipeline for text generation.
 >>> set_seed(5)
 >>> generator("昨日私は京都で", max_length=30, do_sample=True, num_return_sequences=5)

+[{'generated_text': '昨日私は京都であの日に、あんなに頑張ったのに…と思った。私は'},
+{'generated_text': '昨日私は京都で開かれた大阪市内で会場見学をしました。そしてそ'},
+{'generated_text': '昨日私は京都で行われました。その時はまだ若手が多数入学して何'},
+{'generated_text': '昨日私は京都では雪が解けるまで寝た様子があります・・・(;´'},
+{'generated_text': '昨日私は京都でこみ上げてきたものを写真撮るため、駅近くのセン'}]
 ```

 You can also use this model to get the features of a given text.
@@ -67,7 +67,7 @@ The following hyperparameters were used during pre-training:
 - weight_decay: 0.01
 - lr_scheduler_type: linear
 - max_grad_norm: 1.0
+- max_steps: 500,000 (but terminated at 142,000 steps ~= 3.0 epochs)
 - warmup_steps: 10,000

+The eval loss was 1.597 while the eval accuracy was 0.6359. The evaluation set consists of 5,000 randomly sampled documents from each of the training corpora.
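
The numbers added in the second hunk can be cross-checked with a few lines of arithmetic. The sketch below is not part of the commit. It rests on two assumptions: that `lr_scheduler_type: linear` means linear warmup from zero followed by linear decay to zero at `max_steps`, and that the reported eval loss is a mean per-token cross-entropy in nats. The helper `linear_lr_factor` is a hypothetical name introduced for illustration.

```python
import math

# Values copied from the README lines in the diff.
MAX_STEPS = 500_000
WARMUP_STEPS = 10_000
STOPPED_AT = 142_000
EPOCHS_AT_STOP = 3.0  # the README says "~= 3.0 epochs", so this is approximate
EVAL_LOSS = 1.597


def linear_lr_factor(step: int, warmup: int = WARMUP_STEPS, total: int = MAX_STEPS) -> float:
    """Multiplier on the peak learning rate at a given step.

    Assumption: linear warmup from 0 over `warmup` steps, then linear
    decay to 0 at `total` steps.
    """
    if step < warmup:
        return step / max(1, warmup)
    return max(0.0, (total - step) / max(1, total - warmup))


# Fraction of the peak learning rate still in effect when training stopped.
lr_factor_at_stop = linear_lr_factor(STOPPED_AT)

# Rough epoch length implied by "142,000 steps ~= 3.0 epochs".
steps_per_epoch = STOPPED_AT / EPOCHS_AT_STOP

# Assumption: eval loss is cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(EVAL_LOSS)

print(f"LR factor at step {STOPPED_AT:,}: {lr_factor_at_stop:.3f}")
print(f"Approx. steps per epoch: {steps_per_epoch:,.0f}")
print(f"Perplexity implied by eval loss: {perplexity:.2f}")
```

Under these assumptions, training stopped while the learning rate was still about 73% of its peak, one epoch corresponds to roughly 47,000 steps, and the eval loss of 1.597 implies a perplexity of about 4.94.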