Update README.md
README.md (changed)
@@ -166,19 +166,8 @@ re-formatting practices, including removing repetitive/non-informative text like
 The texts are tokenized using the **GPT2** byte-level version of Byte Pair Encoding (BPE) (for unicode characters) and a
 vocabulary size of 180B. The inputs are sequences of 2048 consecutive tokens.
 
-The larger model was trained on 992 *80GB A100 GPUs*. The training duration was
-details of training.
-
-## Evaluation results
-
-TODO
-
-The model achieves the following results without any fine-tuning (zero-shot):
-
-| Dataset  | LAMBADA | LAMBADA | CBT-CN | CBT-NE | WikiText2 | PTB   | enwiki8 | text8 | WikiText103 | 1BW   |
-|:--------:|:-------:|:-------:|:------:|:------:|:---------:|:-----:|:-------:|:-----:|:-----------:|:-----:|
-| (metric) | (PPL)   | (ACC)   | (ACC)  | (ACC)  | (PPL)     | (PPL) | (BPB)   | (BPC) | (PPL)       | (PPL) |
-|          | 35.13   | 45.99   | 87.65  | 83.4   | 29.41     | 65.85 | 1.16    | 1.17  | 37.50       | 75.20 |
-
+The larger model was trained on 992 *80GB A100 GPUs*. The training duration was roughly 33 days of continuous training.
+
 
 
 ### BibTeX entry and citation info
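For reference, the preprocessing described in the unchanged context lines (GPT-2 byte-level BPE tokenization, inputs packed into sequences of 2048 consecutive tokens) can be sketched roughly as below. This is a minimal illustration only: the `gpt2` tokenizer checkpoint and the packing helper are assumptions, not part of this repository's training pipeline.

```python
# Rough sketch of the preprocessing described above: GPT-2 byte-level BPE
# tokenization, with token ids packed into blocks of 2048 consecutive tokens.
# The "gpt2" checkpoint name is an assumption for illustration only.
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

text = "Example document drawn from the training corpus."
ids = tokenizer(text)["input_ids"]

# Split the concatenated token ids into fixed-length blocks of 2048.
block_size = 2048
blocks = [ids[i : i + block_size] for i in range(0, len(ids), block_size)]
print(f"{len(ids)} tokens -> {len(blocks)} block(s) of up to {block_size} tokens")
```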