euclaise committed on
Commit
ec58051
1 Parent(s): 00f4710

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -111,7 +111,7 @@ As I expected, it improves GSM8K, but doesn't do much to ARC.
  - Training sequence length: 256
  - Input masking probability: 40%
  - Label masking probability: 10%
- - Answer-only (full rationale masking) probability: 10%
+ - Answer-only (full rationale label masking) probability: 10%
  - Batch size: 16, accumulated to 256
  - Epochs: 6
  - Learning rate: 1e-5
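
For readers wondering how the label-side settings in this hunk might fit together, here is a minimal sketch in plain Python. It is one possible reading, not the author's training code. The README does not say how masking is implemented, so everything below is an assumption: the function name `mask_labels`, the split of a target into rationale tokens followed by answer tokens, and the use of `-100` as the ignored-label value (the default `ignore_index` of PyTorch's cross-entropy loss). Input masking is left out because the hunk does not describe how it is applied.

```python
import random

# Values copied from the README hunk.
LABEL_MASK_PROB = 0.10    # "Label masking probability: 10%"
ANSWER_ONLY_PROB = 0.10   # "Answer-only (full rationale label masking) probability: 10%"
MICRO_BATCH = 16          # "Batch size: 16, accumulated to 256"
EFFECTIVE_BATCH = 256

# Accumulation steps implied by a micro-batch of 16 and an effective batch of 256.
ACCUM_STEPS = EFFECTIVE_BATCH // MICRO_BATCH

# Assumption: labels set to this value are skipped by the loss.
IGNORE_INDEX = -100


def mask_labels(labels, rationale_len, rng):
    """Hypothetical helper: return a copy of `labels` with some positions ignored.

    Assumes the first `rationale_len` labels are the rationale and the
    remainder is the final answer.
    """
    masked = list(labels)
    if rng.random() < ANSWER_ONLY_PROB:
        # Answer-only example: ignore every rationale label, keep the answer.
        for i in range(rationale_len):
            masked[i] = IGNORE_INDEX
        return masked
    # Otherwise ignore each label independently with LABEL_MASK_PROB.
    for i in range(len(masked)):
        if rng.random() < LABEL_MASK_PROB:
            masked[i] = IGNORE_INDEX
    return masked


if __name__ == "__main__":
    rng = random.Random(0)
    example = mask_labels(list(range(10, 22)), rationale_len=8, rng=rng)
    print("accumulation steps:", ACCUM_STEPS)
    print("masked labels:", example)
```

Under this reading, the wording change in the commit matters: "full rationale label masking" says the rationale is removed from the loss targets, not necessarily hidden from the model's input.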