docs: update README.md

#34 opened by eltociear
Files changed (1)
  1. README.md +1 -1
README.md CHANGED
```diff
@@ -25,7 +25,7 @@ library_name: transformers
 - Training Stage: Pretraining & Post-training
 - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
 - Number of Parameters: 32.5B
-- Number of Paramaters (Non-Embedding): 31.0B
+- Number of Parameters (Non-Embedding): 31.0B
 - Number of Layers: 64
 - Number of Attention Heads (GQA): 40 for Q and 8 for KV
 - Context Length: Full 32,768 tokens
```
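
For context, the fields this hunk touches describe the checkpoint's architecture, and they can be cross-checked against the repository's config via transformers. Below is a minimal sketch, assuming the card belongs to a repo id such as `Qwen/Qwen2.5-32B` (an assumption inferred from the card's figures; the diff shows only the card text) and that the config uses the standard Qwen2-style attribute names.

```python
# Minimal sketch: cross-check the model card's numbers against the config.
# The repo id "Qwen/Qwen2.5-32B" is an assumption inferred from the card
# (32.5B parameters, 64 layers, GQA 40/8, 32,768-token context).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-32B")

print(config.num_hidden_layers)        # expect 64    ("Number of Layers")
print(config.num_attention_heads)      # expect 40    (query heads, GQA)
print(config.num_key_value_heads)      # expect 8     (key/value heads, GQA)
print(config.max_position_embeddings)  # expect 32768 ("Context Length")
```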