Update README.md
README.md
@@ -81,7 +81,7 @@ The model was trained on a subset of [FineWeb-edu](https://huggingface.co/datase
 - Activations quantized to 8-bit precision

 10. **Key Findings**
-  - Warmup quantization (linear lambda scheduler) proved crucial for performance
+  - Warmup quantization (sigmoid or linear lambda scheduler) proved crucial for performance

 These 10B token training runs showed that it's possible to effectively fine-tune pre-trained models to 1.58-bit precision, achieving strong performance with relatively limited additional training data.

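For context, the edit adds a sigmoid option alongside the linear lambda scheduler used to warm up quantization. Below is a minimal sketch of what such a schedule and the blended quantization step can look like; the function names (`linear_lambda`, `sigmoid_lambda`, `warmup_quantize`), the steepness parameter `k`, and the absmean ternary rounding are illustrative assumptions, not this repository's actual code:

```python
import math

import torch


def linear_lambda(step: int, warmup_steps: int) -> float:
    """Linear schedule: lambda ramps from 0 to 1 over the warmup steps."""
    return min(step / warmup_steps, 1.0)


def sigmoid_lambda(step: int, warmup_steps: int, k: float = 10.0) -> float:
    """Sigmoid schedule: a smooth S-shaped ramp centered mid-warmup."""
    if step >= warmup_steps:
        return 1.0
    return 1.0 / (1.0 + math.exp(-k * (step / warmup_steps - 0.5)))


def warmup_quantize(w: torch.Tensor, lam: float) -> torch.Tensor:
    """Blend full-precision weights with their ternary (1.58-bit) version.

    At lam=0 the forward pass uses full-precision weights; at lam=1 it uses
    fully quantized weights, so training eases into quantization.
    """
    # Absmean scaling followed by rounding to the ternary set {-1, 0, +1}.
    scale = w.abs().mean().clamp(min=1e-5)
    w_q = (w / scale).round().clamp(-1, 1) * scale
    # Straight-through-style blend: gradients flow through the fp weights.
    return w + lam * (w_q - w).detach()
```

In use, `lam = sigmoid_lambda(step, warmup_steps)` (or the linear variant) would be recomputed each training step and applied to every quantized linear layer, so early updates see mostly full-precision weights and later updates see fully ternary ones.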