Update README.md

- Built with Meta Llama 3
- Quantized by [Astronomer](https://astronomer.io)

## MUST READ: Very Important!! Note About Untrained Special Tokens in Llama 3 Base (Non-instruct) Models & Fine-tuning Llama 3 Base

- Special tokens such as the ones used for instruct are undertrained in Llama 3 base models (discovered by Daniel Han: https://twitter.com/danielhanchen/status/1781395882925343058).
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/655ad0f8727df37c77a09cb9/1U2rRrx60p1pNeeAZw8Rd.png)
- A patch function is under way; until this problem is addressed, fine-tuning this model for instruction following may produce `NaN` gradients. One known workaround is sketched after this list.
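
A common workaround for the NaN-gradient issue is to re-initialize the untrained (near-zero) embedding rows to the mean of the trained rows before fine-tuning. The snippet below is a minimal, unofficial sketch of that idea using `transformers`; the model ID and the `1e-5` norm threshold are illustrative assumptions, not part of this repo or the upcoming patch.

```python
# Unofficial sketch: re-initialize near-zero (untrained) embedding rows to the
# mean of the trained rows so they do not blow up gradients during fine-tuning.
# The model id and the norm threshold are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16
)

with torch.no_grad():
    emb = model.get_input_embeddings().weight         # [vocab_size, hidden_size]
    untrained = emb.norm(dim=-1) < 1e-5               # rows left essentially zero at init
    if untrained.any():
        emb[untrained] = emb[~untrained].mean(dim=0)  # fill with mean of trained rows
```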

## Important Note About Serving with vLLM & oobabooga/text-generation-webui

- When serving this model with vLLM, make sure all requests include `"stop_token_ids": [128001, 128009]` to temporarily address the non-stop generation issue; see the sketch after this list.
- vLLM does not yet respect `generation_config.json`.
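
For example, with vLLM's offline `LLM` API the stop token IDs can be passed via `SamplingParams`. A minimal sketch; the model path and prompt are placeholders:

```python
# Minimal sketch using vLLM's offline API; replace the model path with
# wherever you downloaded this quantized model.
from vllm import LLM, SamplingParams

llm = LLM(model="path/to/llama-3-8b-gptq")  # placeholder path
params = SamplingParams(
    max_tokens=256,
    stop_token_ids=[128001, 128009],  # <|end_of_text|> and <|eot_id|>
)
outputs = llm.generate(["Why is the sky blue?"], params)
print(outputs[0].outputs[0].text)
```

The same IDs can be sent through vLLM's OpenAI-compatible server by including `stop_token_ids` in the request body, until vLLM reads them from `generation_config.json` itself.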