davidxmle committed
Commit 9a115a3
1 Parent(s): eb2df45

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -42,6 +42,11 @@ datasets:
 - Built with Meta Llama 3
 - Quantized by [Astronomer](https://astronomer.io)
 
+## MUST READ: Very Important Note About Untrained Special Tokens in Llama 3 Base (Non-Instruct) Models & Fine-tuning Llama 3 Base
+- Special tokens, such as the ones used for instruct formatting, are undertrained in Llama 3 base models (discovered by [Daniel Han](https://twitter.com/danielhanchen/status/1781395882925343058)).
+- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/655ad0f8727df37c77a09cb9/1U2rRrx60p1pNeeAZw8Rd.png)
+- A patch function is under way; fine-tuning this model for instruction following may cause `NaN` gradients unless this problem is addressed.
+
 ## Important Note About Serving with vLLM & oobabooga/text-generation-webui
 - For loading this model onto vLLM, make sure all requests include `"stop_token_ids": [128001, 128009]` to temporarily address the non-stop generation issue.
 - vLLM does not yet respect `generation_config.json`.
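The untrained-special-token issue added in this commit is typically worked around before fine-tuning by re-initializing the affected embedding rows. Below is a minimal sketch of that mean-initialization idea on a toy embedding table of plain Python lists; the function name, the near-zero threshold, and the toy data are illustrative, and a real fix would operate on the model's actual embedding and output-head weight matrices:

```python
# Toy sketch: rows of an embedding matrix that are (near) zero were never
# trained; replacing them with the mean of the trained rows avoids the
# extreme/NaN gradients they can otherwise produce during fine-tuning.

def mean_init_untrained_rows(embeddings, eps=1e-8):
    """Replace near-zero rows with the mean of the remaining (trained) rows."""
    trained = [row for row in embeddings if max(abs(x) for x in row) > eps]
    dim = len(embeddings[0])
    mean_row = [sum(row[i] for row in trained) / len(trained) for i in range(dim)]
    return [row if max(abs(x) for x in row) > eps else list(mean_row)
            for row in embeddings]

# Toy 4-token, 2-dim embedding table; token 2 is "untrained" (all zeros).
table = [[0.2, -0.1], [0.4, 0.3], [0.0, 0.0], [-0.2, 0.2]]
patched = mean_init_untrained_rows(table)
```

After patching, only the all-zero row changes; trained rows are left untouched, which is why this is safe to run unconditionally before fine-tuning.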
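The vLLM workaround above amounts to attaching `"stop_token_ids": [128001, 128009]` to every request. A sketch of such a request body for a vLLM OpenAI-compatible server follows; the model name and prompt are illustrative placeholders, and 128001 / 128009 are Llama 3's `<|end_of_text|>` and `<|eot_id|>` token ids:

```python
import json

# Build the JSON payload for a completion request. The essential field from
# the note above is "stop_token_ids"; without it, generation may not stop.
body = {
    "model": "astronomer/llama-3-8b-gptq",  # illustrative model name
    "prompt": "What is quantization?",      # illustrative prompt
    "max_tokens": 128,
    "stop_token_ids": [128001, 128009],     # temporary non-stop-generation fix
}
payload = json.dumps(body)
```

The same `stop_token_ids` field would be added to every request until the underlying `generation_config.json` handling lands in vLLM.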