Update README.md: add PG19 evaluation results
README.md
@@ -80,6 +80,18 @@ Their personalities, so diverse,
 Their charm, a gift, that's forever told.
 ```
 
+## Model Evaluation
+
+We evaluate the model on the [PG19 dataset](https://huggingface.co/datasets/pg19) and compare its perplexity with that of [Llama-2-7b-chat](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf).
+The results are summarized below (note that the perplexity is normalized following the protocol [here](https://together.ai/blog/llama-2-7b-32k)).
+
+| Model | 2K Seq | 4K Seq | 8K Seq | 16K Seq | 32K Seq |
+| -------- | ------- | ------- | ------- | ------- | ------- |
+| LLaMA-2-7B-Chat (Meta) | 1.844 | 1.833 | N/A | N/A | N/A |
+| LLaMA-2-7B-32K-Chat (ours) | 1.813 | 1.798 | 1.781 | 1.778 | 1.772 |
+
+We observe that LLaMA-2-7B-32K-Chat attains perplexity comparable to, and even slightly better than, the original LLaMA-2-7B-Chat model.
+
 ## Limitations and Bias
 
 As with all language models, LLaMA-2-7B-32K-Chat may generate incorrect or biased content. It's important to keep this in mind when using the model.
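
For readers who want a rough reproduction of this measurement, below is a minimal sketch of chunked perplexity evaluation on PG19 with Hugging Face Transformers. It is an illustration only, not the exact normalization protocol from the linked blog post; the model id, the number of books sampled, and the non-overlapping chunking scheme are all assumptions.

```python
# Sketch: chunked perplexity on PG19 at a fixed sequence length.
# Assumes `transformers`, `datasets`, `accelerate`, and `torch` are installed.
import math
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/LLaMA-2-7B-32K"  # assumed model id
seq_len = 4096                                # e.g. the "4K Seq" column

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

ds = load_dataset("pg19", split="test")

total_nll, total_tokens = 0.0, 0
for book in ds.select(range(4)):  # a few books for a quick estimate
    ids = tokenizer(book["text"], return_tensors="pt").input_ids[0]
    # Score non-overlapping chunks of `seq_len` tokens.
    for start in range(0, ids.size(0) - seq_len, seq_len):
        chunk = ids[start : start + seq_len].unsqueeze(0).to(model.device)
        with torch.no_grad():
            out = model(chunk, labels=chunk)
        # `out.loss` is the mean NLL over the seq_len - 1 predicted tokens.
        total_nll += out.loss.item() * (seq_len - 1)
        total_tokens += seq_len - 1

print(f"perplexity @ {seq_len}: {math.exp(total_nll / total_tokens):.3f}")
```

Raw token-level perplexity from a loop like this will not match the table exactly, since the numbers above are normalized per the linked protocol; the sketch is only meant to show the shape of the evaluation.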