amanpreetsingh459 committed on
Commit
ae9ad79
1 Parent(s): 2ee211e

Specify the correct llama model name

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -3,7 +3,7 @@ license: mit
  ---
 
  # llama-2-7b-chat_q4_quantized_cpp
- - This model contains the 4-bit quantized version of [llama2](https://github.com/facebookresearch/llama) model in cpp.
+ - This model contains the 4-bit quantized version of [llama2-7B-chat](https://github.com/facebookresearch/llama) model in cpp.
  - This can be run on a local cpu system as a cpp module *(instructions for the same are given below)*.
  - As for the testing, the model has been tested on `Linux(Ubuntu)` os with `12 GB RAM` and `core i5 processor`.
  - The performance is `roughly` **907.46 ms per token**, **1.10 tokens per second**
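
The two performance figures quoted in the README describe the same measurement, since throughput is the reciprocal of per-token latency. A minimal sketch of that conversion is shown below; the helper name `tokens_per_second` is hypothetical and is not part of the model repository or of any llama tooling.

```python
def tokens_per_second(ms_per_token: float) -> float:
    """Convert a per-token latency in milliseconds to throughput in tokens per second."""
    if ms_per_token <= 0:
        raise ValueError("ms_per_token must be positive")
    # 1000 ms in one second, divided by the time one token takes.
    return 1000.0 / ms_per_token


# Latency value taken from the README line above.
rate = tokens_per_second(907.46)
print(f"{rate:.2f} tokens per second")  # prints "1.10 tokens per second"
```

Running the same helper on a timing measured on other hardware gives a figure directly comparable to the one in the README.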