ThomasBaruzier/Meta-Llama-3.1-70B-Instruct-GGUF

ThomasBaruzier committed on Aug 9

Commit 38571e2 • 1 Parent(s): cbea6b0

Update README.md

Files changed (1)

README.md +1 -1

@@ -198,7 +198,7 @@ extra_gated_button_content: Submit
 Using llama.cpp commit [b5e9546](https://github.com/ggerganov/llama.cpp/commit/b5e95468b1676e1e5c9d80d1eeeb26f542a38f42) for quantization, featuring llama 3.1 rope scaling factors. This fixes low-quality issues when using 8-128k context lengths.
-Original model: [https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct)
+Original model: https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct
 All quants were made using the imatrix option and Bartowski's [calibration file](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8).