---
license: llama3
language:
- fa
- en
library_name: transformers
tags:
- LLM
- llama-3
- PartAI
- conversational
---
# Model Details
The Dorna models are a family of decoder-only models, specifically trained/fine-tuned on Persian data, developed by [Part AI](https://partdp.ai/). As an initial release, an 8B instruct model from this family is made available.
Dorna-Llama3-8B-Instruct is built on the [Meta Llama 3 Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.
In this repo, we provide the `bf16` model and quantized models in the GGUF format, including `Q2_K`, `Q3_K_L`, `Q3_K_M`, `Q3_K_S`, `Q4_0`, `Q4_1`, `Q4_K_M`, `Q4_K_S`, `Q5_0`, `Q5_1`, `Q5_K_M`, `Q5_K_S`, `Q6_K` and `Q8_0`.
An in-depth report comparing the quantization methods, including several performance charts, is available [here](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9).
<style>
table td {
padding-right: 30px;
padding-left: 30px;
color: #000;
}
th {
color: #000;
}
a {
color: #000;
}
</style>
<table style="border-spacing: 30px; text-align: center;">
<tr style="background-color:#f2f2f2;">
<th>Name</th>
<th>Quant Method</th>
<th>Bits</th>
<th>Memory</th>
</tr>
<tr style="background-color:#e0f7fa; " >
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q2_K.gguf">dorna-llama3-8b-instruct.Q2_K.gguf</a></td>
<td>Q2_K</td>
<td>2</td>
<td>3.2 GB</td>
</tr>
<tr style="background-color:#e8f5e9;">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q3_K_L.gguf">dorna-llama3-8b-instruct.Q3_K_L.gguf</a></td>
<td>Q3_K_L</td>
<td>3</td>
<td>4.3 GB</td>
</tr>
<tr style="background-color:#e8f5e9;">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q3_K_M.gguf">dorna-llama3-8b-instruct.Q3_K_M.gguf</a></td>
<td>Q3_K_M</td>
<td>3</td>
<td>4.1 GB</td>
</tr>
<tr style="background-color:#e8f5e9;">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q3_K_S.gguf">dorna-llama3-8b-instruct.Q3_K_S.gguf</a></td>
<td>Q3_K_S</td>
<td>3</td>
<td>3.7 GB</td>
</tr>
<tr style="background-color:#fff3e0;">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q4_0.gguf">dorna-llama3-8b-instruct.Q4_0.gguf</a></td>
        <td>Q4_0</td>
<td>4</td>
<td>4.7 GB</td>
</tr>
<tr style="background-color:#fff3e0;">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q4_1.gguf">dorna-llama3-8b-instruct.Q4_1.gguf</a></td>
<td>Q4_1</td>
<td>4</td>
<td>5.2 GB</td>
</tr>
<tr style="background-color:#fff3e0;">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q4_K_M.gguf">dorna-llama3-8b-instruct.Q4_K_M.gguf</a></td>
<td>Q4_K_M</td>
<td>4</td>
<td>4.9 GB</td>
</tr>
<tr style="background-color:#fff3e0;">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q4_K_S.gguf">dorna-llama3-8b-instruct.Q4_K_S.gguf</a></td>
<td>Q4_K_S</td>
<td>4</td>
<td>4.7 GB</td>
</tr>
<tr style="background-color:#ffe0b2; ">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q5_0.gguf">dorna-llama3-8b-instruct.Q5_0.gguf</a></td>
<td>Q5_0</td>
<td>5</td>
<td>5.6 GB</td>
</tr>
<tr style="background-color:#ffe0b2; ">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q5_1.gguf">dorna-llama3-8b-instruct.Q5_1.gguf</a></td>
<td>Q5_1</td>
<td>5</td>
<td>6.1 GB</td>
</tr>
<tr style="background-color:#ffe0b2; ">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q5_K_M.gguf">dorna-llama3-8b-instruct.Q5_K_M.gguf</a></td>
<td>Q5_K_M</td>
<td>5</td>
        <td>5.7 GB</td>
</tr>
<tr style="background-color:#ffe0b2; ">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q5_K_S.gguf">dorna-llama3-8b-instruct.Q5_K_S.gguf</a></td>
<td>Q5_K_S</td>
<td>5</td>
<td>5.6 GB</td>
</tr>
<tr style="background-color:#e1bee7; ">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q6_K.gguf">dorna-llama3-8b-instruct.Q6_K.gguf</a></td>
<td>Q6_K</td>
<td>6</td>
<td>6.6 GB</td>
</tr>
<tr style="background-color:#c5cae9;">
<td style="text-align: left;">
<a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.Q8_0.gguf">dorna-llama3-8b-instruct.Q8_0.gguf</a>
<span style="background-color: #4CAF50; color: white; padding: 2px 8px; margin-left: 10px; border-radius: 4px; font-size: 12px;">Recommended</span>
</td>
<td>Q8_0</td>
<td>8</td>
<td>8.5 GB</td>
</tr>
<tr style="background-color:#b2dfdb;">
<td style="text-align: left;"><a href="https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/blob/main/dorna-llama3-8b-instruct.bf16.gguf">dorna-llama3-8b-instruct.bf16.gguf</a></td>
<td>None</td>
<td>16</td>
<td>16.2 GB</td>
</tr>
</table>
## Requirements
We recommend using [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python), the Python bindings for [`llama.cpp`](https://github.com/ggerganov/llama.cpp), installed with the following command:
```bash
pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.2.78/llama_cpp_python-0.2.78-cp310-cp310-linux_x86_64.whl
```
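The wheel above targets Python 3.10 on Linux x86_64. If it does not match your environment, a source build from PyPI should work as well. This is only a sketch: the CMake flag for GPU offloading varies across `llama-cpp-python` versions, so check its README for the exact name for your release:
```bash
# CPU-only build from source
pip install llama-cpp-python

# Example GPU-enabled build (flag name may differ for your version)
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
```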
## How to use
Instead of cloning the repository, which may be inefficient, you can manually download the required GGUF file or use `huggingface-cli` (`pip install huggingface_hub`) as demonstrated below:
```bash
huggingface-cli login --token $HUGGING_FACE_HUB_TOKEN
huggingface-cli download PartAI/Dorna-Llama3-8B-Instruct-GGUF dorna-llama3-8b-instruct.Q8_0.gguf --local-dir . --local-dir-use-symlinks False
```
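Alternatively, the same file can be fetched from Python with the `huggingface_hub` library. A minimal sketch using `hf_hub_download` (the filename below is the recommended `Q8_0` file from the table above):
```python
from huggingface_hub import hf_hub_download

# Download a single GGUF file from the repo into the current directory
# and return its local path.
model_path = hf_hub_download(
    repo_id="PartAI/Dorna-Llama3-8B-Instruct-GGUF",
    filename="dorna-llama3-8b-instruct.Q8_0.gguf",
    local_dir=".",
)
print(model_path)
```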
```python
from llama_cpp import Llama

llm = Llama(
    model_path="dorna-llama3-8b-instruct.Q8_0.gguf",
    chat_format="llama-3",  # apply the Llama 3 chat template
    n_gpu_layers=-1,        # offload all layers to the GPU; set to 0 for CPU-only
    n_ctx=2048,             # context window size
)

messages = [
    {"role": "system", "content": "You are a helpful Persian assistant. Please answer questions in the asked language."},
    # Persian for: "Is A4 paper bigger, or A5?"
    {"role": "user", "content": "کاغذ A4 بزرگ تر است یا A5؟"},
]

result = llm.create_chat_completion(
    messages=messages,
    top_p=0.85,
    temperature=0.1,
)

print(result)
```
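`create_chat_completion` returns an OpenAI-style response dict. To print only the assistant's reply rather than the full object, index into the first choice, as in this minimal sketch continuing the example above:
```python
# The generated text lives in the first choice's message content.
reply = result["choices"][0]["message"]["content"]
print(reply)
```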
## Contact us
If you have any questions about this model, you can reach us via the [Community tab](https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/discussions) on Hugging Face.