|
--- |
|
library_name: transformers |
|
license: apache-2.0 |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
This is a 4-bit quantized version of [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct), produced with bitsandbytes. For more information about the model, refer to its original model page.
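
Since the bitsandbytes quantization config is stored in this checkpoint, the model can be loaded directly with `transformers`. A minimal loading sketch (assumes `bitsandbytes` is installed and a CUDA GPU is available):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cmarkea/Qwen2.5-Coder-7B-Instruct-4bit"

# The 4-bit bitsandbytes quantization config is stored in the checkpoint,
# so no extra quantization arguments are needed at load time.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```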
|
|
|
## Impact on performance |
|
The table below shows the impact of 4-bit quantization on a set of models.
|
|
|
We evaluated the models using the **PoLL (Pool of LLM)** technique, with a panel of three judge models (GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet). Scores range from 0, indicating a model unsuitable for the task, to 5, representing a model that fully meets expectations. The evaluation was based on 67 instructions across four programming languages: Python, Java, JavaScript, and pseudo-code. All tests were conducted in a French-language context, and models were heavily penalized if they responded in another language, even when the response was technically correct.
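
As an illustration of how a panel score can be aggregated (the data below is hypothetical; the actual evaluation pipeline is not published here), each instruction is scored from 0 to 5 by every judge, and the final score is the mean over all judges and instructions:

```python
from statistics import mean

# Hypothetical, truncated example: one score per instruction for each judge.
panel_scores = {
    "gpt-4o":            [4, 5, 3],
    "gemini-1.5-pro":    [5, 4, 4],
    "claude-3.5-sonnet": [4, 4, 3],
}

# Final PoLL score: mean over all judges and all instructions.
final_score = mean(s for scores in panel_scores.values() for s in scores)
print(f"PoLL score: {final_score:.2f}")
```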
|
|
|
Performance Scores (on a scale of 5): |
|
| Model | Score | # params (Billion) | size (GB) | |
|
|---------------------------------------------:|:--------:|:------------------:|:---------:| |
|
| gemini-1.5-pro | 4.51 | NA | NA | |
|
| gpt-4o | 4.51 | NA | NA | |
|
| claude3.5-sonnet | 4.49 | NA | NA | |
|
| deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct | 4.24 | 15.7 | 31.4 | |
|
| meta-llama/Meta-Llama-3.1-70B-Instruct | 4.23 | 70.06 | 141.2 | |
|
| cmarkea/Meta-Llama-3.1-70B-Instruct-4bit | 4.14 | 70.06 | 35.3 | |
|
| Qwen/Qwen2.5-Coder-7B-Instruct | 4.11 | 7.62 | 15.24 | |
|
| **cmarkea/Qwen2.5-Coder-7B-Instruct-4bit** | **4.08** | **7.62** | **3.81** | |
|
| cmarkea/Mixtral-8x7B-Instruct-v0.1-4bit | 3.8 | 46.7 | 23.35 | |
|
| meta-llama/Meta-Llama-3.1-8B-Instruct | 3.73 | 8.03 | 16.06 | |
|
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 3.33 | 46.7 | 93.4 | |
|
| codellama/CodeLlama-13b-Instruct-hf | 3.33 | 13 | 26 | |
|
| codellama/CodeLlama-34b-Instruct-hf | 3.27 | 33.7 | 67.4 | |
|
| codellama/CodeLlama-7b-Instruct-hf | 3.19 | 6.74 | 13.48 | |
|
| cmarkea/CodeLlama-34b-Instruct-hf-4bit | 3.12 | 33.7 | 16.35 | |
|
| codellama/CodeLlama-70b-Instruct-hf | 1.82 | 69 | 138 | |
|
| cmarkea/CodeLlama-70b-Instruct-hf-4bit | 1.64 | 69 | 34.5 | |
|
|
|
The impact of quantization is negligible: the 4-bit model scores 4.08 versus 4.11 for the original, while its memory footprint drops from 15.24 GB to 3.81 GB.
|
|
|
## Prompt Pattern |
|
Here is a reminder of the prompt pattern used to interact with the model:
|
```verbatim |
|
<|im_start|>user\n{user_prompt_1}<|im_end|><|im_start|>assistant\n{model_answer_1}<|im_end|>... |
|
``` |
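
In practice, the tokenizer's chat template produces this format automatically. A minimal generation sketch, reusing the `model` and `tokenizer` loaded above (the prompt text is purely illustrative):

```python
messages = [
    {"role": "user", "content": "Write a Python function that reverses a string."}
]

# apply_chat_template inserts the <|im_start|> / <|im_end|> delimiters shown above.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```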