	---
	language:
	- en
	license: mit
	library_name: transformers
	tags:
	- reasoning
	- axolotl
	- r1
	base_model:
	- meta-llama/Llama-3.2-3B-Instruct
	datasets:
	- ServiceNow-AI/R1-Distill-SFT
	pipeline_tag: text-generation
	model-index:
	- name: DeepSeek-R1-Distill-Llama-3B
	  results:
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: IFEval (0-Shot)
	      type: HuggingFaceH4/ifeval
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: inst_level_strict_acc and prompt_level_strict_acc
	      value: 70.93
	      name: strict accuracy
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/DeepSeek-R1-Distill-Llama-3B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: BBH (3-Shot)
	      type: BBH
	      args:
	        num_few_shot: 3
	    metrics:
	    - type: acc_norm
	      value: 21.45
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/DeepSeek-R1-Distill-Llama-3B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MATH Lvl 5 (4-Shot)
	      type: hendrycks/competition_math
	      args:
	        num_few_shot: 4
	    metrics:
	    - type: exact_match
	      value: 20.92
	      name: exact match
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/DeepSeek-R1-Distill-Llama-3B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: GPQA (0-shot)
	      type: Idavidrein/gpqa
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: acc_norm
	      value: 1.45
	      name: acc_norm
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/DeepSeek-R1-Distill-Llama-3B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MuSR (0-shot)
	      type: TAUR-Lab/MuSR
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: acc_norm
	      value: 2.91
	      name: acc_norm
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/DeepSeek-R1-Distill-Llama-3B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MMLU-PRO (5-shot)
	      type: TIGER-Lab/MMLU-Pro
	      config: main
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 21.98
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/DeepSeek-R1-Distill-Llama-3B
	      name: Open LLM Leaderboard
	---

	# DeepSeek-R1-Distill-Llama-3B

	This model is a distilled version of DeepSeek-R1, obtained by fine-tuning Llama-3.2-3B-Instruct on the ServiceNow-AI/R1-Distill-SFT dataset.

	[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

	<details><summary>See axolotl config</summary>

	```yaml
	base_model: unsloth/Llama-3.2-3B-Instruct
	model_type: AutoModelForCausalLM
	tokenizer_type: AutoTokenizer

	load_in_8bit: true
	load_in_4bit: false
	strict: false

	chat_template: llama3
	datasets:
	  - path: ./custom_dataset.json
	    type: chat_template
	    conversation: chatml
	    ds_type: json

	add_bos_token: true
	add_eos_token: true
	use_default_system_prompt: false

	special_tokens:
	  bos_token: "<|begin_of_text|>"
	  eos_token: "<|eot_id|>"
	  pad_token: "<|eot_id|>"
	  additional_special_tokens:
	    - "<|begin_of_text|>"
	    - "<|eot_id|>"

	adapter: lora
	lora_model_dir:
	lora_r: 16
	lora_alpha: 32
	lora_dropout: 0.1
	lora_target_linear: true

	hub_model_id: suayptalha/DeepSeek-R1-Distill-Llama-3B

	sequence_len: 2048
	sample_packing: false
	pad_to_sequence_len: true
	micro_batch_size: 2
	gradient_accumulation_steps: 8
	num_epochs: 1
	learning_rate: 2e-5
	optimizer: paged_adamw_8bit
	lr_scheduler: cosine

	train_on_inputs: false
	group_by_length: false
	bf16: false
	fp16: true
	tf32: false

	gradient_checkpointing: true
	flash_attention: false

	logging_steps: 50
	warmup_steps: 100
	saves_per_epoch: 1

	output_dir: ./finetune-sft-results
	save_safetensors: true
	```
	</details><br>
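The config above combines `lr_scheduler: cosine`, `warmup_steps: 100`, and `learning_rate: 2e-5`. As a rough illustration of what such a schedule looks like, here is a minimal sketch of linear warmup followed by cosine decay. The function name `lr_at_step` and the value `total_steps=1000` are hypothetical (the real number of training steps depends on the dataset size and is not stated here), and the trainer's actual scheduler may differ in detail.

```python
import math


def lr_at_step(step, base_lr=2e-5, warmup_steps=100, total_steps=1000):
    """Linear warmup to base_lr, then cosine decay towards zero.

    total_steps is a placeholder; it is not taken from the training run.
    """
    if step < warmup_steps:
        # Warmup: scale the learning rate linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: progress runs from 0.0 (end of warmup) to 1.0 (last step).
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 50, 100, 550, 1000):
    print(step, f"{lr_at_step(step):.2e}")
```

The learning rate peaks at the end of warmup and falls to half its peak midway through the decay phase.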

	# Prompt Template

	Use the Llama 3 prompt template with this model:

	### Llama3

	```
	<|start_header_id|>system<|end_header_id|>
	{system}<|eot_id|>

	<|start_header_id|>user<|end_header_id|>
	{user}<|eot_id|>

	<|start_header_id|>assistant<|end_header_id|>
	{assistant}<|eot_id|>
	```
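To make the layout above concrete, here is a minimal sketch that fills in the template by hand. The helper `build_llama3_prompt` is hypothetical and mirrors only the layout shown above. The exact whitespace, and any leading `<|begin_of_text|>` token, are decided by the tokenizer's own chat template, so in practice `tokenizer.apply_chat_template` is the safer choice.

```python
def build_llama3_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the layout shown above.

    Illustrative only: the tokenizer's chat template is the authoritative format.
    """
    parts = []
    for message in messages:
        # Each turn is a role header followed by the content and an end-of-turn token.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n"
            f"{message['content']}<|eot_id|>\n\n"
        )
    if add_generation_prompt:
        # Leave an open assistant header so the model writes the next turn.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n")
    return "".join(parts)


prompt = build_llama3_prompt(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Which one is larger? 9.11 or 9.9?"},
    ]
)
print(prompt)
```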

	## Example usage:

	```py
	from transformers import AutoModelForCausalLM, AutoTokenizer

	MODEL_ID = "suayptalha/DeepSeek-R1-Distill-Llama-3B"

	# device_map="auto" places the weights on the available device(s).
	model = AutoModelForCausalLM.from_pretrained(
	    MODEL_ID,
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

	SYSTEM_PROMPT = """Respond in the following format:
	<think>
	You should reason between these tags.
	</think>

	Answer goes here...

	Always use <think> </think> tags even if they are not necessary.
	"""

	messages = [
	    {"role": "system", "content": SYSTEM_PROMPT},
	    {"role": "user", "content": "Which one is larger? 9.11 or 9.9?"},
	]

	# Apply the chat template and move the tensors to the model's device
	# rather than hard-coding "cuda".
	inputs = tokenizer.apply_chat_template(
	    messages,
	    tokenize=True,
	    add_generation_prompt=True,
	    return_dict=True,
	    return_tensors="pt",
	).to(model.device)

	# Sampling must be enabled for temperature to have an effect.
	output = model.generate(
	    **inputs,
	    max_new_tokens=256,
	    do_sample=True,
	    temperature=0.7,
	    use_cache=True,
	)

	# skip_special_tokens=False keeps the template tokens visible in the output.
	decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
	print(decoded_output)
	```

	## Output:
	```
	<think>
	First, I need to compare the two numbers 9.11 and 9.9.

	Next, I'll analyze each number. The first digit after the decimal point in 9.11 is 1, and in 9.9, it's 9.

	Since 9 is greater than 1, 9.9 is larger than 9.11.
	</think>

	To determine which number is larger, let's compare the two numbers:

	9.11 and 9.9

	1. Identify the Decimal Places:
	- Both numbers have two decimal places.

	2. Compare the Tens Place (Right of the Decimal Point):
	- 9.11: The tens place is 1.
	- 9.9: The tens place is 9.

	3. Conclusion:
	- Since 9 is greater than 1, the number with the larger tens place is 9.9.

	Answer: 9.9 is larger than 9.11.
	```
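Because the model wraps its reasoning in `<think>` tags, it is often convenient to separate the reasoning from the final answer. The sketch below does this with a regular expression. The helper name `split_reasoning` is hypothetical, and it assumes it is given only the newly generated text: the suggested system prompt itself contains `<think>` tags, so running it on the full decoded sequence would match the prompt first.

```python
import re

# Non-greedy match so only the first <think>...</think> block is captured.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)
END_OF_TURN = "<|eot_id|>"


def split_reasoning(generated_text):
    """Return (reasoning, answer) from the model's generated text."""
    cleaned = generated_text.replace(END_OF_TURN, "")
    match = THINK_PATTERN.search(cleaned)
    if match is None:
        # No reasoning block found: treat everything as the answer.
        return "", cleaned.strip()
    reasoning = match.group(1).strip()
    answer = cleaned[match.end():].strip()
    return reasoning, answer


sample = (
    "<think>\nSince 9 is greater than 1, 9.9 is larger than 9.11.\n</think>\n\n"
    "Answer: 9.9 is larger than 9.11.<|eot_id|>"
)
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

One way to obtain only the generated text is to decode the output tokens that come after the prompt, that is, to slice the output at the prompt's token length before calling `tokenizer.decode`.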


	## Suggested system prompt:
	```
	Respond in the following format:
	<think>
	You should reason between these tags.
	</think>

	Answer goes here...

	Always use <think> </think> tags even if they are not necessary.
	```

	# Parameters
	- lr: 2e-5
	- epochs: 1
	- batch_size: 16 (micro_batch_size 2 × gradient_accumulation_steps 8)
	- optimizer: paged_adamw_8bit
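
The batch size of 16 listed above is consistent with the axolotl config, which sets `micro_batch_size: 2` and `gradient_accumulation_steps: 8`. A short worked example follows. The value `num_devices = 1` is an assumption, since the number of GPUs used for training is not stated.

```python
micro_batch_size = 2             # from the axolotl config
gradient_accumulation_steps = 8  # from the axolotl config
num_devices = 1                  # assumption: single-GPU training (not stated)

# Samples contributing to each optimizer step.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
print(effective_batch_size)  # prints 16
```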

	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/suayptalha__DeepSeek-R1-Distill-Llama-3B-details)

	| Metric              | Value |
	|---------------------|------:|
	| Avg.                | 23.27 |
	| IFEval (0-Shot)     | 70.93 |
	| BBH (3-Shot)        | 21.45 |
	| MATH Lvl 5 (4-Shot) | 20.92 |
	| GPQA (0-shot)       |  1.45 |
	| MuSR (0-shot)       |  2.91 |
	| MMLU-PRO (5-shot)   | 21.98 |
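
The "Avg." row matches the unweighted mean of the six benchmark scores. A quick check, using only the values from the table:

```python
scores = {
    "IFEval (0-Shot)": 70.93,
    "BBH (3-Shot)": 21.45,
    "MATH Lvl 5 (4-Shot)": 20.92,
    "GPQA (0-shot)": 1.45,
    "MuSR (0-shot)": 2.91,
    "MMLU-PRO (5-shot)": 21.98,
}

# Unweighted arithmetic mean over all benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 23.27
```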

	# Support

	<a href="https://www.buymeacoffee.com/suayptalha" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>