---
datasets:
- Intel/orca_dpo_pairs
base_model:
- Qwen/Qwen2.5-0.5B-Instruct
license: apache-2.0
---
# Fine-tuned Qwen/Qwen2.5-0.5B-Instruct Model
## Model Overview
This is a fine-tuned version of the Qwen/Qwen2.5-0.5B-Instruct model, trained on the Intel/orca_dpo_pairs preference dataset using DPO (Direct Preference Optimization) with LoRA (Low-Rank Adaptation) adapters.
**Note**: This fine-tuning was done following the instructions in [this blog](https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac).
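For context, DPO trains directly on preference pairs, without a separate reward model, by minimizing (Rafailov et al., 2023):

$$
\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_{\text{ref}}) = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\!\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right)\right]
$$

where \\(y_w\\) and \\(y_l\\) are the chosen and rejected responses for prompt \\(x\\), \\(\pi_{\text{ref}}\\) is the frozen base model, and \\(\beta\\) controls how far the fine-tuned policy may drift from it.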
## Fine-tuning Details
- **Base Model**: Qwen/Qwen2.5-0.5B-Instruct
- **Dataset**: Intel/orca_dpo_pairs
- **Fine-tuning Method**: DPO + LoRA
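
The exact training script is not published here; the following is a minimal sketch of a DPO + LoRA run with `trl` and `peft` in the spirit of the linked blog post. All hyperparameters (LoRA rank, learning rate, β, batch size) are illustrative assumptions, not the values used for this model.

```python
# Illustrative DPO + LoRA setup, written against recent trl/peft APIs
# (older trl versions take tokenizer= instead of processing_class=).
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "Qwen/Qwen2.5-0.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Intel/orca_dpo_pairs has columns: system, question, chosen, rejected.
# DPOTrainer expects prompt / chosen / rejected.
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")
dataset = dataset.rename_column("question", "prompt").remove_columns(["system"])

peft_config = LoraConfig(  # LoRA adapters on the attention projections (illustrative)
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

args = DPOConfig(
    output_dir="Qwen-DPO",
    per_device_train_batch_size=4,
    learning_rate=5e-5,
    beta=0.1,  # strength of the implicit KL penalty toward the reference model
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,
    peft_config=peft_config,  # with a peft_config, trl keeps the frozen base as reference
)
trainer.train()
```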
## Usage Instructions
### Install Dependencies
Before using this model, make sure you have the following dependencies installed:
```bash
pip install transformers datasets torch
```
### Load the Model
```python
import transformers
from transformers import AutoTokenizer

# Load the tokenizer for the fine-tuned model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("co-gy/Qwen-DPO")

messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"},
]

# Render the conversation with Qwen's chat template and append the
# assistant turn marker so the model continues as the assistant.
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

pipeline = transformers.pipeline(
    "text-generation",
    model="co-gy/Qwen-DPO",
    tokenizer=tokenizer,
)

sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=200,  # total length including the prompt; use max_new_tokens to cap only the reply
)
print(sequences[0]["generated_text"])
```
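
If you prefer to call `generate` directly rather than use the pipeline helper, an equivalent sketch looks like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("co-gy/Qwen-DPO")
model = AutoModelForCausalLM.from_pretrained("co-gy/Qwen-DPO")

messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        max_new_tokens=200,
    )
# Decode only the newly generated assistant reply, skipping the prompt tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```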