---
datasets:
- Intel/orca_dpo_pairs
base_model:
- Qwen/Qwen2.5-0.5B-Instruct
license: apache-2.0
---
# Fine-tuned Qwen/Qwen2.5-0.5B-Instruct Model
## Model Overview
This is a fine-tuned version of the Qwen/Qwen2.5-0.5B-Instruct model, trained on the Intel/orca_dpo_pairs preference dataset using DPO (Direct Preference Optimization) with LoRA (Low-Rank Adaptation) adapters.
**Note**: This fine-tuning was done following the instructions in [this blog](https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac).
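For context, DPO trains directly on preference pairs, without a separate reward model, by minimizing (Rafailov et al., 2023):

$$
\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_{\text{ref}}) = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\!\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right)\right]
$$

where \\(y_w\\) and \\(y_l\\) are the chosen and rejected responses for prompt \\(x\\), \\(\pi_{\text{ref}}\\) is the frozen base model, and \\(\beta\\) controls how far the fine-tuned policy may drift from it.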
## Fine-tuning Details
- **Base Model**: Qwen/Qwen2.5-0.5B-Instruct
- **Dataset**: Intel/orca_dpo_pairs
- **Fine-tuning Method**: DPO + LoRA
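
The exact training script is not published here; the following is a minimal sketch of a DPO + LoRA run with `trl` and `peft` in the spirit of the linked blog post. All hyperparameters (LoRA rank, learning rate, β, batch size) are illustrative assumptions, not the values used for this model.

```python
# Illustrative DPO + LoRA setup, written against recent trl/peft APIs
# (older trl versions take tokenizer= instead of processing_class=).
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "Qwen/Qwen2.5-0.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Intel/orca_dpo_pairs has columns: system, question, chosen, rejected.
# DPOTrainer expects prompt / chosen / rejected.
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")
dataset = dataset.rename_column("question", "prompt").remove_columns(["system"])

peft_config = LoraConfig(  # LoRA adapters on the attention projections (illustrative)
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

args = DPOConfig(
    output_dir="Qwen-DPO",
    per_device_train_batch_size=4,
    learning_rate=5e-5,
    beta=0.1,  # strength of the implicit KL penalty toward the reference model
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,
    peft_config=peft_config,  # with a peft_config, trl keeps the frozen base as reference
)
trainer.train()
```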
## Usage Instructions
### Install Dependencies
Before using this model, make sure you have the following dependencies installed:
```bash
pip install transformers datasets torch
```
### Load the Model
```python
import transformers
from transformers import AutoTokenizer

# Load the tokenizer for the fine-tuned model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("co-gy/Qwen-DPO")

messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"},
]

# Render the conversation with Qwen's chat template and append the
# assistant turn marker so the model continues as the assistant.
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

pipeline = transformers.pipeline(
    "text-generation",
    model="co-gy/Qwen-DPO",
    tokenizer=tokenizer,
)

sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=200,  # total length including the prompt; use max_new_tokens to cap only the reply
)
print(sequences[0]["generated_text"])
```
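
If you prefer to call `generate` directly rather than use the pipeline helper, an equivalent sketch looks like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("co-gy/Qwen-DPO")
model = AutoModelForCausalLM.from_pretrained("co-gy/Qwen-DPO")

messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        max_new_tokens=200,
    )
# Decode only the newly generated assistant reply, skipping the prompt tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```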