---
base_model:
- unsloth/llama-2-7b-bnb-4bit
- hermeschen1116/response_generator_for_emotion_chat_bot
library_name: peft
license: apache-2.0
datasets:
- Shotaro30678/rlhf-RG-trl-style-v3
tags:
- trl
- unsloth
language:
- en
pipeline_tag: text-generation
---
# Response Generator for [Emotion Chat Bot](https://github.com/hermeschen1116/chat-bot)
## Model description
This model is a DPO fine-tuned version of [hermeschen1116/response_generator_for_emotion_chat_bot](https://huggingface.co/hermeschen1116/response_generator_for_emotion_chat_bot), trained on [Shotaro30678/rlhf-RG-trl-style-v3](https://huggingface.co/datasets/Shotaro30678/rlhf-RG-trl-style-v3), a self-modified version of [daily_dialog](https://huggingface.co/datasets/li2017dailydialog/daily_dialog).
## Intended uses & limitations
The model is fine-tuned with TRL's DPO trainer (an RLHF technique) so that its responses are more precise and consistent than those of the SFT baseline.
## Model performance
### Model Comparison
**Sentiment Score** (labels assigned by [Shotaro30678/emotion_text_classifier_on_dd_v1](https://huggingface.co/Shotaro30678/emotion_text_classifier_on_dd_v1)):

| **Metric** | **DPO Trained Model** | **SFT Model (Reference)** |
|--------------|:----------------------:|:--------------------------:|
| **Accuracy** | 0.851 | 0.788 |
| **F1-score** | 0.8564 | 0.7975 |

**Gibberish Distribution** (labels assigned by [madhurjindal/autonlp-Gibberish-Detector-492513457](https://huggingface.co/madhurjindal/autonlp-Gibberish-Detector-492513457)):

| **Category** | **DPO Trained Model** | **SFT Model (Reference)** |
|---------------------|:----------------------:|:--------------------------:|
| **Clean** | 882 | 898 |
| **Mild Gibberish** | 94 | 58 |
| **Word Salad** | 21 | 33 |
| **Noise** | 3 | 11 |

**Cut-Off Output:**

| **Output Type** | **DPO Trained Model** | **SFT Model (Reference)** |
|---------------------|:----------------------:|:--------------------------:|
| **Complete Output** | 985 | 975 |
| **Incomplete Output** | 15 | 25 |

All results are measured on the test split of [hermeschen1116/daily_dialog_for_RG](https://huggingface.co/datasets/hermeschen1116/daily_dialog_for_RG).
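To illustrate how the sentiment score is obtained (this is not the project's actual evaluation script), each generated response is labeled by the emotion classifier above and compared against the target emotion. A minimal pure-Python sketch of the metric step, assuming `predictions` and `references` are lists of emotion labels and that the reported F1 is macro-averaged (the card does not specify the averaging):

```python
def accuracy(predictions, references):
    # fraction of generated responses whose classified emotion matches the target
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

def macro_f1(predictions, references):
    # unweighted mean of per-class F1 over every emotion label seen in the data
    labels = set(predictions) | set(references)
    scores = []
    for label in labels:
        tp = sum(p == r == label for p, r in zip(predictions, references))
        fp = sum(p == label != r for p, r in zip(predictions, references))
        fn = sum(r == label != p for p, r in zip(predictions, references))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return sum(scores) / len(scores)
```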
**Generation config used for testing:**
```python
generation_config = GenerationConfig(
    max_new_tokens=150,
    min_new_tokens=5,
    repetition_penalty=1.1,
    top_k=3,
    top_p=0.9,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
    temperature=1.0,
    do_sample=True,
    num_beams=1,
)
```
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- beta=0.1
- remove_unused_columns=False
- num_train_epochs=3
- gradient_checkpointing=True

All other hyperparameters were left at their defaults.
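For context, a minimal sketch of how these hyperparameters plug into TRL 0.8.6's `DPOTrainer` (the base model, dataset, and library versions follow this card; everything else, including `output_dir` and the absence of an explicit reference model, is assumed):

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import DPOTrainer
from unsloth import FastLanguageModel

# SFT checkpoint that this card says the DPO run starts from
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="hermeschen1116/response_generator_for_emotion_chat_bot",
    load_in_4bit=True,
)

train_dataset = load_dataset("Shotaro30678/rlhf-RG-trl-style-v3", split="train")

trainer = DPOTrainer(
    model,
    ref_model=None,  # with a PEFT adapter, TRL uses the frozen base model as reference
    beta=0.1,        # from the hyperparameters above
    args=TrainingArguments(
        num_train_epochs=3,
        gradient_checkpointing=True,
        remove_unused_columns=False,
        output_dir="outputs",  # assumed
    ),
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```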
### Framework versions
- Bitsandbytes 0.43.1
- Datasets 2.20.0
- PEFT 0.11.1
- Pytorch 2.3.0+cu121
- Transformers 4.42.4
- Tokenizers 0.19.1
- Trl 0.8.6
- unsloth 2024.7 0f2e484
# Uploaded model
- **Developed by:** Shotaro30678
- **Fine-tuned from model:** hermeschen1116/response_generator_for_emotion_chat_bot
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
# Quick sample
```python
# ResponseGeneratorPipeline is provided in the project's GitHub repo (libs/)
from libs import ResponseGeneratorPipeline
from transformers import GenerationConfig
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Shotaro30678/response_generator_DPO",  # the DPO-trained model
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
bot = ResponseGeneratorPipeline(
    model,
    tokenizer,
    framework="pt",
    task="conversation-generation",
    num_workers=16,
    torch_dtype="auto",
    add_special_tokens=True,
    truncation=False,
    padding=True,
)
conversation = [
    {"role": "system", "content": {"dialog": "", "emotion": ""}},
    {"role": "user", "content": {"dialog": "Can you do push-ups ?", "emotion": "neutral"}},
    {"role": "assistant", "content": {"dialog": "Of course I can . It's a piece of cake ! Believe it or not , I can do 30 push-ups a minute .", "emotion": "neutral"}},
    {"role": "user", "content": {"dialog": "Really ? I think that's impossible !", "emotion": "surprise"}},
    {"role": "assistant", "content": {"dialog": "You mean 30 push-ups ?", "emotion": "neutral"}},
    {"role": "user", "content": {"dialog": "Yeah !", "emotion": "neutral"}},
    {"role": "assistant", "content": {"dialog": "", "emotion": "neutral"}},
]
generation_config = GenerationConfig(
    max_new_tokens=150,
    min_new_tokens=5,
    repetition_penalty=1.1,
    top_k=3,
    top_p=0.9,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
    temperature=1.0,
    do_sample=True,
    num_beams=1,
)
result = bot(conversation, generation_config=generation_config)
print(result[0]["generated_text"][-1]["content"]["dialog"])
```
**output:**
```
30 push-ups in a row?
```