---
base_model:
- unsloth/llama-2-7b-bnb-4bit
- hermeschen1116/response_generator_for_emotion_chat_bot
library_name: peft
license: apache-2.0
datasets:
- Shotaro30678/rlhf-RG-trl-style-v3
tags:
- trl
- unsloth
language:
- en
pipeline_tag: text-generation
---
# Response Generator for [Emotion Chat Bot](https://github.com/hermeschen1116/chat-bot)
## Model description
This model is a DPO fine-tuned version of [hermeschen1116/response_generator_for_emotion_chat_bot](https://huggingface.co/hermeschen1116/response_generator_for_emotion_chat_bot), trained on [Shotaro30678/rlhf-RG-trl-style-v3](https://huggingface.co/datasets/Shotaro30678/rlhf-RG-trl-style-v3), a modified version of [li2017dailydialog/daily_dialog](https://huggingface.co/datasets/li2017dailydialog/daily_dialog).
## Intended uses & limitations
The model is aligned with TRL's DPO trainer (RLHF-style preference optimization) so that its responses are more precise and consistent.
## Model performance
### Model Comparison
**Sentiment Score** (evaluated with [Shotaro30678/emotion_text_classifier_on_dd_v1](https://huggingface.co/Shotaro30678/emotion_text_classifier_on_dd_v1)):
| **Metric** | **DPO Trained Model** | **SFT Model (Reference)** |
|--------------|:----------------------:|:--------------------------:|
| **Accuracy** | 0.851 | 0.788 |
| **F1-score** | 0.8564 | 0.7975 |
**Gibberish Distribution** (evaluated with [madhurjindal/autonlp-Gibberish-Detector-492513457](https://huggingface.co/madhurjindal/autonlp-Gibberish-Detector-492513457)):
| **Category** | **DPO Trained Model** | **SFT Model (Reference)** |
|---------------------|:----------------------:|:--------------------------:|
| **Clean** | 882 | 898 |
| **Mild Gibberish** | 94 | 58 |
| **Word Salad** | 21 | 33 |
| **Noise** | 3 | 11 |
**Cut-Off Output:**
| **Output Type** | **DPO Trained Model** | **SFT Model (Reference)** |
|---------------------|:----------------------:|:--------------------------:|
| **Complete Output** | 985 | 975 |
| **Incomplete Output** | 15 | 25 |
All results are computed on the test split of [hermeschen1116/daily_dialog_for_RG](https://huggingface.co/datasets/hermeschen1116/daily_dialog_for_RG).

**Generation config used for testing:**
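The accuracy and F1 numbers above come from running the emotion classifier over the generated replies and comparing its predictions against the reference emotions. A minimal sketch of the metric computation (the label lists are placeholders; in practice they come from the classifier and the test split):

```python
def accuracy_and_macro_f1(predicted, reference):
    """Compute accuracy and macro-averaged F1 over emotion labels."""
    assert len(predicted) == len(reference) and reference
    pairs = list(zip(predicted, reference))
    accuracy = sum(p == r for p, r in pairs) / len(pairs)

    f1_scores = []
    for label in set(predicted) | set(reference):
        tp = sum(p == label and r == label for p, r in pairs)
        fp = sum(p == label and r != label for p, r in pairs)
        fn = sum(p != label and r == label for p, r in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1_scores.append(2 * precision * recall / (precision + recall)
                         if precision + recall else 0.0)
    return accuracy, sum(f1_scores) / len(f1_scores)
```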
```python
from transformers import GenerationConfig

generation_config = GenerationConfig(
max_new_tokens=150,
min_new_tokens=5,
repetition_penalty=1.1,
top_k=3,
top_p=0.9,
pad_token_id=tokenizer.pad_token_id,
eos_token_id=tokenizer.eos_token_id,
temperature=1.0,
do_sample=True,
num_beams=1
)
```
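The cut-off counts distinguish replies that finish naturally from ones truncated at `max_new_tokens` before an end token appears. The exact criterion is not stated in the source; one plausible heuristic is to check for sentence-final punctuation:

```python
def is_complete(reply: str) -> bool:
    """Heuristic (assumption): a reply counts as complete if it ends
    with sentence-final punctuation after trailing whitespace is stripped."""
    return reply.rstrip().endswith((".", "!", "?", '"', "'"))

def count_cut_offs(replies):
    """Return (complete, incomplete) counts over a list of generated replies."""
    complete = sum(is_complete(r) for r in replies)
    return complete, len(replies) - complete
```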
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- beta=0.1,
- remove_unused_columns=False,
- num_train_epochs=3,
- gradient_checkpointing=True
All other hyperparameters remain at their defaults.
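For context, a minimal sketch of how these hyperparameters map onto TRL 0.8.6's `DPOTrainer`. Here `model`, `tokenizer`, and `train_dataset` are placeholders for the Unsloth-loaded model and the rlhf-RG-trl-style-v3 dataset, and `output_dir` is an assumed path:

```python
from transformers import TrainingArguments
from trl import DPOTrainer

training_args = TrainingArguments(
    output_dir="response_generator_DPO",  # assumed output path
    num_train_epochs=3,
    remove_unused_columns=False,
    gradient_checkpointing=True,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,  # with a PEFT adapter, the frozen base model acts as reference
    args=training_args,
    beta=0.1,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```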
### Framework versions
- Bitsandbytes 0.43.1
- Datasets 2.20.0
- PEFT 0.11.1
- Pytorch 2.3.0+cu121
- Transformers 4.42.4
- Tokenizers 0.19.1
- Trl 0.8.6
- Unsloth 2024.7 (0f2e484)
# Uploaded model
- **Developed by:** Shotaro30678
- **Finetuned from model:** hermeschen1116/response_generator_for_emotion_chat_bot
This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
# Quick sample
```python
# ResponseGeneratorPipeline comes from the libs module in the GitHub repo
from libs import ResponseGeneratorPipeline
from transformers import GenerationConfig
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Shotaro30678/response_generator_DPO",
load_in_4bit = True,
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
bot = ResponseGeneratorPipeline(
model,
tokenizer,
framework="pt",
task="conversation-generation",
num_workers=16,
torch_dtype="auto",
add_special_tokens=True,
truncation=False,
padding=True
)
conversation = [
{'content': {'dialog': '', 'emotion': ''}, 'role': 'system'},
{'content': {'dialog': 'Can you do push-ups ?', 'emotion': 'neutral'},
'role': 'user'},
{'content': {'dialog': "Of course I can . It's a piece of cake ! Believe it or not , I can do 30 push-ups a minute .",
'emotion': 'neutral'},
'role': 'assistant'},
{'content': {'dialog': "Really ? I think that's impossible !",
'emotion': 'surprise'},
'role': 'user'},
{'content': {'dialog': 'You mean 30 push-ups ?', 'emotion': 'neutral'},
'role': 'assistant'},
{'content': {'dialog': 'Yeah !', 'emotion': 'neutral'}, 'role': 'user'},
{'content': {'dialog': '', 'emotion': 'neutral'}, 'role': 'assistant'}
]
generation_config = GenerationConfig(
max_new_tokens=150,
min_new_tokens=5,
repetition_penalty=1.1,
top_k=3,
top_p=0.9,
pad_token_id=tokenizer.pad_token_id,
eos_token_id=tokenizer.eos_token_id,
temperature=1.0,
do_sample=True,
num_beams=1
)
result = bot(conversation, generation_config=generation_config)
print(result[0]["generated_text"][-1]["content"]["dialog"])
```
**output:**
```
30 push-ups in a row?
``` |