Update README.md

2910e15 verified 3 months ago

5.43 kB

	---
	base_model:
	- unsloth/llama-2-7b-bnb-4bit
	- hermeschen1116/response_generator_for_emotion_chat_bot
	library_name: peft
	license: apache-2.0
	datasets:
	- Shotaro30678/rlhf-RG-trl-style-v3
	tags:
	- trl
	- unsloth
	language:
	- en
	pipeline_tag: text-generation

	---
	# Response Generator for [Emotion Chat Bot](https://github.com/hermeschen1116/chat-bot)


	## Model description

	This model is a dpo fine-tuned version of [hermeschen1116/response_generator_for_emotion_chat_bot](https://huggingface.co/hermeschen1116/response_generator_for_emotion_chat_bot) on [Shotaro30678/rlhf-RG-trl-style-v3](https://huggingface.co/datasets/Shotaro30678/rlhf-RG-trl-style-v3), self modified version of [daily_dialog](li2017dailydialog/daily_dialog).

	## Intended uses & limitations

	Use dpo trainer to do the RLHF so that the model can be more precise and consistent.

	## Model performance

	### Model Comparison

	Sentiment Score:
	[Shotaro30678/emotion_text_classifier_on_dd_v1](https://huggingface.co/Shotaro30678/emotion_text_classifier_on_dd_v1)

	\| Metric \| DPO Trained Model \| SFT Model (Reference) \|
	\|--------------\|-----------------------\|---------------------------\|
	\| Accuracy \| 0.851 \| 0.788 \|
	\| F1-score \| 0.8564 \| 0.7975 \|

	Gibberish Distribution:
	[madhurjindal/autonlp-Gibberish-Detector-492513457](https://huggingface.co/madhurjindal/autonlp-Gibberish-Detector-492513457)

	\| Category \| DPO Trained Model \| SFT Model (Reference) \|
	\|---------------------\|-----------------------\|---------------------------\|
	\| Clean \| 882 \| 898 \|
	\| Mild Gibberish \| 94 \| 58 \|
	\| Word Salad \| 21 \| 33 \|
	\| Noise \| 3 \| 11 \|

	Cut-Off Output:

	\| Output Type \| DPO Trained Model \| SFT Model (Reference) \|
	\|---------------------\|-----------------------\|---------------------------\|
	\| Complete Output \| 985 \| 975 \|
	\| Incomplete Output \| 15 \| 25 \|

	on [hermeschen1116/daily_dialog_for_RG](https://huggingface.co/datasets/hermeschen1116/daily_dialog_for_RG) test split.

	test on config:
	```python
	generation_config = GenerationConfig(
	max_new_tokens=150,
	min_new_tokens=5,
	repetition_penalty=1.1,
	top_k=3,
	top_p=0.9,
	pad_token_id=tokenizer.pad_token_id,
	eos_token_id=tokenizer.eos_token_id,
	temperature=1.0,
	do_sample=True,
	num_beams=1
	)
	```
	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- beta=0.1,
	- remove_unused_columns=False,
	- num_train_epochs=3,
	- gradient_checkpointing=True

	others remain default

	### Framework versions

	- Bitsandbytes 0.43.1
	- Datasets 2.20.0
	- PEFT 0.11.1
	- Pytorch 2.3.0+cu121
	- Transformers 4.42.4
	- Tokenizers 0.19.1
	- Trl 0.8.6
	- unsloth 2024.7 0f2e484

	# Uploaded model

	- Developed by: Shotaro30678
	- Finetuned from model : hermeschen1116/response_generator_for_emotion_chat_bot

	This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

	# Quick sample
	```python
	# libs are from github repo
	from libs import ResponseGeneratorPipeline
	from unsloth import FastLanguageModel
	model, tokenizer = FastLanguageModel.from_pretrained(
	model_name = "Shotaro30678/response_generator_DPO", # YOUR MODEL YOU USED FOR TRAINING
	load_in_4bit = True,
	)
	FastLanguageModel.for_inference(model) # Enable native 2x faster inference

	bot = ResponseGeneratorPipeline(
	model,
	tokenizer,
	framework="pt",
	task="conversation-generation",
	num_workers=16,
	torch_dtype="auto",
	add_special_tokens=True,
	truncation=False,
	padding=True
	)

	conversation = [
	{'content': {'dialog': '', 'emotion': ''}, 'role': 'system'},
	{'content': {'dialog': 'Can you do push-ups ?', 'emotion': 'neutral'},
	'role': 'user'},
	{'content': {'dialog': "Of course I can . It's a piece of cake ! Believe it or not , I can do 30 push-ups a minute .",
	'emotion': 'neutral'},
	'role': 'assistant'},
	{'content': {'dialog': "Really ? I think that's impossible !",
	'emotion': 'surprise'},
	'role': 'user'},
	{'content': {'dialog': 'You mean 30 push-ups ?', 'emotion': 'neutral'},
	'role': 'assistant'},
	{'content': {'dialog': 'Yeah !', 'emotion': 'neutral'}, 'role': 'user'},
	{'content': {'dialog': '', 'emotion': 'neutral'}, 'role': 'assistant'}
	]

	generation_config = GenerationConfig(
	max_new_tokens=150,
	min_new_tokens=5,
	repetition_penalty=1.1,
	top_k=3,
	top_p=0.9,
	pad_token_id=tokenizer.pad_token_id,
	eos_token_id=tokenizer.eos_token_id,
	temperature=1.0,
	do_sample=True,
	num_beams=1
	)

	print(bot(conversation, generation_config=generation_config)[0]['generated_text'][-1]["content"]["dialog"])
	```
	output:
	```
	30 push-ups in a row?
	```