url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Mistral-Nemo-Instruct-2407
name: Open LLM Leaderboard
---

<div align="center">
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/63da3d7ae697e5898cb86854/H-vpXOX6KZu01HnV87Jk5.jpeg" width="320" height="320" />
<h1>Enhancing Human-Like Responses in Large Language Models</h1>
</div>

<p align="center">
   | 🤗 <a href="https://huggingface.co/collections/HumanLLMs/human-like-humanish-llms-6759fa68f22e11eb1a10967e">Models</a>   |
   📊 <a href="https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset">Dataset</a>   |
   📄 <a href="https://arxiv.org/abs/2501.05032">Paper</a>   |
</p>

# 🚀 Human-Like-Mistral-Nemo-Instruct-2407

This model is a fine-tuned version of [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407), specifically optimized to generate more human-like and conversational responses.

The fine-tuning process employed both [Low-Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) and [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) to enhance natural language understanding, conversational coherence, and emotional intelligence in interactions.
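DPO optimizes a simple classification-style loss over preference pairs. As a rough intuition, the per-pair objective can be sketched in pure Python (an illustration of the published DPO formula, not the training code used for this model):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO objective (Rafailov et al., 2023), pure-Python sketch.

    Each argument is the summed log-probability of a whole response under
    the trained policy or the frozen reference model.
    """
    # Implicit reward of a response: beta * log(pi_theta / pi_ref).
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # -log(sigmoid(margin)): shrinks as the chosen response outscores the rejected one.
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy prefers the chosen (human-like) response more strongly than the reference does, the margin is positive and the loss falls below log 2; the `beta` value here is illustrative.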

The process of creating these models is detailed in the research paper [“Enhancing Human-Like Responses in Large Language Models”](https://arxiv.org/abs/2501.05032).

# 🛠️ Training Configuration

- **Base Model:** Mistral-Nemo-Instruct-2407
- **Framework:** Axolotl v0.4.1
- **Hardware:** 2x NVIDIA A100 (80 GB) GPUs
- **Training Time:** ~3 hours 40 minutes
- **Dataset:** Synthetic dataset with ≈11,000 samples across 256 diverse topics

The following hyperparameters were used during training:

- learning_rate: 0.0002
- train_batch_size: 2 (per device; with 2 GPUs and gradient_accumulation_steps 8, total_train_batch_size is 32)
- eval_batch_size: 8 (total_eval_batch_size: 16)
- seed: 42
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine (warmup_steps: 10)
- training_steps: 341

Framework versions: Transformers 4.45.1, PyTorch 2.3.1+cu121, Datasets 2.21.0, Tokenizers 0.20.0.
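The LoRA side of this setup can be pictured with a tiny pure-Python sketch: the frozen weight matrix is left untouched, and only a low-rank correction `B @ A @ x` is trained (illustration only; Axolotl/PEFT implement this internally):

```python
def lora_delta(x, A, B, scale=1.0):
    """Compute the LoRA correction scale * (B @ (A @ x)) for one input vector.

    LoRA freezes the base weight W and trains only A (r x d_in) and
    B (d_out x r) with small rank r, so the adapted layer computes
    W @ x + scale * B @ A @ x.
    """
    # A @ x: project the input down to rank r.
    h = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    # B @ h: project back up to d_out, then scale.
    return [scale * sum(b_ij * h_j for b_ij, h_j in zip(row, h)) for row in B]
```

Because `B` is zero-initialized in standard LoRA, the correction starts at exactly zero, so fine-tuning begins from the unmodified base model.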

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)

<details><summary>See axolotl config</summary>

axolotl version: `0.4.1`

</details><br>
# 💬 Prompt Template

You can use the Mistral-Nemo prompt template when prompting the model:

### Mistral-Nemo

```
<s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]
```
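For clarity, the template above can be assembled by hand; a small sketch (illustrative only — in practice, prefer `tokenizer.apply_chat_template()` as shown below):

```python
def build_mistral_prompt(turns):
    """Assemble the Mistral-Nemo [INST] chat format by hand.

    `turns` is a list of (user_message, assistant_reply_or_None) pairs;
    the final turn typically has no reply yet, ending the prompt at [/INST].
    """
    prompt = "<s>"
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            # Completed assistant turns are closed with </s> and a space.
            prompt += f"{assistant}</s> "
    return prompt
```

Feeding in the two turns from the example reproduces the template string above exactly.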

This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the `tokenizer.apply_chat_template()` method:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"}
]
gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt")
model.generate(**gen_input)
```

# 🤖 Models

| Model | Download |
|:---------------------:|:-----------------------------------------------------------------------:|
| Human-Like-Llama-3-8B-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-LLama3-8B-Instruct) |
| Human-Like-Qwen-2.5-7B-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Qwen2.5-7B-Instruct) |
| Human-Like-Mistral-Nemo-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407) |

# 🎯 Benchmark Results

| **Group** | **Model** | **Average** | **IFEval** | **BBH** | **MATH Lvl 5** | **GPQA** | **MuSR** | **MMLU-PRO** |
|--------------------------------|--------------------------------|-------------|------------|---------|----------------|----------|----------|--------------|
| **Llama Models** | Human-Like-Llama-3-8B-Instruct | 22.37 | **64.97** | 28.01 | 8.45 | 0.78 | **2.00** | 30.01 |
| | Llama-3-8B-Instruct | 23.57 | 74.08 | 28.24 | 8.68 | 1.23 | 1.60 | 29.60 |
| | *Difference (Human-Like)* | -1.20 | **-9.11** | -0.23 | -0.23 | -0.45 | +0.40 | +0.41 |
| **Qwen Models** | Human-Like-Qwen-2.5-7B-Instruct | 26.66 | 72.84 | 34.48 | 0.00 | 6.49 | 8.42 | 37.76 |
| | Qwen-2.5-7B-Instruct | 26.86 | 75.85 | 34.89 | 0.00 | 5.48 | 8.45 | 36.52 |
| | *Difference (Human-Like)* | -0.20 | -3.01 | -0.41 | 0.00 | **+1.01** | -0.03 | **+1.24** |
| **Mistral Models** | Human-Like-Mistral-Nemo-Instruct | 22.88 | 54.51 | 32.70 | 7.62 | 5.03 | 9.39 | 28.00 |
| | Mistral-Nemo-Instruct | 23.53 | 63.80 | 29.68 | 5.89 | 5.37 | 8.48 | 27.97 |
| | *Difference (Human-Like)* | -0.65 | **-9.29** | **+3.02** | **+1.73** | -0.34 | +0.91 | +0.03 |

Detailed results for this model can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_HumanLLMs__Humanish-Mistral-Nemo-Instruct-2407).
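The *Difference (Human-Like)* rows are simply the fine-tuned score minus the base-model score, per benchmark; for example, the Mistral row can be reproduced with:

```python
# Scores copied from the two Mistral rows of the table above.
human_like = {"Average": 22.88, "IFEval": 54.51, "BBH": 32.70, "MATH Lvl 5": 7.62,
              "GPQA": 5.03, "MuSR": 9.39, "MMLU-PRO": 28.00}
base = {"Average": 23.53, "IFEval": 63.80, "BBH": 29.68, "MATH Lvl 5": 5.89,
        "GPQA": 5.37, "MuSR": 8.48, "MMLU-PRO": 27.97}

# Difference row: fine-tuned minus base, rounded to two decimals.
diff = {name: round(human_like[name] - base[name], 2) for name in human_like}
```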

# 📊 Dataset

The dataset used for fine-tuning was generated using LLaMA 3 models. It includes 10,884 samples across 256 distinct topics, such as technology, daily life, science, history, and the arts. Each sample consists of:

- **Human-like responses:** Natural, conversational answers mimicking human dialogue.
- **Formal responses:** Structured and precise answers with a more formal tone.
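For illustration, one record might be organized like this (field names and example strings are assumptions following common DPO conventions, not verified against the released dataset):

```python
# Hypothetical layout of one preference record: the human-like answer is the
# "chosen" side, the formal answer the "rejected" side.
sample = {
    "prompt": "What do you think about rainy days?",
    "chosen": "Honestly, I kind of love them - perfect excuse to stay in with a book!",
    "rejected": "Rainy days are a meteorological phenomenon caused by precipitation.",
}

def to_dpo_pair(record):
    """Split a record into the two (prompt, completion) pairs DPO compares."""
    return (record["prompt"], record["chosen"]), (record["prompt"], record["rejected"])

preferred, dispreferred = to_dpo_pair(sample)
```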

The dataset has been open-sourced and is available at:

- 👉 [Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset)

More details on the dataset creation process can be found in the accompanying research paper.

# 📝 Citation

```
@misc{çalık2025enhancinghumanlikeresponseslarge,
      title={Enhancing Human-Like Responses in Large Language Models},
      author={Ethem Yağız Çalık and Talha Rüzgar Akkuş},
      year={2025},
      eprint={2501.05032},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.05032},
}
```