Q-bert committed
Commit 484b410 · verified · 1 Parent(s): 1112d61

Update README.md

Files changed (1)
  1. README.md +79 -45
README.md CHANGED
@@ -102,11 +102,33 @@ model-index:
  url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Mistral-Nemo-Instruct-2407
  name: Open LLM Leaderboard
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
  <details><summary>See axolotl config</summary>

  axolotl version: `0.4.1`
@@ -191,62 +213,74 @@ save_safetensors: true

  </details><br>

- # Humanish-Mistral-Nemo-Instruct-2407

- This model is a fine-tuned version of [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) on an unknown dataset.

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 2
- - eval_batch_size: 8
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 2
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 32
- - total_eval_batch_size: 16
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - training_steps: 341

- ### Training results

- ### Framework versions

- - PEFT 0.13.0
- - Transformers 4.45.1
- - Pytorch 2.3.1+cu121
- - Datasets 2.21.0
- - Tokenizers 0.20.0
- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_HumanLLMs__Humanish-Mistral-Nemo-Instruct-2407)

- | Metric             |Value|
- |-------------------|----:|
- |Avg.               |22.88|
- |IFEval (0-Shot)    |54.51|
- |BBH (3-Shot)       |32.71|
- |MATH Lvl 5 (4-Shot)| 7.63|
- |GPQA (0-shot)      | 5.03|
- |MuSR (0-shot)      | 9.40|
- |MMLU-PRO (5-shot)  |28.01|

+ <div align="center">
+ <img src="https://cdn-avatars.huggingface.co/v1/production/uploads/63da3d7ae697e5898cb86854/H-vpXOX6KZu01HnV87Jk5.jpeg" width="320" height="320" />
+ <h1>Enhancing Human-Like Responses in Large Language Models</h1>
+ </div>
+
+ <p align="center">
+ &nbsp;&nbsp; | 🤗 <a href="https://huggingface.co/collections/HumanLLMs/human-like-humanish-llms-6759fa68f22e11eb1a10967e">Models</a>&nbsp;&nbsp; |
+ &nbsp;&nbsp; 📊 <a href="https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset">Dataset</a>&nbsp;&nbsp; |
+ &nbsp;&nbsp; 📄 <a href="https://arxiv.org/abs/2501.05032">Paper</a>&nbsp;&nbsp; |
+ </p>
+
+ # 🚀 Human-Like-Mistral-Nemo-Instruct-2407
+
+ This model is a fine-tuned version of [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407), specifically optimized to generate more human-like and conversational responses.
+
+ The fine-tuning process employed both [Low-Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) and [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) to enhance natural language understanding, conversational coherence, and emotional intelligence in interactions.
+
+ The process of creating this model is detailed in the research paper [“Enhancing Human-Like Responses in Large Language Models”](https://arxiv.org/abs/2501.05032).
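+
+ As a rough illustration of how LoRA and DPO fit together, here is a minimal sketch using the `peft` and `trl` libraries. It is hypothetical and not the Axolotl recipe actually used for this model (see the training configuration and config below); the hyperparameters, the `split` name, and the older trl-style `DPOTrainer` constructor are assumptions.
+
+ ```python
+ # Illustrative LoRA + DPO sketch -- NOT the exact Axolotl setup used for this model.
+ # Assumes an older trl-style DPOTrainer constructor; newer trl versions move beta
+ # and related options into a DPOConfig object.
+ from datasets import load_dataset
+ from peft import LoraConfig
+ from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
+ from trl import DPOTrainer
+
+ model_name = "mistralai/Mistral-Nemo-Instruct-2407"
+ model = AutoModelForCausalLM.from_pretrained(model_name)
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # DPO trains on preference pairs: a prompt with a preferred (human-like) and a
+ # dispreferred (formal) response.
+ train_dataset = load_dataset("HumanLLMs/Human-Like-DPO-Dataset", split="train")
+
+ # LoRA freezes the base weights and trains small low-rank adapter matrices.
+ lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
+
+ trainer = DPOTrainer(
+     model=model,
+     ref_model=None,  # with a PEFT config, the frozen base model serves as the reference
+     beta=0.1,        # strength of the preference (KL) penalty
+     args=TrainingArguments(output_dir="human-like-dpo", per_device_train_batch_size=2),
+     train_dataset=train_dataset,
+     tokenizer=tokenizer,
+     peft_config=lora_config,
+ )
+ trainer.train()
+ ```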
+
+ # 🛠️ Training Configuration
+
+ - **Base Model:** Mistral-Nemo-Instruct-2407
+ - **Framework:** Axolotl v0.4.1
+ - **Hardware:** 2x NVIDIA A100 (80 GB) GPUs
+ - **Training Time:** ~3 hours 40 minutes
+ - **Dataset:** Synthetic dataset with ≈11,000 samples across 256 diverse topics

  <details><summary>See axolotl config</summary>

  axolotl version: `0.4.1`

  </details><br>

+ # 💬 Prompt Template
+
+ You can use the Mistral-Nemo prompt template with this model:
+
+ ### Mistral-Nemo
+
+ ```
+ <s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]
+ ```
+
+ This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the
+ `tokenizer.apply_chat_template()` method:
+
+ ```python
+ messages = [
+     {"role": "system", "content": "You are a helpful AI assistant."},
+     {"role": "user", "content": "Hello!"}
+ ]
+ gen_input = tokenizer.apply_chat_template(messages, return_tensors="pt")
+ model.generate(gen_input)
+ ```
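+
+ A fuller, self-contained sketch of loading the model and generating a reply is shown below; the sampling settings and `max_new_tokens` are illustrative choices rather than values prescribed by this card:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+
+ messages = [
+     {"role": "system", "content": "You are a helpful AI assistant."},
+     {"role": "user", "content": "Hello!"},
+ ]
+
+ # apply_chat_template renders the [INST] ... [/INST] format shown above.
+ input_ids = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
+ print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
+ ```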
+
+ # 🤖 Models
+
+ | Model | Download |
+ |:---------------------:|:-----------------------------------------------------------------------:|
+ | Human-Like-Llama-3-8B-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-LLama3-8B-Instruct) |
+ | Human-Like-Qwen-2.5-7B-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Qwen2.5-7B-Instruct) |
+ | Human-Like-Mistral-Nemo-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407) |
+
+ # 🎯 Benchmark Results
+
+ | **Group** | **Model** | **Average** | **IFEval** | **BBH** | **MATH Lvl 5** | **GPQA** | **MuSR** | **MMLU-PRO** |
+ |--------------------------------|--------------------------------|-------------|------------|---------|----------------|----------|----------|--------------|
+ | **Llama Models** | Human-Like-Llama-3-8B-Instruct | 22.37 | **64.97** | 28.01 | 8.45 | 0.78 | **2.00** | 30.01 |
+ | | Llama-3-8B-Instruct | 23.57 | 74.08 | 28.24 | 8.68 | 1.23 | 1.60 | 29.60 |
+ | | *Difference (Human-Like)* | -1.20 | **-9.11** | -0.23 | -0.23 | -0.45 | +0.40 | +0.41 |
+ | **Qwen Models** | Human-Like-Qwen-2.5-7B-Instruct | 26.66 | 72.84 | 34.48 | 0.00 | 6.49 | 8.42 | 37.76 |
+ | | Qwen-2.5-7B-Instruct | 26.86 | 75.85 | 34.89 | 0.00 | 5.48 | 8.45 | 36.52 |
+ | | *Difference (Human-Like)* | -0.20 | -3.01 | -0.41 | 0.00 | **+1.01** | -0.03 | **+1.24** |
+ | **Mistral Models** | Human-Like-Mistral-Nemo-Instruct | 22.88 | **54.51** | 32.70 | 7.62 | 5.03 | 9.39 | 28.00 |
+ | | Mistral-Nemo-Instruct | 23.53 | 63.80 | 29.68 | 5.89 | 5.37 | 8.48 | 27.97 |
+ | | *Difference (Human-Like)* | -0.65 | **-9.29** | **+3.02** | **+1.73** | -0.34 | +0.91 | +0.03 |
+
+ # 📊 Dataset
+
+ The dataset used for fine-tuning was generated with LLaMA 3 models. It includes 10,884 samples across 256 distinct topics, such as technology, daily life, science, history, and arts. Each sample consists of:
+
+ - **Human-like responses:** Natural, conversational answers mimicking human dialogue.
+ - **Formal responses:** Structured and precise answers with a more formal tone.
+
+ The dataset has been open-sourced and is available at:
+
+ - 👉 [Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset)
+
+ More details on the dataset creation process can be found in the accompanying research paper.
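+
+ A quick way to inspect these preference pairs is to load the dataset with the 🤗 `datasets` library; the column names mentioned in the comments are typical for DPO-style data and should be checked against the dataset card:
+
+ ```python
+ from datasets import load_dataset
+
+ # Load the open-sourced preference dataset used for DPO fine-tuning.
+ dataset = load_dataset("HumanLLMs/Human-Like-DPO-Dataset", split="train")
+
+ print(dataset)     # number of rows and column names
+ print(dataset[0])  # one sample, e.g. a prompt with a human-like ("chosen") and a formal ("rejected") response
+ ```
+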
+ # 📝 Citation
+
+ ```
+ @misc{çalık2025enhancinghumanlikeresponseslarge,
+       title={Enhancing Human-Like Responses in Large Language Models},
+       author={Ethem Yağız Çalık and Talha Rüzgar Akkuş},
+       year={2025},
+       eprint={2501.05032},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2501.05032},
+ }
+ ```