Q-bert committed
Commit 484b410 · verified · 1 Parent(s): 1112d61

Update README.md

Files changed (1)
  1. README.md +79 -45
README.md CHANGED
@@ -102,11 +102,33 @@ model-index:
  url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Mistral-Nemo-Instruct-2407
  name: Open LLM Leaderboard
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
  <details><summary>See axolotl config</summary>

  axolotl version: `0.4.1`
@@ -191,62 +213,74 @@ save_safetensors: true

  </details><br>

- # Humanish-Mistral-Nemo-Instruct-2407

- This model is a fine-tuned version of [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) on an unknown dataset.

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 2
- - eval_batch_size: 8
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 2
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 32
- - total_eval_batch_size: 16
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - training_steps: 341

- ### Training results

- ### Framework versions

- - PEFT 0.13.0
- - Transformers 4.45.1
- - Pytorch 2.3.1+cu121
- - Datasets 2.21.0
- - Tokenizers 0.20.0
- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_HumanLLMs__Humanish-Mistral-Nemo-Instruct-2407)

- | Metric             |Value|
- |-------------------|----:|
- |Avg.               |22.88|
- |IFEval (0-Shot)    |54.51|
- |BBH (3-Shot)       |32.71|
- |MATH Lvl 5 (4-Shot)| 7.63|
- |GPQA (0-shot)      | 5.03|
- |MuSR (0-shot)      | 9.40|
- |MMLU-PRO (5-shot)  |28.01|

+ <div align="center">
+ <img src="https://cdn-avatars.huggingface.co/v1/production/uploads/63da3d7ae697e5898cb86854/H-vpXOX6KZu01HnV87Jk5.jpeg" width="320" height="320" />
+ <h1>Enhancing Human-Like Responses in Large Language Models</h1>
+ </div>
+
+ <p align="center">
+ &nbsp;&nbsp; | 🤗 <a href="https://huggingface.co/collections/HumanLLMs/human-like-humanish-llms-6759fa68f22e11eb1a10967e">Models</a>&nbsp;&nbsp; |
+ &nbsp;&nbsp; 📊 <a href="https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset">Dataset</a>&nbsp;&nbsp; |
+ &nbsp;&nbsp; 📄 <a href="https://arxiv.org/abs/2501.05032">Paper</a>&nbsp;&nbsp; |
+ </p>
+
+ # 🚀 Human-Like-Mistral-Nemo-Instruct-2407
+
+ This model is a fine-tuned version of [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407), specifically optimized to generate more human-like and conversational responses.
+
+ The fine-tuning process employed both [Low-Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) and [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) to enhance natural language understanding, conversational coherence, and emotional intelligence in interactions.
+
+ The process of creating this model is detailed in the research paper [“Enhancing Human-Like Responses in Large Language Models”](https://arxiv.org/abs/2501.05032).
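+
+ As a rough illustration of how LoRA and DPO fit together, here is a minimal sketch using the `peft` and `trl` libraries. It is hypothetical and not the Axolotl recipe actually used for this model (see the training configuration and config below); the hyperparameters, the `split` name, and the older trl-style `DPOTrainer` constructor are assumptions.
+
+ ```python
+ # Illustrative LoRA + DPO sketch -- NOT the exact Axolotl setup used for this model.
+ # Assumes an older trl-style DPOTrainer constructor; newer trl versions move beta
+ # and related options into a DPOConfig object.
+ from datasets import load_dataset
+ from peft import LoraConfig
+ from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
+ from trl import DPOTrainer
+
+ model_name = "mistralai/Mistral-Nemo-Instruct-2407"
+ model = AutoModelForCausalLM.from_pretrained(model_name)
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # DPO trains on preference pairs: a prompt with a preferred (human-like) and a
+ # dispreferred (formal) response.
+ train_dataset = load_dataset("HumanLLMs/Human-Like-DPO-Dataset", split="train")
+
+ # LoRA freezes the base weights and trains small low-rank adapter matrices.
+ lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
+
+ trainer = DPOTrainer(
+     model=model,
+     ref_model=None,  # with a PEFT config, the frozen base model serves as the reference
+     beta=0.1,        # strength of the preference (KL) penalty
+     args=TrainingArguments(output_dir="human-like-dpo", per_device_train_batch_size=2),
+     train_dataset=train_dataset,
+     tokenizer=tokenizer,
+     peft_config=lora_config,
+ )
+ trainer.train()
+ ```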
+
+ # 🛠️ Training Configuration
+
+ - **Base Model:** Mistral-Nemo-Instruct-2407
+ - **Framework:** Axolotl v0.4.1
+ - **Hardware:** 2x NVIDIA A100 (80 GB) GPUs
+ - **Training Time:** ~3 hours 40 minutes
+ - **Dataset:** Synthetic dataset with ≈11,000 samples across 256 diverse topics

  <details><summary>See axolotl config</summary>

  axolotl version: `0.4.1`

  </details><br>

+ # 💬 Prompt Template
+
+ You can use the Mistral-Nemo prompt template with this model:
+
+ ### Mistral-Nemo
+
+ ```
+ <s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]
+ ```
+
+ This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the
+ `tokenizer.apply_chat_template()` method:
+
+ ```python
+ messages = [
+     {"role": "system", "content": "You are a helpful AI assistant."},
+     {"role": "user", "content": "Hello!"}
+ ]
+ gen_input = tokenizer.apply_chat_template(messages, return_tensors="pt")
+ model.generate(gen_input)
+ ```
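+
+ A fuller, self-contained sketch of loading the model and generating a reply is shown below; the sampling settings and `max_new_tokens` are illustrative choices rather than values prescribed by this card:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+
+ messages = [
+     {"role": "system", "content": "You are a helpful AI assistant."},
+     {"role": "user", "content": "Hello!"},
+ ]
+
+ # apply_chat_template renders the [INST] ... [/INST] format shown above.
+ input_ids = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
+ print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
+ ```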
+
+ # 🤖 Models
+
+ | Model | Download |
+ |:---------------------:|:-----------------------------------------------------------------------:|
+ | Human-Like-Llama-3-8B-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-LLama3-8B-Instruct) |
+ | Human-Like-Qwen-2.5-7B-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Qwen2.5-7B-Instruct) |
+ | Human-Like-Mistral-Nemo-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407) |
+
+ # 🎯 Benchmark Results
+
+ | **Group** | **Model** | **Average** | **IFEval** | **BBH** | **MATH Lvl 5** | **GPQA** | **MuSR** | **MMLU-PRO** |
+ |--------------------------------|--------------------------------|-------------|------------|---------|----------------|----------|----------|--------------|
+ | **Llama Models** | Human-Like-Llama-3-8B-Instruct | 22.37 | **64.97** | 28.01 | 8.45 | 0.78 | **2.00** | 30.01 |
+ | | Llama-3-8B-Instruct | 23.57 | 74.08 | 28.24 | 8.68 | 1.23 | 1.60 | 29.60 |
+ | | *Difference (Human-Like)* | -1.20 | **-9.11** | -0.23 | -0.23 | -0.45 | +0.40 | +0.41 |
+ | **Qwen Models** | Human-Like-Qwen-2.5-7B-Instruct | 26.66 | 72.84 | 34.48 | 0.00 | 6.49 | 8.42 | 37.76 |
+ | | Qwen-2.5-7B-Instruct | 26.86 | 75.85 | 34.89 | 0.00 | 5.48 | 8.45 | 36.52 |
+ | | *Difference (Human-Like)* | -0.20 | -3.01 | -0.41 | 0.00 | **+1.01** | -0.03 | **+1.24** |
+ | **Mistral Models** | Human-Like-Mistral-Nemo-Instruct | 22.88 | **54.51** | 32.70 | 7.62 | 5.03 | 9.39 | 28.00 |
+ | | Mistral-Nemo-Instruct | 23.53 | 63.80 | 29.68 | 5.89 | 5.37 | 8.48 | 27.97 |
+ | | *Difference (Human-Like)* | -0.65 | **-9.29** | **+3.02** | **+1.73** | -0.34 | +0.91 | +0.03 |
+
+ # 📊 Dataset
+
+ The dataset used for fine-tuning was generated with LLaMA 3 models. It includes 10,884 samples across 256 distinct topics, such as technology, daily life, science, history, and arts. Each sample consists of:
+
+ - **Human-like responses:** Natural, conversational answers mimicking human dialogue.
+ - **Formal responses:** Structured and precise answers with a more formal tone.
+
+ The dataset has been open-sourced and is available at:
+
+ - 👉 [Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset)
+
+ More details on the dataset creation process can be found in the accompanying research paper.
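+
+ A quick way to inspect these preference pairs is to load the dataset with the 🤗 `datasets` library; the column names mentioned in the comments are typical for DPO-style data and should be checked against the dataset card:
+
+ ```python
+ from datasets import load_dataset
+
+ # Load the open-sourced preference dataset used for DPO fine-tuning.
+ dataset = load_dataset("HumanLLMs/Human-Like-DPO-Dataset", split="train")
+
+ print(dataset)     # number of rows and column names
+ print(dataset[0])  # one sample, e.g. a prompt with a human-like ("chosen") and a formal ("rejected") response
+ ```
+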
+ # 📝 Citation
+
+ ```
+ @misc{çalık2025enhancinghumanlikeresponseslarge,
+       title={Enhancing Human-Like Responses in Large Language Models},
+       author={Ethem Yağız Çalık and Talha Rüzgar Akkuş},
+       year={2025},
+       eprint={2501.05032},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2501.05032},
+ }
+ ```