|
---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- gguf
base_model: unsloth/phi-4-unsloth-bnb-4bit
datasets:
- bespokelabs/Bespoke-Stratos-17k
- bespokelabs/Bespoke-Stratos-35k
- NovaSky-AI/Sky-T1_data_17k
- Quazim0t0/BenfordsLawReasoningJSON
- open-thoughts/OpenThoughts-114k
model-index:
- name: Phi4.Turn.R1Distill_v1.5.1-Tensors
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 29.95
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Quazim0t0/Phi4.Turn.R1Distill_v1.5.1-Tensors
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 49.22
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Quazim0t0/Phi4.Turn.R1Distill_v1.5.1-Tensors
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 1.59
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Quazim0t0/Phi4.Turn.R1Distill_v1.5.1-Tensors
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 2.46
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Quazim0t0/Phi4.Turn.R1Distill_v1.5.1-Tensors
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 7.04
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Quazim0t0/Phi4.Turn.R1Distill_v1.5.1-Tensors
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 45.75
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Quazim0t0/Phi4.Turn.R1Distill_v1.5.1-Tensors
      name: Open LLM Leaderboard
---
|
|
|
# TurnPhi Project |
|
|
|
- **Developed by:** Quazim0t0 |
|
- **Fine-tuned from model:** unsloth/phi-4-unsloth-bnb-4bit
|
- **Format:** GGUF
|
- **Trained for 8 hours on an A800 with the Bespoke-Stratos-17k dataset.**

- **Trained for 6 hours on an A800 with the Bespoke-Stratos-35k dataset.**

- **Trained for 2 hours on an A800 with the small 430-row Benford's Law Reasoning dataset, with care taken to avoid overfitting.**

- **Trained for 4 hours on an A800 with the Sky-T1_data_17k dataset.**

- **Trained for 6 hours on an A800 with the OpenThoughts-114k dataset.**

- **About $18 in total training cost... I'm actually amazed by the results.**
|
|
|
# OpenWeb UI Function |
|
If you use this model with Open WebUI, here is a simple function that organizes the model's responses: https://openwebui.com/f/quaz93/phi4_turn_r1_distill_thought_function_v1
|
|
|
# Phi4 Turn R1Distill LoRA Adapters |
|
|
|
## Overview |
|
These **LoRA adapters** were trained using diverse **reasoning datasets** that incorporate structured **Thought** and **Solution** responses to enhance logical inference. This project was designed to **test the R1 dataset** on **Phi-4**, aiming to create a **lightweight, fast, and efficient reasoning model**. |
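
For illustration, here is a minimal sketch of splitting such a structured response into its reasoning and answer parts. The `<Thought>`/`<Solution>` tag names are an assumption for demonstration; the actual datasets and the Open WebUI function linked above may use different markers:

```python
import re

def split_response(text: str) -> dict:
    """Separate the reasoning trace from the final answer.

    Assumes <Thought>...</Thought> and <Solution>...</Solution> markers;
    the tag names are illustrative, not a documented format.
    """
    thought = re.search(r"<Thought>(.*?)</Thought>", text, re.DOTALL)
    solution = re.search(r"<Solution>(.*?)</Solution>", text, re.DOTALL)
    return {
        "thought": thought.group(1).strip() if thought else "",
        "solution": solution.group(1).strip() if solution else text.strip(),
    }

example = "<Thought>12 * 13 = 120 + 36 = 156</Thought><Solution>156</Solution>"
print(split_response(example))
```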
|
|
|
All adapters were fine-tuned on an **NVIDIA A800 GPU** and are suitable for continued training, merging, or direct deployment.
|
As part of an open-source initiative, all resources are made **publicly available** for unrestricted research and development. |
|
|
|
--- |
|
|
|
## LoRA Adapters |
|
Below are the currently available LoRA fine-tuned adapters (**as of January 30, 2025**): |
|
|
|
- [Phi4.Turn.R1Distill-Lora1](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora1) |
|
- [Phi4.Turn.R1Distill-Lora2](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora2) |
|
- [Phi4.Turn.R1Distill-Lora3](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora3) |
|
- [Phi4.Turn.R1Distill-Lora4](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora4) |
|
- [Phi4.Turn.R1Distill-Lora5](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora5) |
|
- [Phi4.Turn.R1Distill-Lora6](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora6) |
|
- [Phi4.Turn.R1Distill-Lora7](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora7) |
|
- [Phi4.Turn.R1Distill-Lora8](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora8) |
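
Since these are standard PEFT adapters, any of them can be merged into the base weights to produce a standalone checkpoint for deployment or further fine-tuning. A minimal sketch using `peft` (the adapter choice and output path are illustrative):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model and apply one of the adapters listed above
base = AutoModelForCausalLM.from_pretrained("microsoft/phi-4")
model = PeftModel.from_pretrained(base, "Quazim0t0/Phi4.Turn.R1Distill-Lora1")

# Fold the LoRA weights into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("phi4-turn-r1distill-merged")
```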
|
|
|
--- |
|
|
|
## GGUF Full & Quantized Models |
|
To facilitate broader testing and real-world inference, **full-precision and quantized GGUF versions** are provided for evaluation in **Open WebUI** and other LLM interfaces.
|
|
|
### **Version 1** |
|
- [Phi4.Turn.R1Distill.Q8_0](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill.Q8_0) |
|
- [Phi4.Turn.R1Distill.Q4_k](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill.Q4_k) |
|
- [Phi4.Turn.R1Distill.16bit](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill.16bit) |
|
|
|
### **Version 1.1** |
|
- [Phi4.Turn.R1Distill_v1.1_Q4_k](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill_v1.1_Q4_k) |
|
|
|
### **Version 1.2** |
|
- [Phi4.Turn.R1Distill_v1.2_Q4_k](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill_v1.2_Q4_k) |
|
|
|
### **Version 1.3** |
|
- [Phi4.Turn.R1Distill_v1.3_Q4_k-GGUF](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill_v1.3_Q4_k-GGUF) |
|
|
|
### **Version 1.4** |
|
- [Phi4.Turn.R1Distill_v1.4_Q4_k-GGUF](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill_v1.4_Q4_k-GGUF) |
|
|
|
### **Version 1.5** |
|
- [Phi4.Turn.R1Distill_v1.5_Q4_k-GGUF](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill_v1.5_Q4_k-GGUF) |
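
These GGUF files can be run directly with llama.cpp-based tooling. A minimal sketch using `llama-cpp-python`, assuming a quantized file is downloaded from one of the repositories above (the repo ID and filename below are illustrative; check the repository's file list for the exact name):

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download a quantized GGUF file from the Hub (filename is an assumption;
# check the repository for the actual file name)
model_path = hf_hub_download(
    repo_id="Quazim0t0/Phi4.Turn.R1Distill_v1.5_Q4_k-GGUF",
    filename="phi4.turn.r1distill_v1.5_q4_k.gguf",
)

# Load the model and run a short completion
llm = Llama(model_path=model_path, n_ctx=4096)
out = llm("Explain Benford's Law in one paragraph.", max_tokens=256)
print(out["choices"][0]["text"])
```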
|
|
|
--- |
|
|
|
## Usage |
|
|
|
### **Loading LoRA Adapters with `transformers` and `peft`** |
|
To load and apply the LoRA adapters on Phi-4, use the following approach: |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Base model and LoRA adapter repositories on the Hugging Face Hub
base_model = "microsoft/phi-4"
lora_adapter = "Quazim0t0/Phi4.Turn.R1Distill-Lora1"

# Load the tokenizer and base model, then apply the adapter weights
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, lora_adapter)

# Switch to inference mode
model.eval()
```
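
Once loaded, a short generation pass is a quick way to sanity-check the adapter. This is a minimal sketch building on the snippet above (the prompt and generation settings are illustrative):

```python
prompt = "Solve step by step: what is 12 * 13?"

# Tokenize, generate, and decode; settings are illustrative defaults
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```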
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) |
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Quazim0t0__Phi4.Turn.R1Distill_v1.5.1-Tensors-details) |
|
|
|
| Metric              | Value |
|---------------------|------:|
| Avg.                | 22.67 |
| IFEval (0-Shot)     | 29.95 |
| BBH (3-Shot)        | 49.22 |
| MATH Lvl 5 (4-Shot) |  1.59 |
| GPQA (0-shot)       |  2.46 |
| MuSR (0-shot)       |  7.04 |
| MMLU-PRO (5-shot)   | 45.75 |
|
|
|
|