Mistral-7B-v0.2-OpenHermes


SFT Training Params:

  • Learning Rate: 2e-4
  • Batch Size: 8
  • Gradient Accumulation steps: 4
  • Dataset: teknium/OpenHermes-2.5 (200k split; slightly biased toward role-play and theory-of-life content)
  • LoRA r (rank): 16
  • LoRA Alpha: 16
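
As a quick sanity check on the parameters above, the sketch below derives two quantities they imply: the effective batch size per optimizer step and the LoRA scaling factor (alpha / r). The single-GPU count is an assumption based on the "A100" note, and the variable names are illustrative rather than taken from the actual training script.

```python
# Hyperparameters as listed in the training params.
learning_rate = 2e-4
per_device_batch_size = 8
gradient_accumulation_steps = 4
lora_r = 16
lora_alpha = 16

# Assumption: a single GPU was used (the card mentions one A100).
num_gpus = 1

# Samples contributing to each optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_r

print(effective_batch_size)  # 32
print(lora_scaling)          # 1.0
```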

Training Time: 13 hours on an A100

This model is proficient in retrieval-augmented generation (RAG) use cases.

For best results, consider additional RAG-specific fine-tuning on data from your own use case.
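
One common way to use a chat model for RAG is to place the retrieved passages in the system message and ask the model to answer only from them. The sketch below illustrates that pattern; `build_rag_messages` is a hypothetical helper and the instruction wording is an assumption, not the prompt format this model was trained on.

```python
def build_rag_messages(question, passages):
    """Return chat messages that ground the answer in retrieved passages."""
    # Number each passage so the model (and the reader) can refer back to it.
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    system = (
        "You are a helpful assistant. Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]


messages = build_rag_messages(
    "What is the capital of France?",
    ["Paris is the capital and largest city of France."],
)
```

The resulting list can be passed as `messages` to any OpenAI-compatible chat completions endpoint.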

Prompt Template: ChatML

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What's the capital of France?<|im_end|>
<|im_start|>assistant
Paris.
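
To make the template concrete, here is a minimal sketch that renders a list of chat messages into the ChatML layout shown above. `to_chatml` is a hypothetical helper written for illustration; serving stacks usually apply the template for you.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render role/content messages as a ChatML prompt string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the capital of France?"},
    ]
)
print(prompt)
```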

Run easily with Ollama

ollama run macadeliccc/mistral-7b-v2-openhermes
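
Once the model is available in Ollama, it can also be queried programmatically. The sketch below uses only the standard library and assumes Ollama is serving on its default local address (`http://localhost:11434`) with the model already pulled; `build_chat_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt, model="macadeliccc/mistral-7b-v2-openhermes"):
    """Build a non-streaming chat request for Ollama's /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON response instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the prompt and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        return json.loads(resp.read())["message"]["content"]


# Requires a running Ollama server:
# print(ask("What's the capital of France?"))
```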

OpenAI compatible server with vLLM

Installation instructions for vLLM can be found in the vLLM documentation.

# --gpu-memory-utilization can go as low as 0.83-0.85 if you need a little more GPU memory for your application.
# --max-model-len can be raised to 32000 if your hardware can handle it; 16000 works on a 4090.
python -m vllm.entrypoints.openai.api_server \
--model macadeliccc/Mistral-7B-v0.2-OpenHermes \
--gpu-memory-utilization 0.9 \
--max-model-len 16000 \
--chat-template ./examples/template_chatml.jinja

Gradio chatbot interface for your endpoint

import gradio as gr
from openai import OpenAI

# Modify these variables as needed
openai_api_key = "EMPTY"  # Assuming no API key is required for local testing
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
system_message = "You are a helpful assistant"

def fast_echo(message, history):
    # Send only the latest user message to the vLLM server and return its reply.
    # `history` is unused, so each turn is answered without earlier context.
    chat_response = client.chat.completions.create(
        model="macadeliccc/Mistral-7B-v0.2-OpenHermes",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": message},
        ]
    )
    print(chat_response)
    return chat_response.choices[0].message.content

demo = gr.ChatInterface(fn=fast_echo, examples=["Write me a quicksort algorithm in python."]).queue()

if __name__ == "__main__":
    demo.launch()
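
The `fast_echo` function above sends only the latest message, so the model never sees earlier turns. If you want multi-turn context, the chat history can be converted into OpenAI-style messages first. The sketch below is one way to do that; `history_to_messages` is a hypothetical helper, and it accepts both history shapes Gradio may pass depending on version and the `type` setting of `ChatInterface` (pairs of `[user, assistant]` or role/content dicts).

```python
def history_to_messages(history, system_message="You are a helpful assistant"):
    """Convert Gradio chat history into OpenAI-style chat messages."""
    messages = [{"role": "system", "content": system_message}]
    for turn in history:
        if isinstance(turn, dict):
            # "messages"-style history: already role/content dicts.
            messages.append({"role": turn["role"], "content": turn["content"]})
        else:
            # Pair-style history: [user_text, assistant_text].
            user_text, assistant_text = turn
            messages.append({"role": "user", "content": user_text})
            if assistant_text is not None:
                messages.append({"role": "assistant", "content": assistant_text})
    return messages


# Example: build the full message list for the next request.
history = [["Hi", "Hello! How can I help?"]]
messages = history_to_messages(history) + [
    {"role": "user", "content": "Write me a quicksort algorithm in python."}
]
```

Passing `messages` to `client.chat.completions.create` in place of the two-item list would give the model the whole conversation.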

Quantizations

  • GGUF
  • AWQ
  • HQQ-4bit
  • ExLlamaV2

Evaluations

Thanks to Maxime Labonne for the evaluation:

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---:|---:|---:|---:|---:|
| Mistral-7B-v0.2-OpenHermes | 35.57 | 67.15 | 42.06 | 36.27 | 45.26 |

AGIEval

| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---|
| agieval_aqua_rat | 0 | acc | 24.02 | ± 2.69 |
| | | acc_norm | 21.65 | ± 2.59 |
| agieval_logiqa_en | 0 | acc | 28.11 | ± 1.76 |
| | | acc_norm | 34.56 | ± 1.87 |
| agieval_lsat_ar | 0 | acc | 27.83 | ± 2.96 |
| | | acc_norm | 23.48 | ± 2.80 |
| agieval_lsat_lr | 0 | acc | 33.73 | ± 2.10 |
| | | acc_norm | 33.14 | ± 2.09 |
| agieval_lsat_rc | 0 | acc | 48.70 | ± 3.05 |
| | | acc_norm | 39.78 | ± 2.99 |
| agieval_sat_en | 0 | acc | 67.48 | ± 3.27 |
| | | acc_norm | 64.56 | ± 3.34 |
| agieval_sat_en_without_passage | 0 | acc | 38.83 | ± 3.40 |
| | | acc_norm | 37.38 | ± 3.38 |
| agieval_sat_math | 0 | acc | 32.27 | ± 3.16 |
| | | acc_norm | 30.00 | ± 3.10 |

Average: 35.57%
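
The reported AGIEval average matches the mean of the acc_norm values in the table. This is inferred from the numbers, not stated by the evaluation itself; the sketch below reproduces the arithmetic.

```python
# acc_norm values from the AGIEval table, in task order.
agieval_acc_norm = [21.65, 34.56, 23.48, 33.14, 39.78, 64.56, 37.38, 30.00]

# Unweighted mean across the eight tasks.
agieval_average = sum(agieval_acc_norm) / len(agieval_acc_norm)

print(f"{agieval_average:.2f}")  # 35.57
```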

GPT4All

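| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---|
| arc_challenge | 0 | acc | 45.05 | ± 1.45 |
| | | acc_norm | 48.46 | ± 1.46 |
| arc_easy | 0 | acc | 77.27 | ± 0.86 |
| | | acc_norm | 73.78 | ± 0.90 |
| boolq | 1 | acc | 68.62 | ± 0.81 |
| hellaswag | 0 | acc | 59.63 | ± 0.49 |
| | | acc_norm | 79.66 | ± 0.40 |
| openbookqa | 0 | acc | 31.40 | ± 2.08 |
| | | acc_norm | 43.40 | ± 2.22 |
| piqa | 0 | acc | 80.25 | ± 0.93 |
| | | acc_norm | 82.05 | ± 0.90 |
| winogrande | 0 | acc | 74.11 | ± 1.23 |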

Average: 67.15%

TruthfulQA

| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---|
| truthfulqa_mc | 1 | mc1 | 27.54 | ± 1.56 |
| | | mc2 | 42.06 | ± 1.44 |

Average: 42.06%

Bigbench

| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---|
| bigbench_causal_judgement | 0 | multiple_choice_grade | 56.32 | ± 3.61 |
| bigbench_date_understanding | 0 | multiple_choice_grade | 66.40 | ± 2.46 |
| bigbench_disambiguation_qa | 0 | multiple_choice_grade | 45.74 | ± 3.11 |
| bigbench_geometric_shapes | 0 | multiple_choice_grade | 10.58 | ± 1.63 |
| | | exact_str_match | 0.00 | ± 0.00 |
| bigbench_logical_deduction_five_objects | 0 | multiple_choice_grade | 25.00 | ± 1.94 |
| bigbench_logical_deduction_seven_objects | 0 | multiple_choice_grade | 17.71 | ± 1.44 |
| bigbench_logical_deduction_three_objects | 0 | multiple_choice_grade | 37.33 | ± 2.80 |
| bigbench_movie_recommendation | 0 | multiple_choice_grade | 29.40 | ± 2.04 |
| bigbench_navigate | 0 | multiple_choice_grade | 50.00 | ± 1.58 |
| bigbench_reasoning_about_colored_objects | 0 | multiple_choice_grade | 42.50 | ± 1.11 |
| bigbench_ruin_names | 0 | multiple_choice_grade | 39.06 | ± 2.31 |
| bigbench_salient_translation_error_detection | 0 | multiple_choice_grade | 12.93 | ± 1.06 |
| bigbench_snarks | 0 | multiple_choice_grade | 69.06 | ± 3.45 |
| bigbench_sports_understanding | 0 | multiple_choice_grade | 49.80 | ± 1.59 |
| bigbench_temporal_sequences | 0 | multiple_choice_grade | 26.50 | ± 1.40 |
| bigbench_tracking_shuffled_objects_five_objects | 0 | multiple_choice_grade | 21.20 | ± 1.16 |
| bigbench_tracking_shuffled_objects_seven_objects | 0 | multiple_choice_grade | 16.06 | ± 0.88 |
| bigbench_tracking_shuffled_objects_three_objects | 0 | multiple_choice_grade | 37.33 | ± 2.80 |

Average: 36.27%

Average score: 45.26%
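
The overall score is the unweighted mean of the four benchmark averages reported above, as this short check shows.

```python
# Per-benchmark averages as reported in this evaluation.
benchmark_averages = {
    "AGIEval": 35.57,
    "GPT4All": 67.15,
    "TruthfulQA": 42.06,
    "Bigbench": 36.27,
}

# Unweighted mean across the four benchmarks.
overall_average = sum(benchmark_averages.values()) / len(benchmark_averages)

print(f"{overall_average:.2f}")  # 45.26
```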

Elapsed time: 01:49:22

  • Developed by: macadeliccc
  • License: apache-2.0
  • Finetuned from model: alpindale/Mistral-7B-v0.2

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.
