Model card for Mistral-Instruct-Ukrainian-SFT

Supervised fine-tuning of Mistral-7B-Instruct-v0.2 on Ukrainian datasets.

Instruction format

To leverage instruction fine-tuning, surround your prompt with [INST] and [/INST] tokens.

E.g.

text = "[INST]Відповідайте лише буквою правильної відповіді: Елементи експресіонізму наявні у творі: A. «Камінний хрест», B. «Інститутка», C. «Маруся», D. «Людина»[/INST]"

This format is available as a chat template via the apply_chat_template() method:
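
A minimal sketch of rendering the template to a string (the Ukrainian question is illustrative, and the exact rendered string depends on the template bundled with the tokenizer):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Radu1999/Mistral-Instruct-Ukrainian-SFT")

messages = [{"role": "user", "content": "Що таке велика мовна модель?"}]

# Render the chat template to a prompt string instead of token ids.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # e.g. "<s>[INST] Що таке велика мовна модель? [/INST]"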

Model Architecture

This instruction model is based on Mistral-7B-v0.2, a 7.24B-parameter transformer model with the following architecture choices (see the config sketch after the list):

  • Grouped-Query Attention
  • Sliding-Window Attention
  • Byte-fallback BPE tokenizer
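
These choices can be checked directly from the published configuration; a quick inspection sketch, assuming the standard transformers MistralConfig attribute names:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("Radu1999/Mistral-Instruct-Ukrainian-SFT")

# Grouped-query attention: fewer key/value heads than query heads.
print(config.num_attention_heads, config.num_key_value_heads)

# Sliding-window attention: the attention window size per layer.
print(config.sliding_window)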

Datasets

💻 Usage

!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "Radu1999/Mistral-Instruct-Ukrainian-SFT"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the chat template into a [INST] ... [/INST] prompt string.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in bfloat16 and place it automatically across available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Sample up to 256 new tokens; by default the output echoes the prompt.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
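
Since the pipeline output includes the prompt before the completion, passing return_full_text=False (a standard text-generation pipeline argument) returns only the model's reply; a hedged variant of the call above:

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95, return_full_text=False)
print(outputs[0]["generated_text"])  # completion only, without the prompt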

Author

Radu Chivereanu

Open LLM Leaderboard Evaluation Results

Detailed results can be found on the Open LLM Leaderboard.

Metric                              Value
----------------------------------  -----
Avg.                                62.17
AI2 Reasoning Challenge (25-shot)   57.85
HellaSwag (10-shot)                 83.12
MMLU (5-shot)                       60.95
TruthfulQA (0-shot)                 54.14
Winogrande (5-shot)                 77.51
GSM8k (5-shot)                      39.42