metadata
language:
  - en
pipeline_tag: text-generation
library_name: transformers
tags:
  - cerebras
  - LLM
inference: false

Instruction-tuned Cerebras GPT 111M

The smallest of the Cerebras-GPT models, with only 111M parameters, instruction fine-tuned.

Model Description

An instruction fine-tuned version of cerebras/Cerebras-GPT-111M.

Evaluation

The model has been evaluated with Hugging Face's Open LLM Leaderboard. See the leaderboard for more details: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

The instruction fine-tuned model improves on the Cerebras base model by about 5.7% in relative terms (average score 31.6 vs. 29.9):

| Model | Average | ARC (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA (0-shot) |
|---|---|---|---|---|---|
| SebastianSchramm/Cerebras-GPT-111M-instruction | 31.6 | 24.3 | 26.2 | 26.5 | 49.5 |
| cerebras/Cerebras-GPT-111M | 29.9 | 20 | 26.7 | 26.7 | 46.3 |
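
The averages and the roughly 5.7% figure can be checked directly from the per-benchmark scores. The snippet below is a small sketch of that arithmetic; it assumes the average is the unweighted mean of the four benchmarks and that the improvement is relative to the base model's average. The variable and function names are illustrative only.

```python
# Per-benchmark scores copied from the evaluation table
# (order: ARC, HellaSwag, MMLU, TruthfulQA).
scores = {
    "SebastianSchramm/Cerebras-GPT-111M-instruction": [24.3, 26.2, 26.5, 49.5],
    "cerebras/Cerebras-GPT-111M": [20.0, 26.7, 26.7, 46.3],
}


def mean(values):
    """Unweighted mean (assumed to be how the average column is computed)."""
    return sum(values) / len(values)


avg_tuned = mean(scores["SebastianSchramm/Cerebras-GPT-111M-instruction"])
avg_base = mean(scores["cerebras/Cerebras-GPT-111M"])

# Absolute gain in points and gain relative to the base model's average.
absolute_gain = avg_tuned - avg_base
relative_gain_pct = absolute_gain / avg_base * 100

print(f"fine-tuned average: {avg_tuned:.1f}")
print(f"base average:       {avg_base:.1f}")
print(f"absolute gain:      {absolute_gain:.1f} points")
print(f"relative gain:      {relative_gain_pct:.1f}%")
```

Under these assumptions the gain is about 1.7 points in absolute terms, which corresponds to about 5.7% relative to the base model.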

Training data

The model was fine-tuned with the following data: alpaca_gpt4_data (data generated by GPT-4 using Alpaca prompts for fine-tuning LLMs) and alpaca_data_cleaned.

Prompt template

Fine-tuning was performed with the prompt template from Stanford Alpaca:

PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}

Usage

For best results, format inference inputs according to the prompt template above.
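
As a rough illustration of this recommendation, the sketch below fills the Alpaca template and passes the result to the model through `transformers`. The helper names `build_prompt` and `generate_response` are hypothetical, and the generation settings (`max_new_tokens`, greedy decoding, using the EOS token for padding) are assumptions rather than values documented for this model.

```python
PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Fill the Alpaca template, picking the variant based on whether input is given."""
    key = "prompt_input" if input_text.strip() else "prompt_no_input"
    return PROMPT_DICT[key].format(instruction=instruction, input=input_text)


def generate_response(instruction: str, input_text: str = "", max_new_tokens: int = 64) -> str:
    """Hypothetical helper: format the prompt, generate, and return only the new text."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "SebastianSchramm/Cerebras-GPT-111M-instruction"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(instruction, input_text), return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,        # assumed value, tune as needed
        pad_token_id=tokenizer.eos_token_id,  # assumption: reuse EOS as the pad token
    )
    # Drop the prompt tokens so only the model's continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `generate_response("Name three primary colors.")` downloads the checkpoint on first use and returns the decoded continuation. With a model of this size, output quality will likely be limited regardless of prompt formatting.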