---
language:
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- cerebras
- LLM
inference: false
---

# Instruction-tuned Cerebras GPT 111M

An instruction fine-tuned version of the smallest of the [Cerebras-GPT models](https://huggingface.co/cerebras), with only 111M parameters.

## Model Description

This model is [Cerebras-GPT-111M](https://huggingface.co/cerebras/Cerebras-GPT-111M) fine-tuned to follow instructions.

## Evaluation

The model was evaluated on Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard); see the leaderboard for more details.

On average score, the instruction fine-tuned model improves on the Cerebras base model by about 5.7% in relative terms (31.6 vs. 29.9, a gain of 1.7 points):

| Model | Average | ARC (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA (0-shot) |
| --- | --- | --- | --- | --- | --- |
| SebastianSchramm/Cerebras-GPT-111M-instruction | 31.6 | 24.3 | 26.2 | 26.5 | 49.5 |
| cerebras/Cerebras-GPT-111M | 29.9 | 20 | 26.7 | 26.7 | 46.3 |

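The averages and the roughly 5.7% figure can be recomputed from the per-benchmark scores in the table. The snippet below is a minimal sketch of that arithmetic; it assumes the "Average" column is the unweighted mean of the four listed benchmarks and that the improvement is relative to the base model's average. The variable names are illustrative only.

```python
# Per-benchmark scores copied from the evaluation table
# (ARC, HellaSwag, MMLU, TruthfulQA).
scores = {
    "instruction": [24.3, 26.2, 26.5, 49.5],
    "base": [20.0, 26.7, 26.7, 46.3],
}


def mean(values):
    """Unweighted mean (assumed to match the table's 'Average' column)."""
    return sum(values) / len(values)


avg_instruction = mean(scores["instruction"])
avg_base = mean(scores["base"])

# Relative improvement of the fine-tuned model over the base model, in percent.
relative_gain = (avg_instruction - avg_base) / avg_base * 100

print(f"instruction average: {avg_instruction:.1f}")
print(f"base average:        {avg_base:.1f}")
print(f"relative gain:       {relative_gain:.1f}%")
```

The gain is a relative change; in absolute terms the difference between the two averages is about 1.7 points.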
## Training data

The model was fine-tuned on two datasets: [alpaca_gpt4_data](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/blob/main/data/alpaca_gpt4_data.json) (data generated by GPT-4 using Alpaca prompts for fine-tuning LLMs) and [alpaca_data_cleaned](https://github.com/tloen/alpaca-lora/blob/a3027fea37c2087b8b0131b21a4cd948bbdcd9e0/alpaca_data_cleaned.json).

## Prompt template

Fine-tuning was performed with the prompt template from [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca):

```python
PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}
```

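The template has two variants, so a caller has to pick one depending on whether the task comes with extra input. The following is a minimal sketch of that selection; the helper name `build_prompt` and the rule "use the no-input variant when the input is empty or whitespace" are assumptions made for illustration, not code taken from this model's training setup.

```python
# The two template variants, repeated here so the example is self-contained.
PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Hypothetical helper: fill in the matching template variant."""
    if input_text.strip():
        # Extra context was supplied, so use the variant with an Input section.
        return PROMPT_DICT["prompt_input"].format(
            instruction=instruction, input=input_text
        )
    # No extra context: use the shorter variant.
    return PROMPT_DICT["prompt_no_input"].format(instruction=instruction)


print(build_prompt("Translate to German.", "Good morning"))
```

Both variants end with `### Response:`, so the model's completion is expected to start right after that marker.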
## Usage

For best results at inference time, format the input according to the prompt template above.

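As a rough sketch of what this can look like with `transformers`, the example below wraps an instruction in the no-input variant of the template and generates a completion. The model ID is the one listed in the evaluation table; the function names, the `max_new_tokens` value, and the use of greedy decoding are assumptions chosen for illustration, and PyTorch is assumed to be installed.

```python
# No-input variant of the prompt template used for fine-tuning.
NO_INPUT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)

MODEL_ID = "SebastianSchramm/Cerebras-GPT-111M-instruction"


def format_prompt(instruction: str) -> str:
    """Hypothetical helper: wrap an instruction in the template."""
    return NO_INPUT_TEMPLATE.format(instruction=instruction)


def generate_response(instruction: str, max_new_tokens: int = 100) -> str:
    """Hypothetical helper: load the model and generate one completion."""
    # Imported lazily so the formatting helper works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(format_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # assumed value, tune as needed
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


if __name__ == "__main__":
    print(generate_response("Name three primary colors."))
```

Running the script downloads the model weights on first use. With only 111M parameters the model can run on CPU, but output quality is limited, as the benchmark scores suggest.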
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_SebastianSchramm__Cerebras-GPT-111M-instruction)

| Metric                | Value                     |
|-----------------------|---------------------------|
| Avg.                  | 25.37                     |
| ARC (25-shot)         | 24.4                      |
| HellaSwag (10-shot)   | 26.05                     |
| MMLU (5-shot)         | 25.87                     |
| TruthfulQA (0-shot)   | 49.46                     |
| Winogrande (5-shot)   | 51.62                     |
| GSM8K (5-shot)        | 0.0                       |
| DROP (3-shot)         | 0.17                      |