---
language:
- ko
- en
pipeline_tag: text-generation
inference: false
tags:
- solar
- mistral
- pytorch
- solar-ko
library_name: transformers
license: apache-2.0
---

<img src="https://cdn-uploads.huggingface.co/production/uploads/5e56829137cb5b49818287ea/WuiaS45EAWDurGTOtjR_d.png" style="max-width:250px;margin:0 auto;" />

**Update Log**

- 2024.07.01: Released Solar-Ko-Recovery and uploaded benchmark scores
- 2024.05.16: Released a preview of Solar-Ko-Recovery

# **Solar-Ko-Recovery-11B** 🌟❤️‍🩹

Solar-Ko-Recovery-11B aims to recover Solar's Korean capability by rearranging the embeddings and LM head. It features an expanded vocabulary and is trained on a Korean+English corpus for better representation.

## Model Details

**Model Developers:** Junbum Lee (Beomi)

**Variations:** Solar-Ko-Recovery is available in one parameter size: 11B (10.99B🤣).

**Input:** The model accepts only text input.

**Output:** The model produces text output exclusively.

**Model Architecture:** 

Solar-Ko-Recovery is an auto-regressive language model that leverages an optimized transformer architecture derived from Llama-2.

| |Training Data|Parameters|Context Length|GQA|Tokens|Learning Rate|
|---|---|---|---|---|---|---|
|Solar-Ko-Recovery|*A curated mix of Korean+English Corpora*|11B(10.99B)|4k|O|>100B*|5e-5|

> NOTE: Training proceeded in two steps:
>
> 1) Only the embedding layer and the LM head were trained.
> 2) All parameters were trained.
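
The two-step schedule in the note above can be sketched as a rule that decides which parameters receive gradients in each stage. This is only an illustration, not the actual training code: the parameter names follow the usual Llama-style layout in `transformers` (`embed_tokens`, `lm_head`), the helper `set_trainable` is hypothetical, and a `SimpleNamespace` stands in for real tensors so the snippet runs without a deep learning framework. With a real model, the same function could be fed `model.named_parameters()`.

```python
from types import SimpleNamespace

# Substrings assumed to identify the embedding layer and the LM head
# (typical Llama-style parameter names; an assumption, not from the card).
EMBEDDING_KEYS = ("embed_tokens", "lm_head")


def set_trainable(named_params, stage):
    """Set requires_grad per stage and return the names left trainable.

    stage 1: only the embedding layer and LM head are trained.
    stage 2: all parameters are trained.
    """
    trainable = []
    for name, param in named_params:
        is_embedding = any(key in name for key in EMBEDDING_KEYS)
        param.requires_grad = (stage == 2) or is_embedding
        if param.requires_grad:
            trainable.append(name)
    return trainable


def toy_params():
    """Stand-in for model.named_parameters() with a handful of names."""
    names = [
        "model.embed_tokens.weight",
        "model.layers.0.self_attn.q_proj.weight",
        "model.layers.0.mlp.up_proj.weight",
        "lm_head.weight",
    ]
    return [(n, SimpleNamespace(requires_grad=True)) for n in names]


stage1 = set_trainable(toy_params(), stage=1)
stage2 = set_trainable(toy_params(), stage=2)
print("stage 1 trains:", stage1)
print("stage 2 trains:", stage2)
```

Training only the embeddings and LM head first lets the newly added vocabulary rows settle before the rest of the network is updated.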

**Vocab Expansion**

Vocab expansion is based on an edited version of [upstage/solar-1-mini-tokenizer](https://huggingface.co/upstage/solar-1-mini-tokenizer), which is a superset of the Solar tokenizer.

| Model Name | Vocabulary Size | Description | 
| --- | --- | --- |
| Original Solar | 32000 | Sentencepiece BPE |
| **solar-1-mini-tokenizer** | 64000 | Sentencepiece BPE. Adds Korean/Japanese vocabulary |
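
As a rough sanity check, the parameter cost of doubling the vocabulary can be estimated with simple arithmetic. The vocabulary sizes come from the table above; the hidden size of 4096 and the untied input/output embeddings are assumptions about the Solar architecture that this card does not state.

```python
# Vocabulary sizes from the table above.
OLD_VOCAB = 32_000
NEW_VOCAB = 64_000

# Assumptions (not stated in this card): hidden size of 4096 and
# separate (untied) input embedding and LM head matrices.
HIDDEN_SIZE = 4096
MATRICES = 2  # embedding + LM head

added_rows = NEW_VOCAB - OLD_VOCAB
added_params = added_rows * HIDDEN_SIZE * MATRICES

print(f"added rows per matrix: {added_rows:,}")
print(f"added parameters:      {added_params:,} (~{added_params / 1e9:.2f}B)")
```

Under these assumptions the expansion adds about 0.26B parameters, which is in the same range as the gap between the 10.7B base model and the 10.99B quoted above.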

**Tokenizing "안녕하세요, 오늘은 날씨가 좋네요."**

- SOLAR-10.7B: 26 tokens
- Solar-Ko-Recovery: 7 tokens

| Model | Tokens |
| --- | --- |
| SOLAR-10.7B | `['▁', '안', '<0xEB>', '<0x85>', '<0x95>', '하', '세', '요', ',', '▁', '오', '<0xEB>', '<0x8A>', '<0x98>', '은', '▁', '날', '<0xEC>', '<0x94>', '<0xA8>', '가', '▁', '좋', '네', '요', '.']` |
| Solar-Ko-Recovery | `['▁안녕하세요', ',', '▁오늘은', '▁날씨가', '▁좋', '네요', '.']` |
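
The `<0x..>` entries in the SOLAR-10.7B row are byte-fallback tokens: a Hangul syllable missing from the vocabulary is emitted as its three UTF-8 bytes. The sketch below uses only the token lists from the table (no tokenizer is loaded) to rebuild the sentence from both tokenizations and count how many tokens are byte fallbacks. The helper `detokenize` is hypothetical and mimics, in simplified form, what a SentencePiece decoder does.

```python
import re

BYTE_TOKEN = re.compile(r"<0x([0-9A-Fa-f]{2})>")

solar_tokens = ['▁', '안', '<0xEB>', '<0x85>', '<0x95>', '하', '세', '요', ',',
                '▁', '오', '<0xEB>', '<0x8A>', '<0x98>', '은', '▁', '날',
                '<0xEC>', '<0x94>', '<0xA8>', '가', '▁', '좋', '네', '요', '.']
ko_tokens = ['▁안녕하세요', ',', '▁오늘은', '▁날씨가', '▁좋', '네요', '.']


def detokenize(tokens):
    """Rebuild text from SentencePiece-style pieces, resolving byte fallbacks."""
    raw = bytearray()
    for tok in tokens:
        match = BYTE_TOKEN.fullmatch(tok)
        if match:
            # Byte-fallback token: append the single raw byte it encodes.
            raw.append(int(match.group(1), 16))
        else:
            raw.extend(tok.encode("utf-8"))
    text = raw.decode("utf-8")
    # '▁' (U+2581) marks a word boundary; map it back to a space.
    return text.replace("\u2581", " ").lstrip(" ")


byte_fallbacks = sum(1 for t in solar_tokens if BYTE_TOKEN.fullmatch(t))

print(detokenize(solar_tokens))
print(detokenize(ko_tokens))
print(f"SOLAR-10.7B: {len(solar_tokens)} tokens, {byte_fallbacks} byte fallbacks")
print(f"Solar-Ko-Recovery: {len(ko_tokens)} tokens")
print(f"ratio: {len(solar_tokens) / len(ko_tokens):.2f}x")
```

Both lists decode to the same sentence, so the difference is purely in sequence length, which directly affects how much Korean text fits in the 4k context.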

**Tokenizing "Meet 10.7B Solar: Elevating Performance with Upstage Depth UP Scaling!"**

- SOLAR-10.7B: 22 tokens
- Solar-Ko-Recovery: 22 tokens

| Model | Tokens |
| --- | --- |
| SOLAR-10.7B | `['▁Meet', '▁', '1', '0', '.', '7', 'B', '▁Solar', ':', '▁E', 'lev', 'ating', '▁Performance', '▁with', '▁Up', 'stage', '▁Dep', 'th', '▁UP', '▁Scal', 'ing', '!']` |
| Solar-Ko-Recovery | `['▁Meet', '▁', '1', '0', '.', '7', 'B', '▁Solar', ':', '▁E', 'lev', 'ating', '▁Performance', '▁with', '▁Up', 'stage', '▁Dep', 'th', '▁UP', '▁Scal', 'ing', '!']` |

# LICENSE

Apache 2.0

# **Model Benchmark**

## LM Eval Harness - Korean

- Evaluated with EleutherAI's [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
- All scores are 5-shot

|                          Tasks                           |  Metric   |  Value  |   |  Stderr |
|----------------------------------------------------------|-----------|--------:|---|--------:|
|haerae                                                    |acc_norm   | 0.7874  |±  | 0.0118  |
| - haerae_general_knowledge                               |acc        | 0.5000  |±  | 0.0378  |
| - haerae_history                                         |acc        | 0.8723  |±  | 0.0244  |
| - haerae_loan_word                                       |acc        | 0.8402  |±  | 0.0283  |
| - haerae_rare_word                                       |acc        | 0.8346  |±  | 0.0185  |
| - haerae_standard_nomenclature                           |acc        | 0.8301  |±  | 0.0305  |
|kmmlu_direct                                              |exact_match| 0.4205  |±  | 0.0026  |
| - kmmlu_direct_accounting                                |exact_match| 0.3700  |±  | 0.0485  |
| - kmmlu_direct_agricultural_sciences                     |exact_match| 0.3140  |±  | 0.0147  |
| - kmmlu_direct_aviation_engineering_and_maintenance      |exact_match| 0.3870  |±  | 0.0154  |
| - kmmlu_direct_biology                                   |exact_match| 0.3510  |±  | 0.0151  |
| - kmmlu_direct_chemical_engineering                      |exact_match| 0.3910  |±  | 0.0154  |
| - kmmlu_direct_chemistry                                 |exact_match| 0.4000  |±  | 0.0200  |
| - kmmlu_direct_civil_engineering                         |exact_match| 0.4010  |±  | 0.0155  |
| - kmmlu_direct_computer_science                          |exact_match| 0.6520  |±  | 0.0151  |
| - kmmlu_direct_construction                              |exact_match| 0.3080  |±  | 0.0146  |
| - kmmlu_direct_criminal_law                              |exact_match| 0.3100  |±  | 0.0328  |
| - kmmlu_direct_ecology                                   |exact_match| 0.4660  |±  | 0.0158  |
| - kmmlu_direct_economics                                 |exact_match| 0.5385  |±  | 0.0439  |
| - kmmlu_direct_education                                 |exact_match| 0.6200  |±  | 0.0488  |
| - kmmlu_direct_electrical_engineering                    |exact_match| 0.3000  |±  | 0.0145  |
| - kmmlu_direct_electronics_engineering                   |exact_match| 0.4740  |±  | 0.0158  |
| - kmmlu_direct_energy_management                         |exact_match| 0.3560  |±  | 0.0151  |
| - kmmlu_direct_environmental_science                     |exact_match| 0.2980  |±  | 0.0145  |
| - kmmlu_direct_fashion                                   |exact_match| 0.4470  |±  | 0.0157  |
| - kmmlu_direct_food_processing                           |exact_match| 0.3690  |±  | 0.0153  |
| - kmmlu_direct_gas_technology_and_engineering            |exact_match| 0.3000  |±  | 0.0145  |
| - kmmlu_direct_geomatics                                 |exact_match| 0.3820  |±  | 0.0154  |
| - kmmlu_direct_health                                    |exact_match| 0.5700  |±  | 0.0498  |
| - kmmlu_direct_industrial_engineer                       |exact_match| 0.3830  |±  | 0.0154  |
| - kmmlu_direct_information_technology                    |exact_match| 0.6090  |±  | 0.0154  |
| - kmmlu_direct_interior_architecture_and_design          |exact_match| 0.5440  |±  | 0.0158  |
| - kmmlu_direct_korean_history                            |exact_match| 0.3800  |±  | 0.0488  |
| - kmmlu_direct_law                                       |exact_match| 0.4670  |±  | 0.0158  |
| - kmmlu_direct_machine_design_and_manufacturing          |exact_match| 0.3960  |±  | 0.0155  |
| - kmmlu_direct_management                                |exact_match| 0.5030  |±  | 0.0158  |
| - kmmlu_direct_maritime_engineering                      |exact_match| 0.4283  |±  | 0.0202  |
| - kmmlu_direct_marketing                                 |exact_match| 0.7460  |±  | 0.0138  |
| - kmmlu_direct_materials_engineering                     |exact_match| 0.4020  |±  | 0.0155  |
| - kmmlu_direct_math                                      |exact_match| 0.2867  |±  | 0.0262  |
| - kmmlu_direct_mechanical_engineering                    |exact_match| 0.3490  |±  | 0.0151  |
| - kmmlu_direct_nondestructive_testing                    |exact_match| 0.3760  |±  | 0.0153  |
| - kmmlu_direct_patent                                    |exact_match| 0.3700  |±  | 0.0485  |
| - kmmlu_direct_political_science_and_sociology           |exact_match| 0.5300  |±  | 0.0289  |
| - kmmlu_direct_psychology                                |exact_match| 0.4470  |±  | 0.0157  |
| - kmmlu_direct_public_safety                             |exact_match| 0.3520  |±  | 0.0151  |
| - kmmlu_direct_railway_and_automotive_engineering        |exact_match| 0.3220  |±  | 0.0148  |
| - kmmlu_direct_real_estate                               |exact_match| 0.4350  |±  | 0.0351  |
| - kmmlu_direct_refrigerating_machinery                   |exact_match| 0.3240  |±  | 0.0148  |
| - kmmlu_direct_social_welfare                            |exact_match| 0.4970  |±  | 0.0158  |
| - kmmlu_direct_taxation                                  |exact_match| 0.3800  |±  | 0.0344  |
| - kmmlu_direct_telecommunications_and_wireless_technology|exact_match| 0.5480  |±  | 0.0157  |
|kobest_boolq                                              |acc        | 0.9202  |±  | 0.0072  |
|                                                          |f1         | 0.9202  |±  |N/A      |
|kobest_copa                                               |acc        | 0.8680  |±  | 0.0107  |
|                                                          |f1         | 0.8678  |±  |N/A      |
|kobest_hellaswag                                          |acc        | 0.5560  |±  | 0.0222  |
|                                                          |f1         | 0.5520  |±  |N/A      |
|                                                          |acc_norm   | 0.6540  |±  | 0.0213  |
|kobest_sentineg                                           |acc        | 0.9824  |±  | 0.0066  |
|                                                          |f1         | 0.9824  |±  |N/A      |
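
When comparing these scores with other models, the `Stderr` column can be turned into an approximate 95% interval. The sketch below assumes a normal approximation (value ± 1.96 × stderr), which is a common convention rather than something the harness reports itself; the helper `confidence_interval` is hypothetical and the three rows are copied from the table above.

```python
def confidence_interval(value, stderr, z=1.96):
    """Approximate interval under a normal approximation (z=1.96 for ~95%)."""
    half_width = z * stderr
    return value - half_width, value + half_width


# (value, stderr) pairs copied from the benchmark table above.
scores = {
    "kmmlu_direct": (0.4205, 0.0026),
    "kobest_boolq": (0.9202, 0.0072),
    "kobest_copa": (0.8680, 0.0107),
}

intervals = {task: confidence_interval(v, se) for task, (v, se) in scores.items()}

for task, (low, high) in intervals.items():
    print(f"{task}: {low:.4f} to {high:.4f}")
```

Subtasks with few questions (for example those with a stderr near 0.05) have wide intervals, so small differences on them should be read with caution.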


## Citation

TBD

## Acknowledgements

- Training support was provided by the [TPU Research Cloud](https://sites.research.google/trc/) program.