---
license: apache-2.0
---

# **GeM2-Llamion-14B**

We have released **Llamion** as **GeM 2.0**, the second series of generative models developed by VAIV Company to address our principal business needs.

**Llamion** (Llamafied Orion) is derived from the [Orion model](https://huggingface.co/OrionStarAI/Orion-14B-LongChat) by transforming it into [the standard LLaMA architecture](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py) through parameter mapping and offline knowledge transfer. Further technical specifications and study results will be detailed in our upcoming paper, which will be made available on this page.

![vaiv_png](./vaiv.png)

Notably, the LongChat model supports a long context of up to 200K tokens. The following figures show the perplexity of models on the [English Wikipedia corpus](https://huggingface.co/datasets/wikimedia/wikipedia/viewer/20231101.en) and the [Korean Wikipedia corpus](https://huggingface.co/datasets/wikimedia/wikipedia/viewer/20231101.ko), respectively.

![ppl_enwiki](./ppl_enwiki.png)
![ppl_kowiki](./ppl_kowiki.png)

### Contributors

- VAIV Company AI Lab ([vaiv.kr](https://www.vaiv.kr/))
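
### Usage

Because Llamion follows the standard LLaMA architecture, it should load with the stock `transformers` LLaMA classes via `AutoModelForCausalLM`, with no custom modeling code. The sketch below assumes a repository id of `vaiv/GeM2-Llamion-14B-LongChat`; the actual checkpoint name may differ, so adjust it to the published repository.

```python
# Minimal loading sketch. The repository id below is an assumption and may
# differ from the actual published checkpoint name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vaiv/GeM2-Llamion-14B-LongChat"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 14B parameters; requires a GPU with sufficient memory
    device_map="auto",
)

prompt = "Summarize the key idea of the LLaMA architecture in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```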