GeM2-Llamion-14B
We have released Llamion as GeM 2.0, the second series of generative models developed by VAIV Company to address our principal business needs.
Llamion (Llamafied Orion) is derived by transforming the Orion model into the standard LLaMA architecture through parameter mapping and offline knowledge transfer. Further technical specifications and study results will be detailed in our upcoming paper, which will be made available on this page.
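Because Llamion uses the standard LLaMA architecture, it should in principle load through the generic causal-LM classes in Hugging Face `transformers`, without custom modeling code. The sketch below illustrates this under assumptions: the repository ID `vaiv/GeM2-Llamion-14B-Base` is a placeholder that may not match the actual published name, and the helper names `load_llamion` and `generate` are hypothetical, not part of any released API.

```python
# Assumed repository ID; replace it with the actual model repository name.
MODEL_ID = "vaiv/GeM2-Llamion-14B-Base"


def load_llamion(model_id: str = MODEL_ID):
    """Load tokenizer and model with the generic transformers Auto classes."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" keeps the dtype stored in the checkpoint config.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation for a single prompt (hypothetical helper)."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_llamion()` followed by `generate(tokenizer, model, "Hello")`. A 14B-parameter checkpoint needs substantial memory, so loading on a suitable GPU is advisable.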
Notably, the LongChat model supports a context length of up to 200K tokens. The following figure shows the perplexity of the models on the English Wikipedia corpus and the Korean Wikipedia corpus, respectively.
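For reference, perplexity is conventionally defined as the exponential of the average negative log-likelihood per token. The evaluation procedure used for the figure is not described here, so the following is only a minimal sketch of that standard definition; the function name `perplexity` and its input format (a list of per-token natural-log probabilities) are assumptions made for illustration.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the mean negative log-likelihood over the given tokens.

    token_log_probs: natural-log probabilities the model assigned to each
    observed token (assumed input format, for illustration only).
    """
    if not token_log_probs:
        raise ValueError("at least one token log-probability is required")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# A model that assigns probability 0.25 to every token behaves like a
# uniform choice among four options, so its perplexity is 4.
uniform_ppl = perplexity([math.log(0.25)] * 10)
```

Lower perplexity means the model assigns higher probability to the reference text, which is why it is a common way to compare language models on a fixed corpus.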
Contributors
- VAIV Company AI Lab (vaiv.kr)