license: apache-2.0

GeM2-Llamion-14B

We have released Llamion as GeM 2.0, the second series of generative models developed by VAIV Company to address our principal business needs.

Llamion (Llamafied Orion) is derived by transforming the Orion model into the standard LLaMA architecture through parameter mapping and offline knowledge transfer. Further technical specifications and study results will be detailed in our upcoming paper, which will be made available on this page.


Notably, the LongChat model supports a context length of up to 200K tokens. The following figures show the perplexity of the models on the English Wikipedia corpus and the Korean Wikipedia corpus, respectively.

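[Figure: perplexity on the English Wikipedia corpus (ppl_enwiki)]

[Figure: perplexity on the Korean Wikipedia corpus (ppl_kowiki)]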
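
For readers unfamiliar with the metric in the figures above, perplexity is commonly defined as the exponential of the average negative log-likelihood per token. The sketch below illustrates that standard definition only; it is not the evaluation script used to produce the figures, and the function name `perplexity` and the sample log-probabilities are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Return perplexity given per-token natural-log probabilities.

    Standard definition: exp of the mean negative log-likelihood.
    This is an illustrative helper, not part of any released code.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Hypothetical values for illustration: a model that assigns
# probability 0.25 to every token in a four-token sequence.
sample_log_probs = [math.log(0.25)] * 4
print(perplexity(sample_log_probs))
```

Lower perplexity means the model assigns higher probability to the reference text. In practice the per-token log-probabilities would come from the model's output over a held-out corpus, which this sketch leaves out.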

Contributors