GeM2-Llamion-14B

We have released Llamion as GeM 2.0, the second series of generative models developed by VAIV Company to address our principal business needs.

Llamion (Llamafied Orion) is derived by transforming the Orion model into the standard LLaMA architecture through parameter mapping and offline knowledge transfer. Further technical specifications and study results will be detailed in our upcoming paper, which will be made available on this page.
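Because the model uses the standard LLaMA architecture, it should load through the generic Hugging Face `transformers` auto classes without custom modeling code. The following is a minimal sketch under that assumption, not an official usage guide: the repository ID `vaiv/GeM2-Llamion-14B` is inferred from this page, the helper name `load_llamion` is hypothetical, and `device_map="auto"` assumes the `accelerate` package is installed.

```python
# Assumed repository ID, inferred from the model card title and organization name.
MODEL_ID = "vaiv/GeM2-Llamion-14B"


def load_llamion(model_id: str = MODEL_ID):
    """Load tokenizer and model with the generic auto classes (hypothetical helper)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # assumption: accelerate is available for device placement
    )
    return tokenizer, model
```

Calling `load_llamion()` downloads the full 14B checkpoint, so it is best run on a machine with enough disk space and accelerator memory.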

Notably, the LongChat model supports a context length of up to 200K tokens. The following figure shows the perplexity of the models on the English Wikipedia corpus and the Korean Wikipedia corpus.
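For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood per token, so lower values mean the model predicts the text better. The sketch below shows only that arithmetic. It assumes the per-token natural-log probabilities have already been obtained from a model, and the function name and sample values are illustrative, not taken from our evaluation.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Illustrative input: a model that assigns probability 0.25 to each of four tokens.
uniform_log_probs = [math.log(0.25)] * 4
uniform_ppl = perplexity(uniform_log_probs)  # approximately 4
```

A model that is equally unsure among four choices at every step has a perplexity of about 4, which is why the metric is often read as an effective branching factor.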

Contributors

VAIV Company AI Lab (vaiv.kr)
