T‑LLaMA: a Tibetan large language model based on LLaMA2

In this study, we built a corpus containing 2.2 billion Tibetan characters and trained T-LLaMA, a Tibetan large language model, based on LLaMA2-7B. T-LLaMA achieves state-of-the-art performance on the text classification task of the open-source TNCC dataset, with an accuracy of 79.8%, and also shows promising results on text generation and text summarization tasks.
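
Below is a minimal usage sketch, assuming the checkpoint published under `Pagewood/T-LLaMA` retains the LLaMA2 architecture and loads with the standard Hugging Face `transformers` causal-LM API; the prompt text and generation settings are illustrative placeholders, not part of the original card.

```python
# Minimal sketch: load T-LLaMA with the generic transformers causal-LM interface
# and generate a short continuation for a Tibetan prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Pagewood/T-LLaMA"  # repository ID shown on this page

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the 7B model within a single 16 GB GPU
    device_map="auto",
)

prompt = "བོད་ཀྱི་"  # placeholder Tibetan prompt; replace with your own text
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```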
