---
license: llama2
language:
- si
base_model: meta-llama/Llama-2-7b-hf
library_name: transformers
---
# Llama2 7B for Sinhala: No vocabulary adaptation

This model is built on top of Llama2 7B, adapted for Sinhala through continued pre-training on 30K target-language sentences sampled from CC-100.
## Model Details

* **Vocabulary**: This model has no additional target vocabulary. It retains the original vocabulary of Llama2 7B.
## Model Description

- **Language:** Sinhala
- **License:** Llama 2 Community License Agreement
- **Fine-tuned from model:** meta-llama/Llama-2-7b-hf
## Model Sources

- **Repository:** https://github.com/gucci-j/lowres-cve
- **Paper:** https://arxiv.org/abs/2406.11477
## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "atsuki-yamaguchi/Llama-2-7b-hf-si-30K-lapt"
)
tokenizer = AutoTokenizer.from_pretrained(
    "atsuki-yamaguchi/Llama-2-7b-hf-si-30K-lapt"
)
```
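Once loaded, the model can be used for standard causal generation. The sketch below is illustrative, not part of the official usage: the Sinhala prompt is an arbitrary example, and the `torch.float16` / `device_map="auto"` settings are optional assumptions for fitting a 7B model on a single GPU.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "atsuki-yamaguchi/Llama-2-7b-hf-si-30K-lapt"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision; assumption to reduce memory use
    device_map="auto",          # assumption: place layers on available devices
)

# Example Sinhala prompt ("Sri Lanka"); any text works here
prompt = "ශ්‍රී ලංකාව"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding of up to 50 new tokens
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```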