This is a Kinyarwanda GPT-2 model based on Andrej Karpathy's nanoGPT. It was trained on a mixture of news data and diverse computer-generated Kinyarwanda datasets.
Model configuration
- number of layers (n_layer) = 6
- number of attention heads (n_head) = 6
- embedding dimension (n_embd) = 384
- block size, i.e. context length (block_size) = 256
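The configuration above pins down the model's approximate size. As a rough sketch (not the repo's own code), each transformer block contributes about 12·n_embd² weights (4·d² for attention, 8·d² for the MLP), plus the token and position embeddings; the vocabulary size below is an assumption (nanoGPT's default padded GPT-2 BPE vocab), since the card does not state it:

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # values from the model card above
    n_layer: int = 6
    n_head: int = 6
    n_embd: int = 384
    block_size: int = 256
    vocab_size: int = 50304  # assumption: nanoGPT's default, not stated in the card

def approx_params(cfg: GPTConfig) -> int:
    """Rough parameter count: ~12*d^2 per block plus the embedding tables."""
    d = cfg.n_embd
    per_block = 12 * d * d                               # attention (4d^2) + MLP (8d^2)
    embeddings = cfg.vocab_size * d + cfg.block_size * d # token + position embeddings
    return cfg.n_layer * per_block + embeddings

print(approx_params(GPTConfig()))  # ~30M with the assumed GPT-2 vocab
```

With a smaller, language-specific vocabulary the embedding table shrinks and the total drops accordingly.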
Model dependencies
pip install torch numpy transformers datasets tiktoken wandb tqdm
Usage
To generate text with this model, download all the files in this repo into a single directory, then run:
python sample.py --out_dir=.
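Under the hood, each generation step draws the next token with temperature scaling and top-k filtering. A minimal pure-Python sketch of that sampling step (illustrative names and defaults, not the script's exact code):

```python
import math
import random

def sample_next_token(logits, temperature=0.8, top_k=200, rng=None):
    """Temperature-scaled, top-k-filtered sampling over a list of logits.

    Illustrative sketch of the per-step sampling nanoGPT-style generation
    performs; names and defaults here are assumptions, not the repo's code.
    """
    rng = rng or random.Random()
    scaled = [l / temperature for l in logits]
    # keep only the top_k highest logits, mask the rest to -inf
    cutoff = sorted(scaled, reverse=True)[:top_k][-1]
    masked = [s if s >= cutoff else float("-inf") for s in scaled]
    # softmax over the surviving logits
    m = max(masked)
    exps = [math.exp(s - m) for s in masked]
    total = sum(exps)
    probs = [e / total for e in exps]
    # draw one token index from the resulting distribution
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

# with top_k=1 only the highest logit survives, so sampling is greedy
print(sample_next_token([1.0, 5.0, 2.0], temperature=0.1, top_k=1))  # -> 1
```

Lower temperatures and smaller top_k make output more deterministic; higher values make it more diverse.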