keshan committed on
Commit
ced582d
1 Parent(s): 93701c7

Create README.md

  1. README.md +31 -0
README.md ADDED
@@ -0,0 +1,31 @@
+ ---
+ language: si
+ tags:
+ - Sinhala
+ - text-generation
+ - gpt2
+ datasets:
+ - mc4
+ ---
+ ## Overview
+
+ This is a small GPT2 model trained on the Sinhala portion of the [MC4](https://github.com/allenai/allennlp/discussions/5056) dataset. Sinhala is a low-resource language, so only a handful of models have been trained for it. This model is intended as a starting point for fine-tuning on downstream tasks.
+
+ ## Model Specification
+
+
+ The model chosen for training is GPT2 with the following specifications:
+ 1. vocab_size=50257
+ 2. n_embd=768
+ 3. n_head=12
+ 4. n_layer=12
+ 5. n_positions=1024
+
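To give a feel for what the specification above implies, here is a rough sketch that counts the parameters of a decoder with those dimensions. It assumes the standard GPT-2 layout (feed-forward width of 4 × `n_embd`, output layer tied to the token embeddings), which the README does not state explicitly. The helper name `gpt2_param_count` is made up for this illustration.

```python
def gpt2_param_count(vocab_size: int, n_embd: int, n_layer: int, n_positions: int) -> int:
    """Count parameters of a standard GPT-2 decoder with tied input/output embeddings."""
    # Token embedding table plus learned position embedding table.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Each layer norm has a scale and a bias vector.
    layer_norm = 2 * n_embd
    # Fused query/key/value projection plus the attention output projection (weights + biases).
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Assumption: default GPT-2 feed-forward width of 4 * n_embd.
    inner = 4 * n_embd
    mlp = (n_embd * inner + inner) + (inner * n_embd + n_embd)
    # Two layer norms per block, plus one final layer norm after the last block.
    per_layer = 2 * layer_norm + attention + mlp
    return embeddings + n_layer * per_layer + layer_norm


total = gpt2_param_count(vocab_size=50257, n_embd=768, n_layer=12, n_positions=1024)
print(f"{total:,} parameters")  # 124,439,808 parameters
```

Note that `n_head` does not appear in the count: it only controls how the 768-dimensional attention projection is split across heads. The same five values can be passed as keyword arguments to `GPT2Config` in `transformers` to build an untrained model of this shape.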
+ ## How to Use
+ You can use this model directly with a pipeline for text generation:
+
+ ```py
+ from transformers import pipeline
+ generator = pipeline('text-generation', model='keshan/sinhala-gpt2')
+ generator("මම", max_length=50, num_return_sequences=5)
+ ```
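The pipeline call above returns a list of dictionaries, one per generated sequence, each with a `generated_text` key. The following is a minimal sketch of wrapping that call so it returns plain strings. The helper names `extract_texts` and `generate_samples`, the fixed seed, and the explicit `do_sample=True` are assumptions added for illustration and are not part of the original README.

```python
from typing import Dict, List


def extract_texts(outputs: List[Dict[str, str]]) -> List[str]:
    """Pull the generated strings out of a text-generation pipeline result."""
    return [item["generated_text"].strip() for item in outputs]


def generate_samples(prompt: str, n: int = 5, max_length: int = 50) -> List[str]:
    """Generate n sampled continuations of prompt (downloads the model on first use)."""
    # Imported lazily so extract_texts can be used without transformers installed.
    from transformers import pipeline, set_seed

    set_seed(42)  # assumption: fixed seed so repeated runs give the same samples
    generator = pipeline("text-generation", model="keshan/sinhala-gpt2")
    outputs = generator(
        prompt,
        max_length=max_length,
        num_return_sequences=n,
        do_sample=True,  # sampling is requested explicitly so the n sequences can differ
    )
    return extract_texts(outputs)
```

Calling `generate_samples("මම")` would then yield up to five Sinhala continuations of the prompt as a list of strings.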