tirthadagr8 committed on
Commit fb4b5f2 · verified · 1 Parent(s): 68133bb

Update README.md

  1. README.md +13 -3
README.md CHANGED
@@ -1,3 +1,13 @@
- ---
- license: gemma
- ---
+ ---
+ license: gemma
+ ---
+ Made from scratch using GPT-Small for learning purposes.
+ The tokenizer is from Gemma 2-2B-JPN-IT, which is trained on a Japanese dataset from JESC.
+ ```bibtex
+ @ARTICLE{pryzant_jesc_2018,
+ author = {{Pryzant}, R. and {Chung}, Y. and {Jurafsky}, D. and {Britz}, D.},
+ title = "{JESC: Japanese-English Subtitle Corpus}",
+ journal = {Language Resources and Evaluation Conference (LREC)},
+ keywords = {Computer Science - Computation and Language},
+ year = 2018
+ }