georeactor committed on
Commit 94c694a
1 parent: 35f9592

recommend larger models

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -6,7 +6,9 @@ language: hi
 
 This is a first attempt at a Hindi language model trained with Google Research's [ELECTRA](https://github.com/google-research/electra).
 
-**Consider using this newer, larger model: https://huggingface.co/monsoon-nlp/hindi-tpu-electra**
+**As of 2022 I recommend Google's MuRIL model trained on English, Hindi, and other major Indian languages, both in their script and latinized script**: https://huggingface.co/google/muril-base-cased and https://huggingface.co/google/muril-large-cased
+
+**For causal language models, I would suggest SberBank / mGPT, though this is a large model**
 
 <a href="https://colab.research.google.com/drive/1R8TciRSM7BONJRBc9CBZbzOmz39FTLl_">Tokenization and training CoLab</a>
 