mlengineer-ai/kenlm-sp-jomleh

mehran committed on May 12, 2023

Commit dfd4faa • 1 Parent(s): 4cba973

Update README.md

Files changed (1)

README.md +9 -3

@@ -64,15 +64,15 @@ from model import KenlmModel
 # Load the model
-model = KenlmModel.from_pretrained("32000", "5", "01111")
+model = KenlmModel.from_pretrained("57218", "3", "011")
 # Get perplexity
 print(model.perplexity("من در را بستم"))
-# Outputs: 19.0
+# Outputs: 72.5
 # Get score
 print(model.score("من در را بستم"))
-# Outputs: -8.94505500793457
+# Outputs: -11.160577774047852
 ```
 # What are the different files you can find in this repository?
@@ -124,3 +124,9 @@ using the `build_binary` program, as shown below:
 ```
 build_binary -T /tmp -S 80% probing jomleh-sp-32000-o5-prune01111.arpa jomleh-sp-32000-o5-prune01111.probing
 ```
+# Which model to use?
+Based on my personal evaluation, I recommend using `jomleh-sp-57218-o3-prune011.probing`.
+It offers the best balance between file size (6GB) and accuracy (80%). But if file size is not
+a concern, go for the largest model, `jomleh-sp-57218-o5-prune00011.probing` (size: 36GB, accuracy: 82%).
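
The recommendation added in this commit can be sketched as a small selection helper. This is only an illustration, not code from the repository: `ModelOption`, `OPTIONS`, and `pick_model` are hypothetical names, the file-naming pattern is inferred from the two file names quoted above, and the sizes and accuracies are the figures stated in the README change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    """Hypothetical record for one pre-built model (not part of the repo)."""
    vocab_size: str   # SentencePiece vocabulary size, e.g. "57218"
    ngram: str        # n-gram order, e.g. "3"
    prune: str        # pruning pattern, e.g. "011"
    size_gb: int      # file size as quoted in the README
    accuracy: float   # accuracy as quoted in the README

    @property
    def filename(self) -> str:
        # Assumption: naming pattern inferred from the file names in the README.
        return f"jomleh-sp-{self.vocab_size}-o{self.ngram}-prune{self.prune}.probing"


# Only the two models for which the README gives size and accuracy.
OPTIONS = [
    ModelOption("57218", "3", "011", size_gb=6, accuracy=0.80),
    ModelOption("57218", "5", "00011", size_gb=36, accuracy=0.82),
]


def pick_model(max_size_gb: float) -> Optional[ModelOption]:
    """Return the most accurate listed model that fits the size budget, or None."""
    fitting = [m for m in OPTIONS if m.size_gb <= max_size_gb]
    return max(fitting, key=lambda m: m.accuracy, default=None)


if __name__ == "__main__":
    choice = pick_model(max_size_gb=10)
    print(choice.filename if choice else "no listed model fits")
```

If the naming assumption holds, the fields of the chosen option line up with the arguments in the README example, for instance `KenlmModel.from_pretrained(choice.vocab_size, choice.ngram, choice.prune)`.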