sagorsarker committed • Commit 6dc6003 • 1 Parent(s): 1c01dc3
Update README.md
README.md CHANGED
@@ -82,6 +82,21 @@ Here is the [evaluation script](https://github.com/sagorbrur/bangla-bert/blob/ma
 
 
 ## How to Use
+
+**Bangla BERT Tokenizer**
+
+```py
+from transformers import AutoTokenizer, AutoModel
+
+bnbert_tokenizer = AutoTokenizer.from_pretrained("sagorsarker/bangla-bert-base")
+text = "আমি বাংলায় গান গাই।"
+bnbert_tokenizer.tokenize(text)
+# ['আমি', 'বাংলা', '##য', 'গান', 'গাই', '।']
+```
+
+
+**MASK Generation**
+
 You can use this model directly with a pipeline for masked language modeling:
 
 ```py
@@ -97,7 +112,6 @@ for pred in nlp(f"আমি বাংলায় {nlp.tokenizer.mask_token} গা
 
 ```
 
-
 ## Author
 [Sagor Sarker](https://github.com/sagorbrur)
 
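
Note on the tokenizer output added in this commit: the `##` prefix in `'##য'` marks a WordPiece continuation, meaning the piece attaches to the token before it. A minimal sketch of how such pieces could be merged back into whole words is given below. The helper name `merge_wordpieces` is hypothetical and not part of `transformers`, and the token list is copied from the comment in the diff rather than produced by running the model. The merged word may not match the input text character for character, since the tokenizer appears to normalize some characters (compare `বাংলায়` in the input with the `##য` piece).

```python
def merge_wordpieces(tokens, prefix="##"):
    """Merge WordPiece continuation tokens back into whole words.

    Hypothetical helper for illustration; not a transformers API.
    """
    words = []
    for token in tokens:
        if token.startswith(prefix) and words:
            # Continuation piece: glue it onto the previous word.
            words[-1] += token[len(prefix):]
        else:
            # Start of a new word (or a stray continuation with nothing before it).
            words.append(token)
    return words


# Token list copied from the README comment in the diff above.
tokens = ['আমি', 'বাংলা', '##য', 'গান', 'গাই', '।']
print(merge_wordpieces(tokens))
```

This works on plain Python lists, so it needs no model download. For real use, the tokenizer's own decoding methods are likely the better choice.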