---
library_name: transformers
language:
- my
- en
---
# Burmese-Bert
Burmese-Bert is a bilingual masked language model based on "bert-large-uncased".
The architecture is Bidirectional Encoder Representations from Transformers (BERT).
It supports English and Burmese.
## Model Details
Coming Soon
### Model Description
- **Developed by:** Min Si Thu
- **Model type:** Bidirectional Encoder Representations from Transformers (BERT)
- **Language(s) (NLP):** Burmese (my), English (en)
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** bert-large-uncased
### Model Sources [optional]
- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## Uses
- Mask filling: predicting masked tokens in English or Burmese text
- Burmese natural language understanding (see the feature-extraction sketch after the usage examples below)
### How to use
```shell
# install the dependencies
pip install transformers torch
```
```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_checkpoint = "jojo-ai-mst/BurmeseBert"
model = AutoModelForMaskedLM.from_pretrained(model_checkpoint)
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

text = "This is a great [MASK]."
inputs = tokenizer(text, return_tensors="pt")
token_logits = model(**inputs).logits

# Find the position of [MASK] and extract its logits
mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
mask_token_logits = token_logits[0, mask_token_index, :]

# Pick the [MASK] candidates with the highest logits
top_5_tokens = torch.topk(mask_token_logits, 5, dim=1).indices[0].tolist()
for token in top_5_tokens:
    print(f">>> {text.replace(tokenizer.mask_token, tokenizer.decode([token]))}")
```
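As a lighter-weight alternative, the same checkpoint should also work with the `fill-mask` pipeline. The Burmese prompt below (roughly, "The capital of Myanmar is [MASK].") is only an illustrative example chosen for this sketch, not one taken from the model's training or evaluation data:

```python
from transformers import pipeline

# Load the model through the fill-mask pipeline
fill_mask = pipeline("fill-mask", model="jojo-ai-mst/BurmeseBert")

# Illustrative Burmese prompt: "The capital of Myanmar is [MASK]."
for prediction in fill_mask("မြန်မာနိုင်ငံ၏ မြို့တော်သည် [MASK] ဖြစ်သည်။"):
    print(f"{prediction['sequence']} (score: {prediction['score']:.3f})")
```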
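For the Burmese natural language understanding use case listed above, one common approach is to use the encoder's hidden states as sentence features. This is a minimal sketch of that idea, not the author's documented recipe; the example sentences are placeholders, and mean-pooling the last layer is just one reasonable pooling choice:

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_checkpoint = "jojo-ai-mst/BurmeseBert"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
encoder = AutoModel.from_pretrained(model_checkpoint)  # encoder only, without the MLM head

sentences = ["This is a sentence.", "မင်္ဂလာပါ"]  # illustrative English and Burmese inputs
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # (batch, seq_len, hidden_size)

# Mean-pool over real tokens, ignoring padding positions
mask = inputs["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # e.g. torch.Size([2, 1024]) for a bert-large backbone
```

The resulting sentence embeddings can then be fed to a downstream classifier or compared with cosine similarity, depending on the NLU task.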
## Citation [optional]
Coming Soon