Bangla FastText Model

This is a FastText pre-trained model for the Bengali language.

This model is built for the bnlp package.

Datasets

Training Details

  • FastText trained on 20M total words, with vocabulary size = 1,171,011, epochs = 50, embedding dimension = 300

Evaluation Details

  • training loss = 0.318668

Usage

  • pip install -U bnlp_toolkit
  • pip install fasttext==0.9.2
  • Generate Vector Using Pretrained Model
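from bnlp.embedding.fasttext import BengaliFasttext

bft = BengaliFasttext()
word = "গ্রাম"  # Bengali for "village"
model_path = "bengali_fasttext_wiki.bin"  # path to the downloaded pre-trained model
# Look up the embedding for the word in the pre-trained model
word_vector = bft.generate_word_vector(model_path, word)
print(word_vector.shape)  # the model was trained with embedding dimension = 300
print(word_vector)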
  • Train Bengali FastText Model
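from bnlp.embedding.fasttext import BengaliFasttext

bft = BengaliFasttext()
data = "raw_text.txt"  # path to the raw Bengali text used for training
model_name = "saved_model.bin"  # where the trained binary model is written
epoch = 50  # number of training epochs
bft.train(data, model_name, epoch)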
  • Generate Vector File from Fasttext Binary Model
from bnlp.embedding.fasttext import BengaliFasttext

bft = BengaliFasttext()

model_path = "mymodel.bin"  # existing FastText binary model
out_vector_name = "myvector.txt"  # text file that will receive the word vectors
bft.bin2vec(model_path, out_vector_name)
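
Once a vector file has been written, it can be read back and compared word by word without loading the binary model. The following is a minimal sketch, assuming the file uses the common word2vec-style text layout: a header line with the vocabulary size and dimension, then one line per word followed by its vector components. That layout is an assumption and is not stated on this card, so check the first lines of your own file. The helper names load_vectors and cosine_similarity are hypothetical and are not part of bnlp.

```python
import math


def load_vectors(path, limit=None):
    """Read word vectors from a word2vec-style text file (assumed layout).

    Assumed layout: first line "<vocab_size> <dimension>", then one line per
    word: the word followed by <dimension> space-separated floats.
    """
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        header = handle.readline().split()
        dimension = int(header[1])
        for line in handle:
            if limit is not None and len(vectors) >= limit:
                break
            # Split from the right so the last <dimension> fields are the
            # vector components and everything before them is the word.
            parts = line.rstrip().rsplit(" ", dimension)
            # Skip malformed lines rather than guessing at their meaning.
            if len(parts) != dimension + 1:
                continue
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return dimension, vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; report 0.0 instead of dividing by zero.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For example, calling load_vectors("myvector.txt", limit=50000) returns the dimension and a dictionary of the first 50,000 vectors, and cosine_similarity can then be applied to any two entries of that dictionary. The limit argument is there because a vocabulary of over a million 300-dimensional vectors held as Python lists can use a lot of memory.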