---
library_name: transformers
tags: [Fill Mask, Persian, BERT]
---

## Model Details

### Model Description

This model is fine-tuned for masked language modeling in Persian: it predicts a missing word in a Persian sentence when that word is replaced by the tokenizer's mask token. This makes it useful for a range of NLP applications, including text completion, question answering, and contextual understanding of Persian text.

- **Developed by:** Behpouyan
- **Model type:** Encoder
- **Language(s) (NLP):** Persian
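
A quick way to try the model is the `fill-mask` pipeline from `transformers`. The sketch below assumes the checkpoint id `Behpouyan/Behpouyan-Fill-Mask` shown on this card and inserts the tokenizer's own mask token, so it works regardless of whether the model expects `[MASK]` or `<mask>`.

```python
from transformers import pipeline

# Build a fill-mask pipeline for the checkpoint named on this card
fill_mask = pipeline("fill-mask", model="Behpouyan/Behpouyan-Fill-Mask")

# Use the model's own mask token so the placeholder always matches the tokenizer
sentence = f"این کتاب بسیار {fill_mask.tokenizer.mask_token} است."  # "This book is very [MASK]."

# Print the three most likely completions with their scores
for prediction in fill_mask(sentence, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))
```
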
## How to Get Started with the Model

For finer control over tokenization and decoding, the model can also be called directly:

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Behpouyan/Behpouyan-Fill-Mask")
model = AutoModelForMaskedLM.from_pretrained("Behpouyan/Behpouyan-Fill-Mask")

# Five Persian sentences, each with one word replaced by the tokenizer's mask token.
# Using tokenizer.mask_token keeps the example valid whether the model expects [MASK] or <mask>.
mask = tokenizer.mask_token
sentences = [
    f"این کتاب بسیار {mask} است.",            # This book is very [MASK].
    f"مشتری همیشه از {mask} شما راضی است.",    # The customer is always satisfied with your [MASK].
    f"من به دنبال {mask} هستم.",              # I am looking for [MASK].
    f"این پروژه نیاز به {mask} دارد.",         # This project needs [MASK].
    f"تیم ما برای انجام کارها {mask} است.",    # Our team is [MASK] to do the tasks.
]

# Predict the most likely word for the masked position in a sentence
def predict_masked_word(sentence):
    # Tokenize the input sentence
    inputs = tokenizer(sentence, return_tensors="pt")

    # Forward pass to get the logits over the vocabulary
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits

    # Find the position of the mask token (each sentence contains exactly one)
    mask_token_index = torch.where(inputs.input_ids == tokenizer.mask_token_id)[1].item()

    # Take the highest-scoring token at that position and decode it back to text
    predicted_token_id = torch.argmax(logits[0, mask_token_index]).item()
    predicted_word = tokenizer.decode([predicted_token_id])

    return predicted_word

# Test the model on the sentences
for sentence in sentences:
    predicted_word = predict_masked_word(sentence)
    print(f"Sentence: {sentence}")
    print(f"Predicted word: {predicted_word}")
    print("-" * 50)
```
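
The `predict_masked_word` function above returns only the single highest-scoring candidate. To get several suggestions per masked position, the same logits can be ranked with `torch.topk`. This is a small sketch that reuses `tokenizer`, `model`, and `mask` from the snippet above; the helper name `predict_top_k` is illustrative, not part of the model's API.

```python
import torch

def predict_top_k(sentence, k=5):
    # Same forward pass as predict_masked_word, but keep the k best candidates
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Position of the mask token (each example sentence contains exactly one)
    mask_token_index = torch.where(inputs.input_ids == tokenizer.mask_token_id)[1].item()

    # Rank the vocabulary at that position and decode the top-k token ids
    top_ids = torch.topk(logits[0, mask_token_index], k).indices.tolist()
    return [tokenizer.decode([token_id]) for token_id in top_ids]

print(predict_top_k(f"این کتاب بسیار {mask} است."))  # five candidate completions
```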