---
language: fa
license: mit
pipeline_tag: text-classification
---

# SentenceFormalityClassifier

This model classifies Persian sentences as formal or informal. It was fine-tuned on the Mohavere dataset (Takalli, Vahideh; Kalantari, Fateme; Shamsfard, Mehrnoush, *Developing an Informal-Formal Persian Corpus*, 2022), starting from the pretrained model [bert-base-parsbert-uncased](https://huggingface.co/HooshvareLab/bert-base-parsbert-uncased).

## Evaluation Metrics

| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| INFORMAL | 0.99 | 0.99 | 0.99 |
| FORMAL | 0.99 | 1.0 | 0.99 |
| Macro Avg | 0.99 | 0.99 | 0.99 |
| Weighted Avg | 0.99 | 0.99 | 0.99 |

**Accuracy**: 0.99

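For readers who want to see how figures of this kind are derived, the following is a minimal sketch of the standard definitions of per-class precision, recall, and F1, together with accuracy and the macro and weighted averages. The function name `classification_summary` and the toy labels are hypothetical and exist only for illustration; they are not the model's evaluation data, and this is not necessarily the script used to produce the numbers above.

```python
from collections import Counter


def classification_summary(y_true, y_pred, labels):
    """Per-class precision/recall/F1 plus accuracy, macro and weighted averages."""
    pairs = Counter(zip(y_true, y_pred))   # (true, predicted) -> count
    support = Counter(y_true)              # number of true examples per class
    per_class = {}
    for label in labels:
        tp = pairs[(label, label)]
        predicted = sum(n for (_, p), n in pairs.items() if p == label)
        actual = support[label]
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    total = len(y_true)
    accuracy = sum(pairs[(label, label)] for label in labels) / total
    metrics = ("precision", "recall", "f1")
    # Macro average: every class counts equally.
    macro = {m: sum(per_class[l][m] for l in labels) / len(labels) for m in metrics}
    # Weighted average: every class counts in proportion to its support.
    weighted = {m: sum(per_class[l][m] * support[l] for l in labels) / total
                for m in metrics}
    return per_class, accuracy, macro, weighted


# Hypothetical toy labels, NOT the model's actual evaluation data.
y_true = ["FORMAL", "FORMAL", "FORMAL",
          "INFORMAL", "INFORMAL", "INFORMAL", "INFORMAL", "INFORMAL"]
y_pred = ["FORMAL", "FORMAL", "INFORMAL",
          "INFORMAL", "INFORMAL", "INFORMAL", "INFORMAL", "FORMAL"]

per_class, accuracy, macro, weighted = classification_summary(
    y_true, y_pred, ["INFORMAL", "FORMAL"]
)
print(f"accuracy={accuracy:.2f}")
print(f"macro F1={macro['f1']:.2f}, weighted F1={weighted['f1']:.2f}")
```

The macro and weighted averages differ only when the classes have different sizes; in the table above they coincide at two decimal places.
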
## Usage

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Index 0 of the logits maps to INFORMAL, index 1 to FORMAL
labels = ["INFORMAL", "FORMAL"]

model = AutoModelForSequenceClassification.from_pretrained('parsi-ai-nlpclass/sentence_formality_classifier')
tokenizer = AutoTokenizer.from_pretrained('parsi-ai-nlpclass/sentence_formality_classifier')


def test_model(text):
    # Tokenize one sentence and return the label with the highest logit
    inputs = tokenizer(text, return_tensors='pt')
    with torch.no_grad():  # no gradients are needed for inference
        outputs = model(**inputs)
    predicted_label = labels[int(torch.argmax(outputs.logits))]
    return predicted_label


# Test the model
# English: "I just wanted to say how grateful I am."
text1 = "من فقط میخواستم بگویم که چقدر قدردان هستم."
print("Original:", text1)
print("Predicted Label:", test_model(text1))
# output: FORMAL

# English: "It is his/her wish that I take him/her to a restaurant."
text2 = "آرزویش است او را یک رستوران ببرم."
print("\nOriginal:", text2)
print("Predicted Label:", test_model(text2))
# output: FORMAL

# English: "Don't bother my flower."
text3 = "گل منو اذیت نکنید"
print("\nOriginal:", text3)
print("Predicted Label:", test_model(text3))
# output: INFORMAL

# English: "My aunt bought me this camera as a gift."
text4 = "من این دوربین رو خالم برام کادو خرید"
print("\nOriginal:", text4)
print("Predicted Label:", test_model(text4))
# output: INFORMAL
```
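
The usage snippet returns only the top label. If a confidence score is also wanted, one option is to apply a softmax to the two logits. The sketch below does this in plain Python so it runs without downloading the model; the helper name `label_with_confidence` and the example logits are hypothetical. With the real model, the list of logits could be obtained as `outputs.logits[0].tolist()`, or the same result computed directly with `torch.softmax(outputs.logits, dim=-1)`.

```python
import math

labels = ["INFORMAL", "FORMAL"]


def label_with_confidence(logits):
    """Map a list of raw logits to (label, probability) using a stable softmax."""
    shift = max(logits)                      # subtract the max to avoid overflow
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical logits, standing in for the model's output on one sentence.
example_logits = [-1.2, 2.3]
label, confidence = label_with_confidence(example_logits)
print(label, round(confidence, 3))
```

A score close to 0.5 suggests the sentence sits near the boundary between the two registers, which may be worth flagging rather than trusting the hard label.
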