Aspect-Based Sentiment Analysis with Turkish 🇹🇷 Data
This model performs Aspect-Based Sentiment Analysis (ABSA) for Turkish text. It predicts sentiment polarity (Positive, Neutral, Negative) towards specific aspects within a given sentence.
Model Details
Model Description
This model is fine-tuned from the dbmdz/bert-base-turkish-cased pretrained BERT model. It is trained on the Turkish-ABSA-Wsynthetic dataset, which contains Turkish restaurant reviews annotated with aspect-based sentiments. The model is capable of identifying the sentiment polarity for specific aspects (e.g., "servis," "fiyatlar") mentioned in Turkish sentences.
- Developed by: Sengil
- Language(s): Turkish 🇹🇷
- License: Apache-2.0
- Finetuned from model: dbmdz/bert-base-turkish-cased
- Number of Labels: 3 (Negative, Neutral, Positive)
Sources
- Notebook: ABSA_Turkish_BERT_Based_Small
Uses
Direct Use
This model can be used directly for analyzing aspect-specific sentiment in Turkish text, especially in domains like restaurant reviews.
Downstream Use
It can be fine-tuned for similar tasks in different domains (e.g., e-commerce, hotel reviews, or customer feedback analysis).
Out-of-Scope Use
- Not suitable for tasks unrelated to sentiment analysis or for text in languages other than Turkish.
- May not perform well on datasets with significantly different domain-specific vocabulary.
Limitations
- May struggle with rare or ambiguous aspects not covered in the training data.
- May exhibit biases present in the training dataset.
How to Get Started with the Model
!pip install -U transformers torch
Use the code below to get started with the model:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "Sengil/ABSA_Turkish_BERT_Based_Small"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example inference: a sentence and the aspect to score within it
text = "Servis çok yavaştı ama yemekler lezzetliydi."
aspect = "servis"

# Join sentence and aspect with explicit special tokens, then tokenize
formatted_text = f"[CLS] {text} [SEP] {aspect} [SEP]"
inputs = tokenizer(formatted_text, return_tensors="pt", padding="max_length", truncation=True, max_length=128)

# Gradients are not needed for prediction
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=1).item()

# Map the predicted class index to its label
labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
print(f"Sentiment for '{aspect}': {labels[predicted_class]}")
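The example above reports only the top class. If a confidence score is also useful, the logits can be passed through a softmax first. The following is a minimal sketch in plain Python; the helper names `softmax` and `describe` and the logit values are illustrative assumptions, not part of the model's API. With the real model, the input would be something like `outputs.logits[0].tolist()`.

```python
import math

# Label order taken from the id-to-label mapping in the example above.
LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits for illustration only, not real model output.
example_logits = [2.1, -0.3, -1.5]
label, prob = describe(example_logits)
print(f"{label} ({prob:.2f})")
```

A low top probability can be a hint that the aspect is ambiguous or outside the training domain, in line with the limitations listed earlier.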
Training Details
Training Data
The model was fine-tuned on the Turkish-ABSA-Wsynthetic.csv dataset, which contains semi-synthetic Turkish sentences annotated for aspect-based sentiment analysis.
Training Procedure
- Optimizer: AdamW
- Learning Rate: 2e-5
- Batch Size: 16
- Epochs: 5
- Max Sequence Length: 128
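For anyone reproducing or adapting the fine-tuning run, it can help to keep these hyperparameters in one place. The sketch below collects the values listed above in a small dataclass and estimates the number of optimizer steps. `FineTuneConfig` and `total_steps` are hypothetical names, the step count assumes no gradient accumulation, and the dataset size used is a placeholder rather than the actual size of the training set.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Values as listed in this model card.
    base_model: str = "dbmdz/bert-base-turkish-cased"
    optimizer: str = "AdamW"
    learning_rate: float = 2e-5
    batch_size: int = 16
    epochs: int = 5
    max_seq_length: int = 128
    num_labels: int = 3

    def total_steps(self, num_examples: int) -> int:
        """Optimizer steps over all epochs, assuming no gradient accumulation."""
        return math.ceil(num_examples / self.batch_size) * self.epochs


config = FineTuneConfig()
# 10_000 is a placeholder dataset size, not the size of Turkish-ABSA-Wsynthetic.
print(config.total_steps(10_000))
```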
Evaluation
The model achieved the following scores on the test set:
- Accuracy: 95.48%
- F1 Score (Weighted): 95.46%
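The card does not say how these metrics were computed. Under the usual definitions, accuracy is the fraction of correct predictions, and weighted F1 is the per-class F1 averaged with weights proportional to each class's number of true examples. A minimal plain-Python sketch follows; the function names and the toy labels are illustrative assumptions, not the evaluation code used for this model.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * count / len(y_true)
    return score


# Toy labels for illustration only (0 = Negative, 1 = Neutral, 2 = Positive).
y_true = [0, 0, 1, 2, 2, 2]
y_pred = [0, 1, 1, 2, 2, 0]
print(f"accuracy={accuracy(y_true, y_pred):.4f}")
print(f"weighted_f1={weighted_f1(y_true, y_pred):.4f}")
```

Weighted F1 tends to track accuracy closely when classes are balanced and diverges when a rare class is predicted poorly, which is why both are worth checking on a new domain.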
Citation
@misc{absa_turkish_bert_based_small,
title={Aspect-Based Sentiment Analysis for Turkish},
author={Sengil},
year={2024},
url={https://huggingface.co/Sengil/ABSA_Turkish_BERT_Based_Small}
}
Model Card Contact
For any questions or issues, please open an issue in the repository or contact the author via LinkedIn.