Aspect-Based Sentiment Analysis with Turkish 🇹🇷 Data

This model performs Aspect-Based Sentiment Analysis (ABSA) 🚀 for Turkish text. It predicts sentiment polarity (Positive, Neutral, Negative) towards specific aspects within a given sentence.

Model Details

Model Description

This model is fine-tuned from the dbmdz/bert-base-turkish-cased pretrained BERT model. It is trained on the Turkish-ABSA-Wsynthetic dataset, which contains Turkish restaurant reviews annotated with aspect-based sentiments. The model is capable of identifying the sentiment polarity for specific aspects (e.g., "servis," "fiyatlar") mentioned in Turkish sentences.

Developed by: Sengil
Language(s): Turkish 🇹🇷
License: Apache-2.0
Finetuned from model: dbmdz/bert-base-turkish-cased
Number of Labels: 3 (Negative, Neutral, Positive)

Sources

Notebook: ABSA_Turkish_BERT_Based_Small

Uses

Direct Use

This model can be used directly for analyzing aspect-specific sentiment in Turkish text, especially in domains like restaurant reviews.

Downstream Use

It can be fine-tuned for similar tasks in different domains (e.g., e-commerce, hotel reviews, or customer feedback analysis).

Out-of-Scope Use

The model is not suitable for tasks other than sentiment analysis or for languages other than Turkish.
It may not perform well on text whose domain-specific vocabulary differs significantly from the training data.

Limitations

May struggle with rare or ambiguous aspects not covered in the training data.
May exhibit biases present in the training dataset.

How to Get Started with the Model

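Install the required libraries (the leading "!" is for notebook cells; omit it in a regular shell). PyTorch is needed because the example requests PyTorch tensors:

!pip install -U transformers torch

Use the code below to get started with the model: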

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Sengil/ABSA-Turkish-bert-based-small")
model = AutoModelForSequenceClassification.from_pretrained("Sengil/ABSA-Turkish-bert-based-small")

# Example inference: the sentence and the target aspect are joined in one input string
text = "Servis çok yavaştı ama yemekler lezzetliydi."
aspect = "servis"
formatted_text = f"[CLS] {text} [SEP] {aspect} [SEP]"

inputs = tokenizer(formatted_text, return_tensors="pt", padding="max_length", truncation=True, max_length=128)

# Gradients are not needed for inference
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=1).item()

# Map prediction to label
labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
print(f"Sentiment for '{aspect}': {labels[predicted_class]}")
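
Because the prediction depends on the aspect, a single sentence can be scored once per aspect. The sketch below is not part of the original card: format_for_aspects and logits_to_prediction are hypothetical helper names. It reuses the input template and label mapping from the example above and relies only on the standard library, so the logic can be followed without loading the model. In practice the logits would come from outputs.logits[0].tolist(); the numbers used here are illustrative only.

```python
import math

# Label mapping copied from the example above.
LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def format_for_aspects(text, aspects):
    """Build one model input per aspect, using the card's input template."""
    return [f"[CLS] {text} [SEP] {aspect} [SEP]" for aspect in aspects]


def logits_to_prediction(logits):
    """Turn a list of raw logits into (label, softmax probabilities)."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = [value / total for value in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


text = "Servis çok yavaştı ama yemekler lezzetliydi."
for formatted in format_for_aspects(text, ["servis", "yemekler"]):
    print(formatted)

# Illustrative logits only, not real model output.
label, probs = logits_to_prediction([2.0, 0.5, -1.0])
print(label, [round(p, 3) for p in probs])
```

Reporting the probabilities alongside the label makes it easier to spot low-confidence predictions, for example on aspects that are rare in the training data.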

Training Details

Training Data

The model was fine-tuned on the Turkish-ABSA-Wsynthetic.csv dataset. The dataset contains semi-synthetic Turkish sentences annotated for aspect-based sentiment analysis.

Training Procedure

Optimizer: AdamW
Learning Rate: 2e-5
Batch Size: 16
Epochs: 5
Max Sequence Length: 128
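
The hyperparameters listed above can be collected in a small configuration object. This is a minimal sketch, not the author's training script: FineTuneConfig and total_steps are hypothetical names, and the step count assumes no gradient accumulation and that a final partial batch is kept. The dataset size of 1,000 is made up purely to illustrate the arithmetic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Values taken from the Training Procedure list above.
    optimizer: str = "AdamW"
    learning_rate: float = 2e-5
    batch_size: int = 16
    epochs: int = 5
    max_seq_length: int = 128

    def total_steps(self, num_train_examples: int) -> int:
        """Optimizer steps over all epochs, counting a final partial batch."""
        steps_per_epoch = math.ceil(num_train_examples / self.batch_size)
        return steps_per_epoch * self.epochs


config = FineTuneConfig()
# 1,000 is a made-up dataset size used only to illustrate the arithmetic.
print(config.total_steps(1000))
```

The total step count is mainly useful when a learning-rate schedule with warmup is added, since such schedules are usually defined in terms of optimizer steps.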

Evaluation

The model achieved the following scores on the test set:

Accuracy: 95.48%
F1 Score (Weighted): 95.46%
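
For readers unfamiliar with the two metrics, the sketch below shows how accuracy and support-weighted F1 are commonly computed from gold and predicted label ids. It is an illustration in plain Python, not the evaluation code used for this model, and the toy labels are invented; the function names accuracy and weighted_f1 are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = count - tp
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is at least `count`.
        f1 = 2 * tp / (2 * tp + fp + fn)
        total += f1 * count
    return total / len(y_true)


# Toy labels (0 = Negative, 1 = Neutral, 2 = Positive), invented for illustration.
gold = [0, 0, 1, 2, 2, 2]
pred = [0, 1, 1, 2, 2, 0]
print(f"accuracy={accuracy(gold, pred):.4f}  weighted_f1={weighted_f1(gold, pred):.4f}")
```

Weighting by support means that frequent classes dominate the score, which is why the weighted F1 tends to stay close to accuracy when the class distribution is skewed.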

Citation

@misc{absa_turkish_bert_based_small,
  title={Aspect-Based Sentiment Analysis for Turkish},
  author={Sengil},
  year={2024},
  url={https://huggingface.co/Sengil/ABSA_Turkish_BERT_Based_Small}
}

Model Card Contact

For any questions or issues, please open an issue in the repository or contact the author on LinkedIn.
