metadata
library_name: transformers
license: mit
base_model: dbmdz/bert-base-turkish-cased
tags:
- generated_from_trainer
- sentiment
- turkish
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: turkish-sentiment
results: []
datasets:
- winvoker/turkish-sentiment-analysis-dataset
language:
- tr
turkish-sentiment
This model is a fine-tuned version of dbmdz/bert-base-turkish-cased on the winvoker/turkish-sentiment-analysis-dataset dataset. It achieves the following results on the evaluation set:
- Loss: 0.0880
- Accuracy: 0.9688
- F1 Macro: 0.9454
- F1 Weighted: 0.9685
- Precision: 0.9683
- Recall: 0.9688
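The gap between F1 Macro (0.9454) and F1 Weighted (0.9685) is the pattern one would expect when classes are unevenly represented and a smaller class scores lower. The card does not say how the metrics were computed, so the following is only a minimal sketch of the two averaging schemes in plain Python. The toy labels are invented for illustration and are not drawn from the dataset.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, labels):
    """Compute F1 for each label from true/false positives and false negatives."""
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_and_weighted_f1(y_true, y_pred):
    """Return (macro F1, support-weighted F1) over the labels present in y_true."""
    labels = sorted(set(y_true))
    f1 = per_class_f1(y_true, y_pred, labels)
    support = Counter(y_true)
    # Macro: every class counts equally, regardless of how many samples it has.
    macro = sum(f1[label] for label in labels) / len(labels)
    # Weighted: each class counts in proportion to its number of true samples.
    weighted = sum(f1[label] * support[label] for label in labels) / len(y_true)
    return macro, weighted


# Hypothetical, imbalanced toy data: 8 Positive samples, 2 Negative samples.
y_true = ["Positive"] * 8 + ["Negative"] * 2
y_pred = ["Positive"] * 8 + ["Positive", "Negative"]

macro, weighted = macro_and_weighted_f1(y_true, y_pred)
print(f"macro={macro:.4f} weighted={weighted:.4f}")
```

In the toy data the minority class has the lower F1, so the macro average falls below the weighted one. The reported Recall equals the reported Accuracy, which is what a support-weighted recall always yields, so the Precision and Recall figures are plausibly weighted averages as well.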
Model description
A BERT-based model (dbmdz Turkish BERT) fine-tuned on a large-scale Turkish sentiment analysis dataset. It classifies Turkish text into three sentiment classes: Negative, Notr (Neutral), and Positive.
- Model type: BertForSequenceClassification
- Base model: dbmdz/bert-base-turkish-cased
- Language(s): Turkish
Intended uses & limitations
- Turkish text classification tasks involving sentiment analysis.
- Suitable for social media data, product reviews, or general-purpose sentiment detection in Turkish.
Usage
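# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="kaixkhazaki/turkish-sentiment")

# "The shipment arrived late and the product did not really meet my expectations."
print(pipe("Kargo geç geldi ve ürün beklentimi pek karşılamadı."))
# [{'label': 'Negative', 'score': 0.984860897064209}]

# "The food was delicious but the service was slow and the staff were indifferent; I could not quite tell how I felt."
print(pipe("Yemek lezzetliydi ancak servis yavaş ve çalışanlar ilgisizdi, pek anlayamadım nasıl hissettiğimi."))
# [{'label': 'Notr', 'score': 0.9881975054740906}]

# "It was a truly amazing experience, I wish I could stay here forever."
print(pipe("Gerçekten müthiş bir deneyimdi, keşke hep burda kalabilsem."))
# [{'label': 'Positive', 'score': 0.9942901134490967}]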
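The pipeline call above returns only the top label. When the full score distribution over Negative, Notr, and Positive is useful, for example to apply a confidence threshold, a sketch along these lines may help. The helper names `top_prediction` and `classify_batch` are hypothetical and not part of the model or the library; `top_k=None` is the text-classification pipeline option that requests the scores for every label.

```python
def top_prediction(scores):
    """Return (label, score) for the highest-scoring entry.

    `scores` is a list of {'label': ..., 'score': ...} dicts, the per-text
    shape the pipeline produces when called on a list with top_k=None.
    """
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]


def classify_batch(texts, model_id="kaixkhazaki/turkish-sentiment"):
    """Return, for each text, the scores of all labels (downloads the model)."""
    # Imported lazily so top_prediction stays usable without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-classification", model=model_id)
    # Passing a list gives one list of per-label score dicts per input text.
    return pipe(list(texts), top_k=None)


# Example usage (requires network access and the transformers package):
# texts = ["Kargo geç geldi ve ürün beklentimi pek karşılamadı."]
# for text, scores in zip(texts, classify_batch(texts)):
#     print(text, top_prediction(scores))
```

Keeping the selection logic in a separate function makes it easy to swap in a different rule, such as falling back to Notr when no score exceeds a chosen threshold.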
Training and evaluation data
Fine-tuned on a combined dataset with 440,679 training samples and 48,965 validation samples.
Training procedure
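Training ran on a single GPU for approximately 25 minutes (1,600 steps), drawing from the entire training set. At a train batch size of 64, those steps cover about 0.23 epochs, as the Epoch column in the training results shows.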
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 64
- eval_batch_size: 128
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 400
- training_steps: 1600
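To make the schedule implied by these settings concrete, here is a minimal sketch of a linear warmup followed by a half-cosine decay, using the learning rate, warmup steps, and total steps listed above. It assumes the common half-cycle cosine shape that decays to zero at the final step; it is an illustration of that shape, not the training code itself, and the function name `cosine_lr` is hypothetical.

```python
import math


def cosine_lr(step, base_lr=3e-5, warmup_steps=400, total_steps=1600):
    """Learning rate at `step` for linear warmup plus half-cosine decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half cosine: starts at base_lr and reaches 0 at total_steps.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 200, 400, 1000, 1600):
    print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions the rate peaks at 3e-05 at step 400, is halved midway through the decay at step 1000, and reaches zero at step 1600.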
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | F1 Weighted | Precision | Recall |
|---|---|---|---|---|---|---|---|---|
| 0.3538 | 0.0581 | 400 | 0.1162 | 0.9582 | 0.9243 | 0.9568 | 0.9572 | 0.9582 |
| 0.1131 | 0.1162 | 800 | 0.1034 | 0.9639 | 0.9369 | 0.9635 | 0.9633 | 0.9639 |
| 0.1026 | 0.1743 | 1200 | 0.0940 | 0.9649 | 0.9411 | 0.9652 | 0.9657 | 0.9649 |
| 0.0936 | 0.2324 | 1600 | 0.0880 | 0.9688 | 0.9454 | 0.9685 | 0.9683 | 0.9688 |
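The Epoch column can be reproduced from figures stated elsewhere in this card: each step consumes one batch of 64 samples out of 440,679 training samples. The short sketch below does that arithmetic. It assumes a single device and no gradient accumulation, which the card does not state explicitly but which matches the reported values.

```python
TRAIN_SAMPLES = 440_679   # training samples reported in this card
TRAIN_BATCH_SIZE = 64     # train_batch_size from the hyperparameters


def epoch_fraction(step, batch_size=TRAIN_BATCH_SIZE, num_samples=TRAIN_SAMPLES):
    """Fraction of one pass over the training set completed after `step` steps."""
    # Assumes one device and no gradient accumulation.
    return step * batch_size / num_samples


for step in (400, 800, 1200, 1600):
    print(step, round(epoch_fraction(step), 4))
# 400 0.0581
# 800 0.1162
# 1200 0.1743
# 1600 0.2324
```

By the final step the model has seen 102,400 samples, a little under a quarter of the training set.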
Framework versions
- Transformers 4.48.0.dev0
- Pytorch 2.4.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0
Citation
@misc{turkish-sentiment,
title={Turkish Sentiment Analysis using Turkish BERT},
author={Fatih Demrici},
year={2025},
howpublished={\url{https://huggingface.co/kaixkhazaki/turkish-sentiment}},
}