metadata

library_name: transformers
license: mit
base_model: dbmdz/bert-base-turkish-cased
tags:
  - generated_from_trainer
  - sentiment
  - turkish
metrics:
  - accuracy
  - precision
  - recall
  - f1
model-index:
  - name: turkish-sentiment
    results: []
datasets:
  - winvoker/turkish-sentiment-analysis-dataset
language:
  - tr

turkish-sentiment

This model is a fine-tuned version of dbmdz/bert-base-turkish-cased on the winvoker/turkish-sentiment-analysis-dataset dataset. It achieves the following results on the evaluation set:

Loss: 0.0880
Accuracy: 0.9688
F1 Macro: 0.9454
F1 Weighted: 0.9685
Precision: 0.9683
Recall: 0.9688
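
F1 Macro is lower than F1 Weighted in the results above. One common reason for such a gap is class imbalance: macro averaging gives every class the same weight, while weighted averaging scales each class by its support. The sketch below illustrates the two averages on hypothetical per-class counts. The counts are made up for illustration and are not taken from this model's evaluation.

```python
def f1(tp, fp, fn):
    """F1 score from true-positive, false-positive and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical per-class counts (tp, fp, fn); NOT from this model's evaluation.
counts = {
    "Negative": (80, 15, 20),
    "Notr": (900, 10, 10),
    "Positive": (450, 20, 15),
}

per_class = {label: f1(*c) for label, c in counts.items()}
# Support of a class = number of true examples of that class = tp + fn.
support = {label: tp + fn for label, (tp, fp, fn) in counts.items()}
total = sum(support.values())

# Macro: unweighted mean over classes. Weighted: mean weighted by support.
f1_macro = sum(per_class.values()) / len(per_class)
f1_weighted = sum(per_class[l] * support[l] for l in per_class) / total

print(f"macro={f1_macro:.4f} weighted={f1_weighted:.4f}")
```

In this toy setup the smallest class has the lowest per-class F1, so it pulls the macro average down more than the weighted one.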

Model description

A BERT-based model (dbmdz Turkish BERT) fine-tuned on a large-scale Turkish sentiment analysis dataset. It classifies Turkish text into three sentiment classes: Negative, Notr (Neutral), and Positive.

Model type: BertForSequenceClassification
Base model: dbmdz/bert-base-turkish-cased
Language(s): Turkish
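
Because the model type is BertForSequenceClassification, it can also be run without the pipeline helper. The following is a minimal sketch, assuming the checkpoint loads through the standard `AutoTokenizer` and `AutoModelForSequenceClassification` classes. The helper names `softmax`, `top_label` and `classify` are illustrative and not part of any library. The index-to-label order is read from the checkpoint configuration rather than assumed.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return id2label[idx], probs[idx]


def classify(text, model_id="kaixkhazaki/turkish-sentiment"):
    """Run one forward pass on `text` (downloads the model on first use)."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    # id2label comes from the checkpoint config, so no label order is assumed.
    return top_label(logits, model.config.id2label)
```

Calling `classify("Kargo geç geldi.")` would return a `(label, probability)` tuple with one of the three class names.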

Intended uses & limitations

Intended for Turkish text classification tasks involving sentiment analysis.
Suitable for social media data, product reviews, or general-purpose sentiment detection in Turkish.

Usage

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="kaixkhazaki/turkish-sentiment")


print(pipe("Kargo geç geldi ve ürün beklentimi pek karşılamadı."))
# [{'label': 'Negative', 'score': 0.984860897064209}]

print(pipe("Yemek lezzetliydi ancak servis yavaş ve çalışanlar ilgisizdi, pek anlayamadım nasıl hissettiğimi."))
# [{'label': 'Notr', 'score': 0.9881975054740906}]

print(pipe("Gerçekten müthiş bir deneyimdi, keşke hep burda kalabilsem."))
# [{'label': 'Positive', 'score': 0.9942901134490967}]

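For more than a handful of texts, the pipeline can be called on a list, and the returned dictionaries can be aggregated. The sketch below is one possible way to do this. The helper names `summarize` and `classify_batch`, the batch size of 32, and the 0.8 confidence threshold are illustrative assumptions, not values recommended by the model author.

```python
from collections import Counter


def summarize(results, min_score=0.8):
    """Count labels and collect indices of low-confidence predictions.

    `results` has the shape the pipeline returns for a list of inputs:
    [{'label': ..., 'score': ...}, ...]. `min_score` is an arbitrary,
    illustrative threshold.
    """
    counts = Counter(r["label"] for r in results)
    uncertain = [i for i, r in enumerate(results) if r["score"] < min_score]
    return counts, uncertain


def classify_batch(texts, batch_size=32):
    """Classify a list of texts (downloads the model on first use)."""
    from transformers import pipeline

    pipe = pipeline("text-classification", model="kaixkhazaki/turkish-sentiment")
    # truncation=True guards against inputs longer than the model's limit.
    return pipe(texts, batch_size=batch_size, truncation=True)
```

A typical call would be `counts, uncertain = summarize(classify_batch(texts))`, after which the texts at the `uncertain` indices can be reviewed by hand.
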
Training and evaluation data

Fine-tuned on a combined dataset with 440,679 training samples and 48,965 validation samples.

Training procedure

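Trained on a single GPU for approximately 25 minutes (1,600 steps), sampling from the full training set. At a train batch size of 64, this corresponds to about 0.23 epochs (see the Epoch column under Training results).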
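
The Epoch values reported under Training results follow from the step count, the train batch size, and the training set size. The short check below reproduces them, assuming a single device and no gradient accumulation (neither is listed among the hyperparameters).

```python
TRAIN_SAMPLES = 440_679  # training set size stated in this card
BATCH_SIZE = 64          # train_batch_size from the hyperparameters
# Assumption: one device and no gradient accumulation.


def epochs_at(step):
    """Fraction of the training set consumed after `step` optimizer steps."""
    return step * BATCH_SIZE / TRAIN_SAMPLES


for step in (400, 800, 1200, 1600):
    print(step, round(epochs_at(step), 4))
```

The rounded values match the Epoch column at steps 400, 800, 1200 and 1600, so the full run covers less than a quarter of one pass over the training data.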

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 64
eval_batch_size: 128
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 400
training_steps: 1600
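
As a rough illustration, the values above can be mapped onto `transformers.TrainingArguments`, and the learning-rate curve can be written out by hand. This is a sketch, not the author's training script. The `output_dir` value is hypothetical, and `lr_at` assumes the common formulation of linear warmup followed by a half-cosine decay to zero.

```python
import math

BASE_LR = 3e-05
WARMUP_STEPS = 400
TRAINING_STEPS = 1600

# Keyword arguments mirroring the listed hyperparameters.
training_kwargs = {
    "output_dir": "turkish-sentiment",  # hypothetical path
    "learning_rate": BASE_LR,
    "per_device_train_batch_size": 64,
    "per_device_eval_batch_size": 128,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "cosine",
    "warmup_steps": WARMUP_STEPS,
    "max_steps": TRAINING_STEPS,
}


def build_training_arguments():
    """Construct TrainingArguments from the kwargs (needs transformers + torch)."""
    from transformers import TrainingArguments

    return TrainingArguments(**training_kwargs)


def lr_at(step):
    """Learning rate under linear warmup followed by half-cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the rate climbs linearly to 3e-05 at step 400, the first logged checkpoint, and then decays toward zero by step 1600.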

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1 Macro	F1 Weighted	Precision	Recall
0.3538	0.0581	400	0.1162	0.9582	0.9243	0.9568	0.9572	0.9582
0.1131	0.1162	800	0.1034	0.9639	0.9369	0.9635	0.9633	0.9639
0.1026	0.1743	1200	0.0940	0.9649	0.9411	0.9652	0.9657	0.9649
0.0936	0.2324	1600	0.0880	0.9688	0.9454	0.9685	0.9683	0.9688

Framework versions

Transformers 4.48.0.dev0
Pytorch 2.4.1+cu121
Datasets 3.1.0
Tokenizers 0.21.0

Citation

@misc{turkish-sentiment,
  title={Turkish Sentiment Analysis using Turkish BERT},
  author={Fatih Demrici},
  year={2025},
  howpublished={\url{https://huggingface.co/kaixkhazaki/turkish-sentiment}},
}