**Train-Test Set:** "teknofest_train_final.csv"
**Model:** "dbmdz/bert-base-turkish-128k-uncased"
**Preprocessing**
- A special token (#) is inserted before each uppercase character, and the characters are then lowercased
- Punctuation marks are removed
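
The two preprocessing steps can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `preprocess`, the Turkish-aware handling of `I`/`İ`, the restriction to ASCII punctuation, and the choice to strip punctuation before inserting the marker (so the `#` marker itself is not removed) are all assumptions.

```python
import string

MARKER = "#"  # special token named in the model card

# Assumption: map the Turkish dotted/dotless capital I explicitly,
# because str.lower() alone does not follow Turkish casing rules.
TR_LOWER = str.maketrans({"I": "ı", "İ": "i"})

# Assumption: only ASCII punctuation is removed.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> str:
    """Strip punctuation, then mark and lowercase uppercase characters."""
    # Punctuation goes first so that the inserted markers survive.
    text = text.translate(PUNCT_TABLE)
    out = []
    for ch in text:
        if ch.isupper():
            out.append(MARKER + ch.translate(TR_LOWER).lower())
        else:
            out.append(ch)
    return "".join(out)


print(preprocess("Merhaba, Dünya!"))
```

Marking uppercase characters before lowercasing lets an uncased model keep some of the casing signal that would otherwise be lost.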
## Tokenizer Parameters
```
max_length=64
padding=True
truncation=True
```
## Training Parameters
- **Epoch:** 3
- **Learning Rate:** 7e-5
- **Batch-Size:** 64
- **Tokenizer Length:** 64
- **Loss:** BCE
- **Online Hard Example Mining:** On
- **Class-Weighting:** On (^0.3)
- **Early Stopping:** Off
- **Stratified Batch Sampling:** On
- **Gradient Accumulation:** Off
- **LR Scheduler:** Cosine-with-Warmup
- **Warmup Ratio:** 0.1
- **Weight Decay:** 0.01
- **LLRD:** 0.95
- **Label Smoothing:** 0.05
- **Gradient Clipping:** 1.0
- **MLM Pre-Training:** Off
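
Three of the settings above are easier to read as arithmetic. The sketch below is illustrative only and rests on assumptions the model card does not confirm: that `^0.3` means inverse class frequency raised to the power 0.3, that LLRD multiplies the learning rate by 0.95 for each encoder layer below the top one (12 layers for a BERT-base model), and that the schedule is linear warmup followed by cosine decay to zero. All function names are hypothetical, and the class counts are taken from the support column of the CV10 results.

```python
import math


def class_weights(counts, power=0.3):
    """Assumed form: (balanced inverse frequency) ** power."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {c: (total / (n_classes * k)) ** power for c, k in counts.items()}


def llrd_rates(base_lr=7e-5, decay=0.95, num_layers=12):
    """Per-layer learning rates, bottom layer first.

    Assumption: the top encoder layer keeps base_lr and each layer
    below it is scaled by one more factor of `decay`.
    """
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


def lr_factor(step, total_steps, warmup_ratio=0.1):
    """Multiplier on the base LR: linear warmup, then cosine decay."""
    warmup = int(total_steps * warmup_ratio)
    if step < warmup:
        return step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


# Class counts from the support column of the CV10 report.
counts = {"INSULT": 2393, "OTHER": 3528, "PROFANITY": 2376,
          "RACIST": 2033, "SEXIST": 2081}
weights = class_weights(counts)
rates = llrd_rates()

for name, w in weights.items():
    print(f"{name:10s} weight={w:.4f}")
print(f"bottom layer lr={rates[0]:.2e}, top layer lr={rates[-1]:.2e}")
```

An exponent below 1 softens the weighting, so the rarer classes are boosted without dominating the loss.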
## CV10 Results
```
              precision    recall  f1-score   support

      INSULT     0.9172    0.9260    0.9216      2393
       OTHER     0.9681    0.9646    0.9663      3528
   PROFANITY     0.9627    0.9571    0.9599      2376
      RACIST     0.9684    0.9651    0.9667      2033
      SEXIST     0.9618    0.9668    0.9643      2081

    accuracy                         0.9562     12411
   macro avg     0.9557    0.9559    0.9558     12411
weighted avg     0.9563    0.9562    0.9562     12411
```
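
As a quick consistency check, the macro and weighted F1 averages can be recomputed from the per-class rows. The snippet uses only the numbers in the report above; the variable names are illustrative.

```python
# (f1-score, support) per class, copied from the report above.
report = {
    "INSULT": (0.9216, 2393),
    "OTHER": (0.9663, 3528),
    "PROFANITY": (0.9599, 2376),
    "RACIST": (0.9667, 2033),
    "SEXIST": (0.9643, 2081),
}

total_support = sum(n for _, n in report.values())

# Macro average: unweighted mean over classes.
macro_f1 = sum(f1 for f1, _ in report.values()) / len(report)

# Weighted average: mean over classes weighted by support.
weighted_f1 = sum(f1 * n for f1, n in report.values()) / total_support

print(total_support, round(macro_f1, 4), round(weighted_f1, 4))
```

The recomputed values agree with the `macro avg` and `weighted avg` rows (0.9558 and 0.9562) and with the total support of 12411.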