Create README.md
README.md
**Train-Test Set:** "teknofest_train_final.csv"

**Model:** "dbmdz/bert-base-turkish-128k-uncased"

**Preprocessing**
- A special token (#) is inserted before each uppercase character, and the characters are then lowercased
- Punctuation marks are removed
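The two preprocessing steps above could look roughly like the sketch below. The function name `preprocess`, the use of ASCII `string.punctuation` as the punctuation set, the Turkish-aware handling of `I`/`İ`, and the choice to add no space after the marker are all assumptions; the README only states that `#` is placed before uppercase characters, that they are lowercased, and that punctuation is removed.

```python
import string

# Assumption: "punctuation" means the ASCII punctuation characters.
PUNCTUATION = set(string.punctuation)

# Turkish casing differs from Python's default for these two letters.
TURKISH_LOWER = {"I": "ı", "İ": "i"}


def preprocess(text: str, marker: str = "#") -> str:
    """Mark uppercase characters with `marker`, lowercase them, drop punctuation."""
    out = []
    for ch in text:
        if ch in PUNCTUATION:
            continue  # punctuation in the input is removed
        if ch.isupper():
            # Marker first, then the lowercased character.
            out.append(marker + TURKISH_LOWER.get(ch, ch.lower()))
        else:
            out.append(ch)
    return "".join(out)


print(preprocess("Merhaba Dünya!"))
```

Marking case before lowercasing lets an uncased model still see where capital letters were, which may carry signal in short, informal text.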

## Tokenizer Parameters
```
max_length=64
padding=True
truncation=True
```
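As a hedged illustration, the parameters above could be passed to a Hugging Face tokenizer as follows. The helper names `encode_batch` and `load_tokenizer` are hypothetical, and the README does not show the actual call, so treat this as one plausible way to apply the settings.

```python
# Values copied from the README; the helper functions are illustrative only.
MODEL_NAME = "dbmdz/bert-base-turkish-128k-uncased"
TOKENIZER_KWARGS = {"max_length": 64, "padding": True, "truncation": True}


def encode_batch(tokenizer, texts):
    """Tokenize a batch of strings with the README's tokenizer settings.

    `tokenizer` is any callable that follows the Hugging Face tokenizer
    call convention; passing it in keeps this function usable offline.
    """
    return tokenizer(list(texts), **TOKENIZER_KWARGS)


def load_tokenizer(name=MODEL_NAME):
    """Load the tokenizer (requires `transformers` and network access)."""
    from transformers import AutoTokenizer  # lazy import, only needed here

    return AutoTokenizer.from_pretrained(name)
```

A call such as `encode_batch(load_tokenizer(), ["örnek cümle"])` would return the encoded batch. With `padding=True` each batch is padded to its longest sequence, and `truncation=True` caps sequences at `max_length` tokens.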

## Training Parameters
- **Epochs:** 3
- **Learning Rate:** 7e-5
- **Batch Size:** 64
- **Tokenizer Length:** 64
- **Loss:** BCE
- **Online Hard Example Mining:** On
- **Class Weighting:** On (^0.3)
- **Early Stopping:** Off
- **Stratified Batch Sampling:** On
- **Gradient Accumulation:** Off
- **LR Scheduler:** Cosine-with-Warmup
- **Warmup Ratio:** 0.1
- **Weight Decay:** 0.01
- **LLRD:** 0.95
- **Label Smoothing:** 0.05
- **Gradient Clipping:** 1.0
- **MLM Pre-Training:** Off
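To make three of these settings concrete, here is a minimal sketch of a cosine-with-warmup multiplier, layer-wise learning rate decay (LLRD), and power-softened class weights. The README gives only the values (0.1, 0.95, ^0.3); the exact formulas, the 12-layer assumption for a BERT-base encoder, the mean-1 normalization, and every function name below are assumptions and not the author's confirmed implementation. The class counts are taken from the support column of the results in this README.

```python
import math

BASE_LR = 7e-5       # from the README
WARMUP_RATIO = 0.1   # from the README
LLRD = 0.95          # from the README
SUPPORT = {"INSULT": 2393, "OTHER": 3528, "PROFANITY": 2376,
           "RACIST": 2033, "SEXIST": 2081}


def cosine_with_warmup(step, total_steps, warmup_ratio=WARMUP_RATIO):
    """LR multiplier: linear warmup to 1.0, then half-cosine decay to 0.0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


def layerwise_lrs(num_layers=12, base_lr=BASE_LR, decay=LLRD):
    """Per-layer LRs, bottom to top: the top layer gets base_lr and
    each layer below it is scaled by one more factor of `decay`."""
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


def class_weights(counts, power=0.3):
    """Inverse-frequency weights raised to `power`, normalized to mean 1.

    Assumption: "(^0.3)" means the raw inverse frequency is softened
    with this exponent so rare classes are not over-weighted.
    """
    total = sum(counts.values())
    raw = {c: (total / n) ** power for c, n in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {c: w / mean for c, w in raw.items()}


if __name__ == "__main__":
    print([round(cosine_with_warmup(s, 100), 3) for s in (0, 5, 10, 55, 100)])
    print(["%.2e" % lr for lr in layerwise_lrs()])
    print({c: round(w, 3) for c, w in class_weights(SUPPORT).items()})
```

An exponent below 1 keeps the ordering of the inverse-frequency weights but compresses their range, which suits a dataset like this one where the class sizes are already fairly close.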

## CV10 Results
```
              precision    recall  f1-score   support

      INSULT     0.9172    0.9260    0.9216      2393
       OTHER     0.9681    0.9646    0.9663      3528
   PROFANITY     0.9627    0.9571    0.9599      2376
      RACIST     0.9684    0.9651    0.9667      2033
      SEXIST     0.9618    0.9668    0.9643      2081

    accuracy                         0.9562     12411
   macro avg     0.9557    0.9559    0.9558     12411
weighted avg     0.9563    0.9562    0.9562     12411
```
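The two average rows can be recomputed from the per-class rows, which is a quick way to see how they differ: the macro average weights every class equally, while the weighted average weights each class by its support. The sketch below uses only the F1 and support values from the report above; the names `REPORT`, `macro_f1`, and `weighted_f1` are illustrative.

```python
# Per-class (f1-score, support) pairs copied from the report above.
REPORT = {
    "INSULT": (0.9216, 2393),
    "OTHER": (0.9663, 3528),
    "PROFANITY": (0.9599, 2376),
    "RACIST": (0.9667, 2033),
    "SEXIST": (0.9643, 2081),
}


def macro_f1(report):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for f1, _ in report.values()) / len(report)


def weighted_f1(report):
    """Mean of the per-class F1 scores, weighted by class support."""
    total = sum(n for _, n in report.values())
    return sum(f1 * n for f1, n in report.values()) / total


print(f"macro F1:    {macro_f1(REPORT):.4f}")
print(f"weighted F1: {weighted_f1(REPORT):.4f}")
```

Both values agree with the reported rows to four decimal places, and the supports sum to the reported total of 12411.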