Update README.md
README.md
CHANGED
@@ -1,6 +1,68 @@
---
language: ru
tags:
- spam-detection
- text-classification
- russian
license: mit
datasets:
- RUSpam/spam_dataset_v4
metrics:
- f1
model-index:
- name: spam_deberta_v4
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: RUSpam/russian_spam_dataset
      type: RUSpam/russian_spam_dataset
    metrics:
    - name: F1
      type: f1
      value: 0.9897
---

# RUSpam/spam_deberta_v4

## Description

This is a spam detection model based on the DeBERTa architecture, fine-tuned on Russian-language spam data. It classifies a text as spam or not spam.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_path = "RUSpam/spam_deberta_v4"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)

def predict(text):
    # Tokenize the input, truncating it to at most 256 tokens
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    # Run inference without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    # Class index 1 is treated as spam, any other index as not spam
    predicted_class = torch.argmax(logits, dim=1).item()
    return "Spam" if predicted_class == 1 else "Not spam"

text = "Your text to check goes here"
result = predict(text)
print(f"Result: {result}")
```
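
The snippet above returns only a hard label. If a confidence score or a stricter decision rule is useful, the logits can be turned into a spam probability with a softmax. The following is a minimal sketch, not part of the model's API: the helper names `spam_probability` and `classify`, the `threshold` parameter, and the example logits are all hypothetical, and it assumes (as in the usage code) that index 1 is the spam class.

```python
import math

def spam_probability(logits):
    """Convert a [not_spam, spam] logit pair into a spam probability via softmax."""
    # Subtract the max logit for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    # Index 1 is assumed to be the spam class, matching the usage code
    return exps[1] / sum(exps)

def classify(logits, threshold=0.5):
    """Label as spam only when the spam probability exceeds the threshold."""
    return "Spam" if spam_probability(logits) > threshold else "Not spam"

# Made-up logits for illustration; with the real model they could come
# from outputs.logits[0].tolist()
example_logits = [-1.2, 2.3]
print(classify(example_logits))
print(classify(example_logits, threshold=0.99))
```

With the default threshold of 0.5 this matches the argmax rule for two classes. Raising the threshold trades missed spam for fewer false positives; any specific value would need to be validated on held-out data.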

## Citation

```bibtex
@MISC{RUSpam/spam_deberta_v4,
    author = {Denis Petrov and Kirill and Sergey Yalovegin},
    title = {Russian Spam Classification Model},
    url = {https://huggingface.co/RUSpam/spam_deberta_v4/},
    year = 2024
}
```