kartashoffv/vashkontrol-sentiment-rubert


kartashoffv committed on Aug 6, 2023

Commit ad3250c • 1 Parent(s): 5edd151

Update README.md (#2)

- Update README.md (ada69199b5fb054de9d56d354fe89a08bf3f16f0)

Files changed (1)

README.md +42 -11


@@ -19,30 +19,33 @@ widget:
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# vashkontrol-sentiment-rubert
-This model is a fine-tuned version of [DeepPavlov/rubert-base-cased](https://huggingface.co/DeepPavlov/rubert-base-cased) on an unknown dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.1085
 - F1: 0.9461
 ## Model description
-More information needed
-## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
-## Training procedure
 ### Training hyperparameters
@@ -71,4 +74,32 @@ The following hyperparameters were used during training:
 - Transformers 4.31.0
 - Pytorch 2.0.1+cu118
 - Datasets 2.14.1
-- Tokenizers 0.13.3

 ---
+# Sentiment analysis of reviews from the "VashKontrol" portal
+The model is designed to classify the sentiment of reviews from the [VashKontrol portal](https://vashkontrol.ru/).
+This model is a fine-tuned version of [DeepPavlov/rubert-base-cased](https://huggingface.co/DeepPavlov/rubert-base-cased) on the [kartashoffv/vash_kontrol_reviews](https://huggingface.co/datasets/kartashoffv/vash_kontrol_reviews) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.1085
 - F1: 0.9461
 ## Model description
+The model predicts a sentiment label (positive, neutral, negative) for a submitted text review.
 ## Training and evaluation data
+The model was trained on a corpus of reviews from the [VashKontrol portal](https://vashkontrol.ru/), submitted by users from 2020 through 2022.
+The corpus contains 17,385 reviews in total. The author labeled the sentiment manually, sorting the reviews into positive, neutral, and negative classes.
+The resulting class counts:
+0 (positive): 13045
+1 (neutral): 1196
+2 (negative): 3144
+Class weighting was used to compensate for the class imbalance.
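The README does not say which weighting scheme was used. As an illustration only, the sketch below applies the common "balanced" heuristic, weight = total / (n_classes * count), to the class counts listed above. The function name `balanced_class_weights` is hypothetical, and the choice of heuristic is an assumption rather than the author's documented method.

```python
# Class counts as reported in the README (label id -> number of reviews).
CLASS_COUNTS = {0: 13045, 1: 1196, 2: 3144}


def balanced_class_weights(counts):
    """Return total / (n_classes * count) for each class.

    This is the widely used "balanced" heuristic; it is an assumption here,
    since the README does not state the exact weighting scheme.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


weights = balanced_class_weights(CLASS_COUNTS)
for label, weight in sorted(weights.items()):
    # Rarer classes receive larger weights.
    print(f"class {label}: weight {weight:.3f}")
```

Weights of this kind are typically passed to a weighted loss, for example through the `weight` argument of `torch.nn.CrossEntropyLoss`, so that errors on the rare neutral class cost more than errors on the dominant positive class.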
 ### Training hyperparameters
 - Transformers 4.31.0
 - Pytorch 2.0.1+cu118
 - Datasets 2.14.1
+- Tokenizers 0.13.3
+### Usage
+```python
+import torch
+from transformers import AutoModelForSequenceClassification, BertTokenizerFast
+
+MODEL_NAME = 'kartashoffv/vashkontrol-sentiment-rubert'
+tokenizer = BertTokenizerFast.from_pretrained(MODEL_NAME)
+model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, return_dict=True)
+
+@torch.no_grad()  # gradients are not needed at inference time
+def predict(review):
+    # Tokenize a review (or a list of reviews), truncating to at most 512 tokens.
+    inputs = tokenizer(review, max_length=512, padding=True, truncation=True, return_tensors='pt')
+    outputs = model(**inputs)
+    # Convert logits to class probabilities, then pick the most likely class.
+    predicted = torch.nn.functional.softmax(outputs.logits, dim=1)
+    pred_label = torch.argmax(predicted, dim=1).numpy()
+    return pred_label  # NumPy array of label ids (0, 1, or 2)
+```
+### Labels
+```
+0: POSITIVE
+1: NEUTRAL
+2: NEGATIVE
+```
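The `predict` function in the Usage snippet returns numeric label ids. A minimal sketch of turning those ids into the names from the Labels table follows; `ID2LABEL` and `decode_labels` are hypothetical helpers introduced here for illustration and are not part of the README.

```python
# Mapping taken from the Labels section of the README.
ID2LABEL = {0: "POSITIVE", 1: "NEUTRAL", 2: "NEGATIVE"}


def decode_labels(pred_ids):
    """Map an iterable of label ids (ints or NumPy integers) to label names."""
    return [ID2LABEL[int(i)] for i in pred_ids]


print(decode_labels([0, 2, 1]))  # ['POSITIVE', 'NEGATIVE', 'NEUTRAL']
```

Because `decode_labels` casts each id with `int(...)`, it should also accept the NumPy array that `predict` returns, for example `decode_labels(predict(review))`.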