seara/rubert-tiny2-russian-sentiment

Text Classification

sentiment-analysis

multi-class-classification

sentiment analysis

seara committed on Aug 24, 2023

Commit dd9f2a3 · 1 Parent(s): bd71104

Create README.md

Files changed (1)

README.md +80 -0

	@@ -0,0 +1,80 @@

+---
+license: mit
+language:
+- ru
+metrics:
+- f1
+- roc_auc
+- precision
+- recall
+pipeline_tag: text-classification
+tags:
+- rubert
+- sentiment
+datasets:
+- sismetanin/rureviews
+- RuSentiment
+- LinisCrowd2015
+- LinisCrowd2016
+- KaggleRussianNews
+---
+This is a [RuBERT-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) model fine-tuned for __sentiment classification__ of short __Russian__ texts.
+The task is __multi-class classification__ with the following labels:
+```yaml
+0: neutral
+1: positive
+2: negative
+```
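The label mapping above can also be applied to raw model outputs by hand. The sketch below is illustrative only: the logits are made-up numbers, `ID2LABEL`, `softmax`, and `decode` are hypothetical helper names, and it assumes the three outputs of the classification head are turned into probabilities with a softmax, which is the usual convention for single-label classification but is not stated in this card.

```python
import math

# Label mapping copied from the model card.
ID2LABEL = {0: "neutral", 1: "positive", 2: "negative"}


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}


# Made-up logits in the order (neutral, positive, negative).
print(decode([0.1, 2.3, -1.0]))
```

The returned dictionary has the same `label`/`score` shape as the pipeline output shown in the Usage section.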
+## Usage
+```python
+from transformers import pipeline
+model = pipeline(model="seara/rubert-tiny2-russian-sentiment")
+model("Привет, ты мне нравишься!")
+# [{'label': 'positive', 'score': 0.9398769736289978}]
+```
+## Dataset
+This model was trained on the union of the following datasets:
+- Kaggle Russian News Dataset
+- Linis Crowd 2015
+- Linis Crowd 2016
+- RuReviews
+- RuSentiment
+An overview of the training data can be found in [S. Smetanin's GitHub repository](https://github.com/sismetanin/sentiment-analysis-in-russian).
+__Download links for all Russian sentiment datasets collected by Smetanin can be found in this [repository](https://github.com/searayeah/russian-sentiment-emotions-datasets).__
+## Training
+Training was done in this [project](https://github.com/searayeah/vkr-bert) with the following parameters:
+```yaml
+max_length: 512
+batch_size: 64
+optimizer: adam
+lr: 0.00001
+weight_decay: 0
+num_epochs: 5
+```
+Train/validation/test splits are 80%/10%/10%.
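The card does not say how the split was produced (for example, whether it was stratified or which seed was used). As a rough sketch of one way to obtain 80%/10%/10% partitions, the snippet below shuffles the data once and slices it; the helper name `split_80_10_10` and the `seed` parameter are assumptions made for illustration, not the author's code.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle once, then cut into train/validation/test parts."""
    items = list(items)
    random.Random(seed).shuffle(items)  # local RNG keeps the split reproducible
    n = len(items)
    n_train = n * 8 // 10  # integer arithmetic avoids float rounding
    n_val = n // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test split
    return train, val, test


train, val, test = split_80_10_10(range(1000))
print(len(train), len(val), len(test))  # 800 100 100
```

With imbalanced classes, a stratified split is often preferred so that each partition keeps the same label proportions.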
+## Eval results (on test split)
+|         |neutral|positive|negative|macro avg|weighted avg|
+|---------|-------|--------|--------|---------|------------|
+|precision|0.69   |0.83    |0.74    |0.75     |0.75        |
+|recall   |0.73   |0.82    |0.68    |0.75     |0.75        |
+|f1-score |0.71   |0.83    |0.71    |0.75     |0.75        |
+|support  |5196   |3831    |3599    |12626    |12626       |
+|auc-roc  |0.84   |0.95    |0.90    |0.90     |0.89        |
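To make the two averaging columns concrete, the sketch below recomputes them from the per-class values in the table: the macro average is the plain mean over classes, and the weighted average weights each class by its support. The helper names `macro_avg` and `weighted_avg` are hypothetical, and because the inputs are already rounded to two decimals, a recomputed figure can differ from the reported one by about 0.01.

```python
# Per-class values copied from the table, in the order
# (neutral, positive, negative).
support = [5196, 3831, 3599]
precision = [0.69, 0.83, 0.74]
auc_roc = [0.84, 0.95, 0.90]


def macro_avg(values):
    """Unweighted mean: every class counts equally."""
    return sum(values) / len(values)


def weighted_avg(values, weights):
    """Mean weighted by the number of test examples per class."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


print(round(macro_avg(precision), 2))              # 0.75
print(round(weighted_avg(precision, support), 2))  # 0.75
print(round(weighted_avg(auc_roc, support), 2))    # 0.89
```

The weighted average sits closer to the neutral-class score because neutral is the largest class in the test split.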