cointegrated
/

rubert-tiny2-cedr-emotion-detection

Text Classification

emotion-classification

Inference Endpoints

Model card Files Files and versions Community

rubert-tiny2-cedr-emotion-detection / README.md

cointegrated's picture

Update README.md

cd3543a about 3 years ago

|

history blame contribute delete

1.87 kB

	---
	language: ["ru"]
	tags:
	- russian
	- classification
	- sentiment
	- emotion-classification
	- multiclass
	datasets:
	- cedr
	widget:
	- text: "Бесишь меня, падла"
	- text: "Как здорово, что все мы здесь сегодня собрались"
	- text: "Как-то стрёмно, давай свалим отсюда?"
	- text: "Грусть-тоска меня съедает"
	- text: "Данный фрагмент текста не содержит абсолютно никаких эмоций"
	- text: "Нифига себе, неужели так тоже бывает!"

	---
	This is the [cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) model fine-tuned for classification of emotions in Russian sentences. The task is multilabel classification, because one sentence can contain multiple emotions.

	The model on the [CEDR dataset](https://huggingface.co/datasets/cedr) described in the paper ["Data-Driven Model for Emotion Detection in Russian Texts"](https://doi.org/10.1016/j.procs.2021.06.075) by Sboev et al.

	The model has been trained with Adam optimizer for 40 epochs with learning rate `1e-5` and batch size 64 [in this notebook](https://colab.research.google.com/drive/1AFW70EJaBn7KZKRClDIdDUpbD46cEsat?usp=sharing).

	The quality of the predicted probabilities on the test dataset is the following:

	\| label \| no emotion \| joy \|sadness \|surprise\| fear \|anger \| mean \| mean (emotions) \|
	\|----------\|------------\|--------\|--------\|--------\|--------\|--------\| --------\| ----------------\|
	\| AUC \| 0.9286 \| 0.9512 \| 0.9564 \| 0.8908 \| 0.8955 \| 0.7511 \| 0.8956 \| 0.8890 \|
	\| F1 micro \| 0.8624 \| 0.9389 \| 0.9362 \| 0.9469 \| 0.9575 \| 0.9261 \| 0.9280 \| 0.9411 \|
	\| F1 macro \| 0.8562 \| 0.8962 \| 0.9017 \| 0.8366 \| 0.8359 \| 0.6820 \| 0.8348 \| 0.8305 \|