Model Card
This model predicts whether two messages from a group chat with many members can be linked by a reply_to relationship.
Training details
The model is based on Conversational RuBERT (cased, 12-layer, 768-hidden, 12-heads, 180M parameters), which was trained on several social media datasets. We fine-tuned it on data from several Telegram chats. The positive reply_to examples were obtained from natural user annotation. The negative examples were obtained by shuffling the messages.
The task maps directly onto Next Sentence Prediction (NSP), so the model was fine-tuned with the NSP objective.
It achieves an F1 score of 0.83 on the gold test set of our reply recovery dataset.
See the paper for more details.
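The pair construction described in the training details can be sketched roughly as follows. This is only an illustration, not the authors' pipeline (see the paper for that): the `build_pairs` helper, the message fields `id`, `text` and `reply_to`, and the choice of one negative per positive are all assumptions made for the example.

```python
import random


def build_pairs(messages, seed=0):
    """Build (first, second, label) triples from a chat export.

    Hypothetical sketch: `messages` is a list of dicts with the assumed
    keys "id", "text" and "reply_to" (id of the parent message, or None).
    Label 0 marks a real reply pair and label 1 a shuffled (negative)
    pair, following the NSP convention.
    """
    rng = random.Random(seed)
    by_id = {m["id"]: m for m in messages}
    pairs = []
    for msg in messages:
        parent = by_id.get(msg["reply_to"])
        if parent is None:
            continue  # not a reply, or the parent is missing from the export
        # Positive example: the reply relation annotated by the user.
        pairs.append((parent["text"], msg["text"], 0))
        # Negative example: the same parent with a random message that is
        # neither the parent itself nor one of its real replies.
        candidates = [
            m for m in messages
            if m["id"] != parent["id"] and m["reply_to"] != parent["id"]
        ]
        if candidates:
            pairs.append((parent["text"], rng.choice(candidates)["text"], 1))
    return pairs
```

Each triple can then be tokenized as a sentence pair and used with the NSP loss. Excluding real replies from the negative candidates is a design choice of this sketch that avoids mislabeling true reply pairs as negatives.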
Usage
Note: a pair of messages that has a reply_to relationship gets label 0, and an unrelated pair gets label 1. This follows the NSP formulation, where label 0 means that the second sequence continues the first.
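from transformers import AutoTokenizer, BertForNextSentencePrediction

# Load the tokenizer and the fine-tuned NSP model from the same checkpoint.
tokenizer = AutoTokenizer.from_pretrained("astromis/rubert_reply_recovery")
model = BertForNextSentencePrediction.from_pretrained("astromis/rubert_reply_recovery")

# Parent candidates: "Where can I get a SNILS?", "I have been here for many years"
first_messages = ["Где можно получить СНИЛС?", "Я тут уже много лет"]
# Reply candidates: "You can at the MFC", "Where should I send this letter?"
second_messages = ["Можете в МФЦ", "Куда отправить это письмо?"]
inputs = tokenizer(first_messages, second_messages, return_tensors="pt",
                   truncation=True, max_length=512, padding="max_length")
output = model(**inputs)
# Label 0 means "reply", label 1 means "not a reply".
print(output.logits.argmax(dim=1))
# tensor([0, 1])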
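If a confidence score is more useful than a hard label, the two logits per pair can be turned into a reply probability with a softmax. The sketch below is an assumption-laden illustration rather than part of the released code: `reply_probabilities`, `is_reply` and the 0.5 threshold are hypothetical, and the input is expected to be a plain list such as `output.logits.tolist()`.

```python
import math


def reply_probabilities(logits):
    """Convert NSP logits into P(reply) for each message pair.

    `logits` is a list of [reply_logit, no_reply_logit] rows, for example
    output.logits.tolist(). Index 0 is the reply class (NSP convention).
    """
    probs = []
    for reply_logit, no_reply_logit in logits:
        # Subtract the larger logit before exponentiating for stability.
        top = max(reply_logit, no_reply_logit)
        e_reply = math.exp(reply_logit - top)
        e_no_reply = math.exp(no_reply_logit - top)
        probs.append(e_reply / (e_reply + e_no_reply))
    return probs


def is_reply(logits, threshold=0.5):
    """Return True for every pair whose P(reply) reaches the threshold."""
    return [p >= threshold for p in reply_probabilities(logits)]


# Made-up logits for two pairs, used only to show the call.
example_logits = [[2.0, -1.0], [-3.0, 1.0]]
print(is_reply(example_logits))
```

With the default threshold this agrees with the `argmax` decision. The threshold is exposed as a parameter in case precision and recall need to be traded off differently.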
Citation
@article{Buyanov2023WhoIA,
title={Who is answering to whom? Modeling reply-to relationships in Russian asynchronous chats},
author={Igor Buyanov and Darya Yaskova and Ilya Sochenkov},
journal={Computational Linguistics and Intellectual Technologies},
year={2023}
}