Danish BERT fine-tuned for Sentiment Analysis with senda
This model detects polarity ('positive', 'neutral', 'negative') of Danish texts.
It is trained and tested on Tweets annotated by the Alexandra Institute. The model is trained with the senda package.
Here is an example of how to load the model in PyTorch using the 🤗Transformers library:
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
tokenizer = AutoTokenizer.from_pretrained("pin/senda")
model = AutoModelForSequenceClassification.from_pretrained("pin/senda")
# create 'senda' sentiment analysis pipeline
senda_pipeline = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)
text = "Sikke en dejlig dag det er i dag"
# in English: 'What a lovely day it is today'
senda_pipeline(text)
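A Transformers text-classification pipeline returns a list of dictionaries, each with a 'label' and a 'score' key. The sketch below shows one possible way to post-process such a result. The helper name top_label, the 0.5 threshold, the 'uncertain' fallback and the example values are all assumptions made for illustration; they are not part of senda, and the label strings the model actually emits may differ.

```python
from typing import Any, Dict, List


def top_label(predictions: List[Dict[str, Any]], min_score: float = 0.5) -> str:
    """Return the highest-scoring label, or 'uncertain' if its score is below min_score.

    Hypothetical helper: `predictions` is expected to have the shape
    [{'label': str, 'score': float}, ...] that a pipeline call returns.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else "uncertain"


# Made-up output with the same shape as senda_pipeline(text);
# the label string and score here are illustrative, not real model output.
example_output = [{"label": "positive", "score": 0.98}]
print(top_label(example_output))
```

In practice the list passed to top_label would be the return value of senda_pipeline(text), and the threshold would be tuned on held-out data.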
Performance
The senda model achieves an accuracy of 0.77 and a macro-averaged F1-score of 0.73 on a small test data set provided by the Alexandra Institute. The model can most certainly be improved, and we encourage all NLP enthusiasts to give it their best shot. You can use the senda package to do this.
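For readers who want to compare their own results against the figures above, here is a minimal sketch of how accuracy and macro-averaged F1 can be computed for three classes. It is not the evaluation code used for senda. The function names, the LABELS tuple and the toy labels are assumptions for illustration only, and the numbers it prints have nothing to do with the reported 0.77 and 0.73.

```python
from typing import Sequence

# Assumed label names, taken from the polarity classes described above.
LABELS = ("positive", "neutral", "negative")


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str] = LABELS) -> float:
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted examples contributes 0.0
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy data, made up for this example
gold = ["positive", "positive", "neutral", "negative"]
pred = ["positive", "neutral", "neutral", "negative"]
print(accuracy(gold, pred))   # 0.75
print(macro_f1(gold, pred))
```

Macro averaging weights each class equally regardless of how many examples it has, which is why it can sit below accuracy when one class is harder than the others.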
Contact
Feel free to contact the author, Lars Kjeldgaard, at [email protected].
Shout-outs
Props to Malte Højmark-Berthelsen for pretraining Danish BERT and for helping add a TensorFlow backend to senda.