Jean-Baptiste
/

roberta-large-financial-news-sentiment-en

Text Classification

Inference Endpoints

Model card Files Files and versions Community

Jean-Baptiste commited on Dec 29, 2022

Commit

e61d990

·

1 Parent(s): 30e27d7

Update README.md

Files changed (1) hide show

README.md +60 -1

README.md CHANGED Viewed

@@ -9,4 +9,63 @@ widget:
 - text: "Melcor REIT (TSX: MR.UN) today announced results for the third quarter ended September 30, 2022. Revenue was stable in the quarter and year-to-date. Net operating income was down 3% in the quarter at $11.61 million due to the timing of operating expenses and inflated costs including utilities like gas/heat and power"
 license: mit
----

 - text: "Melcor REIT (TSX: MR.UN) today announced results for the third quarter ended September 30, 2022. Revenue was stable in the quarter and year-to-date. Net operating income was down 3% in the quarter at $11.61 million due to the timing of operating expenses and inflated costs including utilities like gas/heat and power"
 license: mit
+---
+# Model fine-tuned from roberta-large for sentiment classification of financial news (emphasis on Canadian news).
+### Introduction
+This model was train on financial_news_sentiment_mixte_with_phrasebank_75 dataset.
+This is a customized version of the phrasebank dataset in which I kept only sentence validated by at least 75% annotators.
+In addition I added ~2000 articles validated manually on Canadian financial news. Therefore the model is more specifically trained for Canadian news.
+Final result is f1 score of 93.25% overall and 83.6% on Canadian news.
+### Training data
+Training data was classified as follow:
+class |Description
+-|-
+0 |negative
+1 |neutral
+2 |positive
+### How to use roberta-large-financial-news-sentiment-en with HuggingFace
+##### Load roberta-large-financial-news-sentiment-en and its sub-word tokenizer :
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/roberta-large-financial-news-sentiment-en")
+model = AutoModelForSequenceClassification.from_pretrained("Jean-Baptiste/roberta-large-financial-news-sentiment-en")
+##### Process text sample (from wikipedia)
+from transformers import pipeline
+pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
+pipe("Melcor REIT (TSX: MR.UN) today announced results for the third quarter ended September 30, 2022. Revenue was stable in the quarter and year-to-date. Net operating income was down 3% in the quarter at $11.61 million due to the timing of operating expenses and inflated costs including utilities like gas/heat and power")
+[{'label': 'negative', 'score': 0.9399105906486511}]
+```
+### Model performances
+Overall f1 score (average macro)
+precision|recall|f1
+-|-|-
+0.9355|0.9299|0.9325
+By entity
+entity|precision|recall|f1
+-|-|-|-
+negative|0.9605|0.9240|0.9419
+neutral|0.9538|0.9459|0.9498
+positive|0.8922|0.9200|0.9059