dstefa committed on
Commit
110ad28
1 Parent(s): 3102e25

Update README.md

  1. README.md +71 -8
README.md CHANGED
@@ -2,23 +2,43 @@
2
  license: mit
3
  base_model: roberta-base
4
  tags:
5
- - generated_from_trainer
6
  metrics:
7
  - accuracy
8
  - f1
9
  - precision
10
  - recall
11
  model-index:
12
  - name: roberta-base_topic_classification_nyt_news
13
- results: []
14
  ---
15
 
16
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
17
- should probably proofread and complete it, then remove this comment. -->
18
-
19
  # roberta-base_topic_classification_nyt_news
20
 
21
- This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the None dataset.
22
  It achieves the following results on the evaluation set:
23
  - Loss: 0.3797
24
  - Accuracy: 0.9094
@@ -34,9 +54,19 @@ More information needed
34
 
35
  More information needed
36
 
37
- ## Training and evaluation data
 
38
 
39
- More information needed
40
 
41
  ## Training procedure
42
 
@@ -62,6 +92,39 @@ The following hyperparameters were used during training:
62
  | 0.1239 | 4.0 | 81920 | 0.3981 | 0.9117 | 0.9113 | 0.9114 | 0.9117 |
63
  | 0.1472 | 5.0 | 102400 | 0.4033 | 0.9137 | 0.9135 | 0.9134 | 0.9137 |
64
 
65
 
66
  ### Framework versions
67
 
 
2
  license: mit
3
  base_model: roberta-base
4
  tags:
5
+ - topic
6
+ - classification
7
+ - news
8
+ - roberta
9
  metrics:
10
  - accuracy
11
  - f1
12
  - precision
13
  - recall
14
+ datasets:
15
+ - dstefa/New_York_Times_Topics
16
+ widget:
17
+ - text: >-
18
+ Olympic champion Kostas Kederis today left hospital ahead of his date with IOC inquisitors claiming his innocence and vowing.
19
+ example_title: Analyst Update
20
  model-index:
21
  - name: roberta-base_topic_classification_nyt_news
22
+ results:
23
+ - task:
24
+ name: Text Classification
25
+ type: text-classification
26
+ dataset:
27
+ name: New_York_Times_Topics
28
+ type: News
29
+ metrics:
30
+ - type: F1
31
+ name: F1
32
+ value: 0.910647
33
+ - type: accuracy
34
+ name: accuracy
35
+ value: 0.910615
36
+ pipeline_tag: text-classification
37
  ---
38
 
 
 
 
39
  # roberta-base_topic_classification_nyt_news
40
 
41
+ This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the [NYT News dataset](https://www.kaggle.com/datasets/aryansingh0909/nyt-articles-21m-2000-present).
42
  It achieves the following results on the evaluation set:
43
  - Loss: 0.3797
44
  - Accuracy: 0.9094
 
54
 
55
  More information needed
56
 
57
+ ## Training data
58
+ The training data was classified as follows:
59
 
60
+ class |Description
61
+ -|-
62
+ 0 |Sports
63
+ 1 |Arts, Culture, and Entertainment
64
+ 2 |Business and Finance
65
+ 3 |Health and Wellness
66
+ 4 |Lifestyle and Fashion
67
+ 5 |Science and Technology
68
+ 6 |Politics
69
+ 7 |Crime
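
The class table above can be written as a lookup, which helps when working with integer predictions instead of the pipeline's string labels. This is a minimal sketch: `ID2LABEL`, `LABEL2ID`, and `label_for` are hypothetical names, and it assumes the integer IDs in the table match the model's output indices, which this commit does not confirm.

```python
# Hypothetical lookup built from the class table above.
# Assumption: these IDs correspond to the model's output indices.
ID2LABEL = {
    0: "Sports",
    1: "Arts, Culture, and Entertainment",
    2: "Business and Finance",
    3: "Health and Wellness",
    4: "Lifestyle and Fashion",
    5: "Science and Technology",
    6: "Politics",
    7: "Crime",
}

# Reverse mapping, from topic name back to class ID.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the topic name for an integer class ID."""
    try:
        return ID2LABEL[class_id]
    except KeyError:
        raise ValueError(f"unknown class id: {class_id}") from None


print(label_for(6))
```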
70
 
71
  ## Training procedure
72
 
 
92
  | 0.1239 | 4.0 | 81920 | 0.3981 | 0.9117 | 0.9113 | 0.9114 | 0.9117 |
93
  | 0.1472 | 5.0 | 102400 | 0.4033 | 0.9137 | 0.9135 | 0.9134 | 0.9137 |
94
 
95
+ ### Model performance
96
+
97
+ Class|precision|recall|f1|support
98
+ -|-|-|-|-
99
+ Sports|0.97|0.98|0.97|6400
100
+ Arts, Culture, and Entertainment|0.94|0.95|0.94|6400
101
+ Business and Finance|0.85|0.84|0.84|6400
102
+ Health and Wellness|0.90|0.93|0.91|6400
103
+ Lifestyle and Fashion|0.95|0.95|0.95|6400
104
+ Science and Technology|0.89|0.83|0.86|6400
105
+ Politics|0.93|0.88|0.90|6400
106
+ Crime|0.85|0.93|0.89|6400
107
+ | | | |
108
+ accuracy|||0.91|51200
109
+ macro avg|0.91|0.91|0.91|51200
110
+ weighted avg|0.91|0.91|0.91|51200
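
As a sanity check on the summary rows, the macro and weighted averages can be recomputed from the per-class rows. The sketch below copies the rounded values from the table; `REPORT`, `macro_avg`, and `weighted_avg` are hypothetical names introduced for illustration. Because every class has the same support (6400), the weighted average equals the macro average.

```python
from statistics import mean

# Per-class rows copied from the table above: (precision, recall, f1, support).
REPORT = {
    "Sports": (0.97, 0.98, 0.97, 6400),
    "Arts, Culture, and Entertainment": (0.94, 0.95, 0.94, 6400),
    "Business and Finance": (0.85, 0.84, 0.84, 6400),
    "Health and Wellness": (0.90, 0.93, 0.91, 6400),
    "Lifestyle and Fashion": (0.95, 0.95, 0.95, 6400),
    "Science and Technology": (0.89, 0.83, 0.86, 6400),
    "Politics": (0.93, 0.88, 0.90, 6400),
    "Crime": (0.85, 0.93, 0.89, 6400),
}

SUPPORT = 3  # index of the support column in each row


def macro_avg(column: int) -> float:
    """Unweighted mean of one metric column across classes."""
    return mean(row[column] for row in REPORT.values())


def weighted_avg(column: int) -> float:
    """Mean of one metric column, weighted by per-class support."""
    total = sum(row[SUPPORT] for row in REPORT.values())
    return sum(row[column] * row[SUPPORT] for row in REPORT.values()) / total


total_support = sum(row[SUPPORT] for row in REPORT.values())
print("total support:", total_support)
for name, col in (("precision", 0), ("recall", 1), ("f1", 2)):
    print(f"{name}: macro={macro_avg(col):.4f} weighted={weighted_avg(col):.4f}")
```

All three averages come out close to 0.91, in line with the summary rows, and the supports sum to 51200.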
111
+
112
+ ### How to use roberta-base_topic_classification_nyt_news with Hugging Face Transformers
113
+
114
+ ```python
115
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
116
+ from transformers import pipeline
117
+
118
+ tokenizer = AutoTokenizer.from_pretrained("dstefa/roberta-base_topic_classification_nyt_news")
119
+ model = AutoModelForSequenceClassification.from_pretrained("dstefa/roberta-base_topic_classification_nyt_news")
120
+ pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
121
+
122
+ text = "Kederis proclaims innocence Olympic champion Kostas Kederis today left hospital ahead of his date with IOC inquisitors claiming his innocence and vowing."
123
+ print(pipe(text))
124
+
125
+ # Output: [{'label': 'Sports', 'score': 0.9989326596260071}]
126
+
127
+ ```
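
The pipeline returns a list of `{'label', 'score'}` dictionaries, as in the usage example. A caller may want to accept a prediction only when its score is high enough. The helper below is a minimal sketch of that idea: `confident_label` is a hypothetical name, and the 0.5 default threshold is an arbitrary illustrative value, not a recommendation from the model author.

```python
from typing import Optional


def confident_label(predictions: list, threshold: float = 0.5) -> Optional[str]:
    """Return the highest-scoring label, or None if its score is below threshold.

    `predictions` is a list of dicts with 'label' and 'score' keys,
    the shape shown in the usage example.
    """
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= threshold else None


# Sample copied from the usage example's output.
sample = [{"label": "Sports", "score": 0.9989326596260071}]
print(confident_label(sample))
```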
128
 
129
  ### Framework versions
130