emilys committed on
Commit
18870d6
1 Parent(s): 08af08f

Create README.md

Files changed (1)
  1. README.md +66 -0
README.md ADDED
@@ -0,0 +1,66 @@
+ ---
+ license: cc-by-4.0
+ language:
+ - en
+ pipeline_tag: text-classification
+ tags:
+ - RoBERTa-large
+ - topic
+ - news
+ ---
+
+ # Fine-tuned RoBERTa-large for detecting news on government regulation
+
+ # Model Description
+
+ This model is a fine-tuned RoBERTa-large that classifies whether a news article is about government regulation.
+
+ # How to Use
+
+ ```python
+ from transformers import pipeline
+
+ # Load the fine-tuned classifier from the Hugging Face Hub
+ classifier = pipeline("text-classification", model="dell-research-harvard/topic-govt_regulation")
+ # Returns a list with one dict holding the predicted label and its score
+ print(classifier("Senate passes gun control bill"))
+ ```
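The single-headline call above extends naturally to a batch of texts. The following is a minimal sketch, not part of the original README: `load_classifier`, `classify_headlines`, and the `min_score` threshold are hypothetical names introduced for illustration. It relies only on the documented behaviour that a `text-classification` pipeline called with a list of strings returns one `{"label": ..., "score": ...}` dict per input. The label strings this particular model emits are not stated in the model card, so the sketch passes them through unchanged.

```python
from typing import Any, Callable, Dict, List


def load_classifier():
    """Build the pipeline named in the model card (downloads weights on first call)."""
    # Imported lazily so classify_headlines can be used with any compatible callable.
    from transformers import pipeline

    return pipeline(
        "text-classification",
        model="dell-research-harvard/topic-govt_regulation",
    )


def classify_headlines(
    classifier: Callable[[List[str]], List[Dict[str, Any]]],
    texts: List[str],
    min_score: float = 0.5,  # hypothetical confidence threshold, not from the model card
) -> List[Dict[str, Any]]:
    """Pair each text with its predicted label and flag low-confidence predictions."""
    predictions = classifier(texts)
    rows = []
    for text, pred in zip(texts, predictions):
        rows.append(
            {
                "text": text,
                "label": pred["label"],
                "score": pred["score"],
                "confident": pred["score"] >= min_score,
            }
        )
    return rows
```

A call might look like `classify_headlines(load_classifier(), ["Senate passes gun control bill"])`. Keeping the model loading separate from the batching helper means the helper can be exercised with a stand-in callable before any weights are downloaded.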
+
+ # Training data
+
+ The model was trained on a hand-labelled sample of data from the [NEWSWIRE dataset](https://huggingface.co/datasets/dell-research-harvard/newswire).
+
+ Split|Size
+ -|-
+ Train|612
+ Dev|131
+ Test|131
+
+ # Test set results
+
+ Metric|Result
+ -|-
+ F1|0.8750
+ Accuracy|0.9237
+ Precision|0.7955
+ Recall|0.9722
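As a sanity check on how the four metrics relate, the sketch below recomputes them from confusion-matrix counts. The counts (35 true positives, 9 false positives, 1 false negative, 86 true negatives) are not published in the README: they are an assumption, inferred as one set of counts that sums to the 131 test examples and reproduces the reported figures to four decimal places. The function name `binary_metrics` is likewise hypothetical.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"f1": f1, "accuracy": accuracy, "precision": precision, "recall": recall}


# Assumed counts, inferred from the reported metrics and the test set size of 131;
# they are not taken from the authors.
metrics = binary_metrics(tp=35, fp=9, fn=1, tn=86)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts, the high recall and lower precision would mean the classifier misses very few regulation articles but flags some unrelated ones.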
+
+
+ # Citation Information
+
+ To cite the NEWSWIRE dataset, from which this model's training data was sampled, use:
+
+ ```
+ @misc{silcock2024newswirelargescalestructureddatabase,
+ title={Newswire: A Large-Scale Structured Database of a Century of Historical News},
+ author={Emily Silcock and Abhishek Arora and Luca D'Amico-Wong and Melissa Dell},
+ year={2024},
+ eprint={2406.09490},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2406.09490},
+ }
+ ```
+
+ # Applications
+
+ We applied this model to a century of historical news articles. You can see all the classifications in the [NEWSWIRE dataset](https://huggingface.co/datasets/dell-research-harvard/newswire).
+
+