---
language: en

license: mit

widget:
- text: "Paris is the <mask> of France."
  example_title: "Capital"
- text: "The goal of life is <mask>."
  example_title: "Philosophy"
---

# roberta-gen-news

## Model Description
The model is [roberta-base](https://huggingface.co/roberta-base) fine-tuned for masked language modeling (fill-mask) on English news headlines and subheadings.

## Training Data
The model's training data consists of almost 13,000,000 English articles from ~90 outlets, each consisting of a headline (title) and a subheading (description). The articles were collected from the [Sciride News Mine](http://sciride.org/news.html), after which some additional cleaning was performed, such as removing duplicate articles and stripping repeated "outlet tags" appearing before or after headlines, such as "| Daily Mail Online".
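
The cleaning scripts themselves are not part of this repository; the snippet below is only a rough sketch of the kind of de-duplication and outlet-tag stripping described above. The tag list, field names, and helper functions are illustrative assumptions, not the actual pipeline.

```python
import re

# Illustrative outlet tags; the real cleaning covered tags from ~90 outlets.
OUTLET_TAGS = ["| Daily Mail Online", "- BBC News"]

def strip_outlet_tags(headline: str) -> str:
    """Remove known outlet tags appearing before or after a headline."""
    for tag in OUTLET_TAGS:
        headline = headline.replace(tag, "")
    return re.sub(r"\s+", " ", headline).strip()

def clean_articles(articles):
    """Strip outlet tags and drop duplicate (title, description) pairs."""
    seen, cleaned = set(), []
    for article in articles:  # each article: {"title": ..., "description": ...}
        title = strip_outlet_tags(article["title"])
        key = (title, article["description"])
        if key not in seen:
            seen.add(key)
            cleaned.append({"title": title, "description": article["description"]})
    return cleaned
```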

The cleaned dataset can be found on the Hugging Face Hub [here](https://huggingface.co/datasets/AndyReas/frontpage-news). roberta-gen-news was trained on a large subset (12,928,029 of the 13,219,867 articles) of the linked dataset, after repacking the data to avoid abrupt truncation of examples.
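
If you want to inspect the data yourself, the linked dataset should be loadable with the `datasets` library; the split name used below is an assumption, not something documented on this card.

```python
from datasets import load_dataset

# Pull the cleaned frontpage-news dataset from the Hub (split name assumed).
news = load_dataset("AndyReas/frontpage-news", split="train")

print(len(news))  # number of articles
print(news[0])    # one article, e.g. its headline and subheading fields
```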

## How to use
The model can be used with the Hugging Face `fill-mask` pipeline like so:
```python
>>> from transformers import pipeline
>>> unmasker = pipeline('fill-mask', model='andyreas/roberta-gen-news')
>>> print(unmasker("The weather forecast for <mask> is rain.", top_k=5))

[{'score': 0.06107175350189209,
  'token': 1083,
  'token_str': ' Friday',
  'sequence': 'The weather forecast for Friday is rain.'},
 {'score': 0.04649643227458,
  'token': 1359,
  'token_str': ' Saturday',
  'sequence': 'The weather forecast for Saturday is rain.'},
 {'score': 0.04370906576514244,
  'token': 1772,
  'token_str': ' weekend',
  'sequence': 'The weather forecast for weekend is rain.'},
 {'score': 0.04101456701755524,
  'token': 1133,
  'token_str': ' Wednesday',
  'sequence': 'The weather forecast for Wednesday is rain.'},
 {'score': 0.03785591572523117,
  'token': 1234,
  'token_str': ' Sunday',
  'sequence': 'The weather forecast for Sunday is rain.'}]
```
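
The pipeline is the simplest route, but the checkpoint can also be loaded with the standard `transformers` auto classes. The sketch below is illustrative; the hand-written top-k post-processing is not part of the original card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("andyreas/roberta-gen-news")
model = AutoModelForMaskedLM.from_pretrained("andyreas/roberta-gen-news")

text = "The weather forecast for <mask> is rain."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the <mask> position and take its five highest-scoring tokens.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_positions[0]].topk(5).indices
print([tokenizer.decode(i).strip() for i in top_ids])
```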

## Training
Training ran for 1 epoch using a learning rate of 2e-6 and 50K warm-up steps out of ~800K total steps.
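
The training code itself is not published with this card; the sketch below only shows how a comparable masked-language-modeling run could be configured with the `transformers` `Trainer`, using the hyperparameters quoted above. The batch size, 15% masking probability, truncation length, and the `title` column name are assumptions, not values taken from the card.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# Tokenize the news articles; "title" is an assumed column name.
raw = load_dataset("AndyReas/frontpage-news", split="train")

def tokenize(batch):
    return tokenizer(batch["title"], truncation=True, max_length=128)

tokenized = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

# Dynamic masking for the MLM objective (15% is the usual RoBERTa rate, assumed here).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="roberta-gen-news",
    num_train_epochs=1,              # 1 epoch, as stated above
    learning_rate=2e-6,              # learning rate from the card
    warmup_steps=50_000,             # 50K warm-up steps out of ~800K total
    per_device_train_batch_size=16,  # assumed, not stated in the card
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```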

## Bias
Like any other model, roberta-gen-news is subject to the biases present in the data it was trained on.