Commit 4082e39 by yanaiela (1 parent: 322b30b): readme file

Files changed (1): README.md (+61 lines, new file)

---
language: en
tags:
- roberta-base
- roberta-base-epoch_0
license: mit
---

# RoBERTa, Intermediate Checkpoint - Epoch 0

This model is part of our reimplementation of the [RoBERTa model](https://arxiv.org/abs/1907.11692),
trained on Wikipedia and the Book Corpus only.
We trained this model for almost 100K steps, corresponding to 83 epochs.
We provide the 84 checkpoints (including the randomly initialized weights before training)
to enable the study of the training dynamics of such models, as well as other possible use cases.

These models were trained as part of a work that studies how simple data statistics,
such as co-occurrences, affect model predictions. This work is described in the paper
[Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions](https://arxiv.org/abs/2207.14251).

This is the checkpoint from epoch 0.

## Model Description

This model was captured during a reproduction of
[RoBERTa-base](https://huggingface.co/roberta-base), for English: it
is a Transformers model pretrained on a large corpus of English data using the
Masked Language Modelling (MLM) objective.

The intended uses, limitations, training data and training procedure for the fully trained model are similar
to [RoBERTa-base](https://huggingface.co/roberta-base). Two major
differences from the original model:

* We trained our model for 100K steps instead of 500K.
* We only use Wikipedia and the Book Corpus, which are publicly available corpora.

### How to use

Using the code from
[RoBERTa-base](https://huggingface.co/roberta-base), here is an example based on
PyTorch:

```python
from transformers import pipeline

# Load this checkpoint into a fill-mask pipeline (device=-1 runs on CPU)
model = pipeline("fill-mask", model='yanaiela/roberta-base-epoch_0', device=-1, top_k=10)
model("Hello, I'm the <mask> RoBERTa-base language model")
```
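
Since the checkpoints are released to make it possible to study training dynamics, you may also want to load several of them and compare their predictions directly. The sketch below is a minimal example of doing so with PyTorch; it assumes the other checkpoints follow the same `yanaiela/roberta-base-epoch_N` naming pattern as this one, and the two names listed are only illustrative:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Illustrative checkpoint names, assuming the yanaiela/roberta-base-epoch_N pattern
checkpoints = ["yanaiela/roberta-base-epoch_0", "yanaiela/roberta-base-epoch_83"]

text = "Hello, I'm the <mask> RoBERTa-base language model"

for name in checkpoints:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForMaskedLM.from_pretrained(name)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Top prediction at the masked position for this checkpoint
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    top_id = logits[0, mask_pos].argmax(dim=-1)
    print(name, tokenizer.decode(top_id))
```

Comparing the top predictions across epochs gives a quick, qualitative view of how the model's behaviour changes over training.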

## Citation info

```bibtex
@article{2207.14251,
  Author = {Yanai Elazar and Nora Kassner and Shauli Ravfogel and Amir Feder and Abhilasha Ravichander and Marius Mosbach and Yonatan Belinkov and Hinrich Schütze and Yoav Goldberg},
  Title = {Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions},
  Year = {2022},
  Eprint = {arXiv:2207.14251},
}
```