David committed on
Commit 296e183
1 Parent(s): b6d9472

Update README.md

Files changed (1)
  1. README.md +10 -8
README.md CHANGED
@@ -14,8 +14,8 @@ We release a `small` and `medium` version with the following configuration:
 
 | Model | Layers | Embedding/Hidden Size | Params | Vocab Size | Max Sequence Length | Cased |
 | --- | --- | --- | --- | --- | --- | --- |
-| SELECTRA small | 12 | 256 | 22M | 50k | 512 | True |
-| **SELECTRA medium** | **12** | **384** | **41M** | **50k** | **512** | **True** |
+| [SELECTRA small](https://huggingface.co/Recognai/selectra_small) | 12 | 256 | 22M | 50k | 512 | True |
+| [**SELECTRA medium**](https://huggingface.co/Recognai/selectra_medium) | **12** | **384** | **41M** | **50k** | **512** | **True** |
 
 Selectra small (medium) is about 5 (3) times smaller than BETO but achieves comparable results (see Metrics section below).
 
@@ -27,8 +27,8 @@ The discriminator should therefore activate the logit corresponding to the fake
 ```python
 from transformers import ElectraForPreTraining, ElectraTokenizerFast
 
-discriminator = ElectraForPreTraining.from_pretrained("Recognai/selectra_medium")
-tokenizer = ElectraTokenizerFast.from_pretrained("Recognai/selectra_medium")
+discriminator = ElectraForPreTraining.from_pretrained("Recognai/selectra_small")
+tokenizer = ElectraTokenizerFast.from_pretrained("Recognai/selectra_small")
 
 sentence_with_fake_token = "Estamos desayunando pan rosa con tomate y aceite de oliva."
 
@@ -39,13 +39,15 @@ print("\t".join(tokenizer.tokenize(sentence_with_fake_token)))
 print("\t".join(map(lambda x: str(x)[:4], logits[1:-1])))
 """Output:
 Estamos desayun ##ando pan rosa con tomate y aceite de oliva .
--2.2 -1.9 -6.4 -2.0 -0.6 -4.3 -3.2 -4.9 -5.5 -7.2 -4.5 -4.0
+-3.1 -3.6 -6.9 -3.0 0.19 -4.5 -3.3 -5.1 -5.7 -7.7 -4.4 -4.2
 """
 ```
 
-However, you probably want to use this model to fine-tune it on a down-stream task.
+However, you probably want to use this model to fine-tune it on a downstream task.
+We provide models fine-tuned on the [XNLI dataset](https://huggingface.co/datasets/xnli), which can be used together with the zero-shot classification pipeline:
 
-- Links to our zero-shot-classifiers
+- [Zero-shot SELECTRA small](https://huggingface.co/Recognai/zeroshot_selectra_small)
+- [Zero-shot SELECTRA medium](https://huggingface.co/Recognai/zeroshot_selectra_medium)
 
 ## Metrics
 
@@ -59,7 +61,7 @@ We fine-tune our models on 4 different down-stream tasks:
 For each task, we conduct 5 trials and state the mean and standard deviation of the metrics in the table below.
 To compare our results to other Spanish language models, we provide the same metrics taken from [Table 4](https://huggingface.co/bertin-project/bertin-roberta-base-spanish#results) of the Bertin-project model card.
 
-| Model | CoNLL2002 - POS (acc) | CoNLL2002 - NER (f1) | PAWS-X (acc) | XNLI (acc) | Params |
+| Model | [CoNLL2002](https://huggingface.co/datasets/conll2002) - POS (acc) | [CoNLL2002](https://huggingface.co/datasets/conll2002) - NER (f1) | [PAWS-X](https://huggingface.co/datasets/paws-x) (acc) | [XNLI](https://huggingface.co/datasets/xnli) (acc) | Params |
 | --- | --- | --- | --- | --- | --- |
 | SELECTRA small | 0.9653 +- 0.0007 | 0.863 +- 0.004 | 0.896 +- 0.002 | 0.784 +- 0.002 | **22M** |
 | SELECTRA medium | 0.9677 +- 0.0004 | 0.870 +- 0.003 | 0.896 +- 0.002 | **0.804 +- 0.002** | 41M |
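
A quick way to sanity-check the configuration table in the first hunk is to read the numbers back from the published checkpoints. This is a minimal sketch, not part of the model card; it assumes the checkpoints expose the standard `transformers` `ElectraConfig` attributes:

```python
from transformers import ElectraConfig

# Print the architecture numbers from the table (layers, embedding/hidden
# size, vocab size, max sequence length) for both released checkpoints.
for name in ("Recognai/selectra_small", "Recognai/selectra_medium"):
    cfg = ElectraConfig.from_pretrained(name)
    print(name, cfg.num_hidden_layers, cfg.embedding_size, cfg.hidden_size,
          cfg.vocab_size, cfg.max_position_embeddings)
```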
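
The diff only touches the edited lines of the discriminator snippet, so the forward pass between the two hunks is not shown. For reference, a self-contained sketch of how the full example plausibly fits together; the `inputs` encoding and the `torch.no_grad()` block are assumptions, and only the lines visible in the hunks are taken from the card:

```python
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

discriminator = ElectraForPreTraining.from_pretrained("Recognai/selectra_small")
tokenizer = ElectraTokenizerFast.from_pretrained("Recognai/selectra_small")

# "rosa" is the replaced (fake) token the discriminator should flag.
sentence_with_fake_token = "Estamos desayunando pan rosa con tomate y aceite de oliva."

# Assumed middle part: encode the sentence and run one forward pass.
inputs = tokenizer(sentence_with_fake_token, return_tensors="pt")
with torch.no_grad():
    logits = discriminator(**inputs).logits.squeeze().tolist()

# As in the card: print tokens and logits side by side; [1:-1] drops the
# logits for the [CLS] and [SEP] special tokens.
print("\t".join(tokenizer.tokenize(sentence_with_fake_token)))
print("\t".join(map(lambda x: str(x)[:4], logits[1:-1])))
```

In the updated output, `rosa` gets the only positive logit (0.19), which is exactly the fake-token signal the snippet is meant to demonstrate.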
51
 
52
  ## Metrics
53
 
 
61
  For each task, we conduct 5 trials and state the mean and standard deviation of the metrics in the table below.
62
  To compare our results to other Spanish language models, we provide the same metrics taken from [Table 4](https://huggingface.co/bertin-project/bertin-roberta-base-spanish#results) of the Bertin-project model card.
63
 
64
+ | Model | [CoNLL2002](https://huggingface.co/datasets/conll2002) - POS (acc) | [CoNLL2002](https://huggingface.co/datasets/conll2002) - NER (f1) | [PAWS-X](https://huggingface.co/datasets/paws-x) (acc) | [XNLI](https://huggingface.co/datasets/xnli) (acc) | Params |
65
  | --- | --- | --- | --- | --- | --- |
66
  | SELECTRA small | 0.9653 +- 0.0007 | 0.863 +- 0.004 | 0.896 +- 0.002 | 0.784 +- 0.002 | **22M** |
67
  | SELECTRA medium | 0.9677 +- 0.0004 | 0.870 +- 0.003 | 0.896 +- 0.002 | **0.804 +- 0.002** | 41M |
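
The newly linked checkpoints are fine-tuned on XNLI and intended for the `transformers` zero-shot classification pipeline. A minimal usage sketch; the input sentence, candidate labels, and Spanish hypothesis template are illustrative assumptions, not taken from the card:

```python
from transformers import pipeline

# Either linked checkpoint works here; the small one is shown for speed.
classifier = pipeline("zero-shot-classification", model="Recognai/zeroshot_selectra_small")

result = classifier(
    "El equipo gana el torneo tras una remontada histórica.",
    candidate_labels=["deportes", "cultura", "economía", "salud"],
    hypothesis_template="Este ejemplo es {}.",  # Spanish template instead of the English default
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```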