David committed on
Commit 296e183
1 Parent(s): b6d9472

Update README.md

Files changed (1)
  1. README.md +10 -8
README.md CHANGED
@@ -14,8 +14,8 @@ We release a `small` and `medium` version with the following configuration:
 
 | Model | Layers | Embedding/Hidden Size | Params | Vocab Size | Max Sequence Length | Cased |
 | --- | --- | --- | --- | --- | --- | --- |
-| SELECTRA small | 12 | 256 | 22M | 50k | 512 | True |
-| **SELECTRA medium** | **12** | **384** | **41M** | **50k** | **512** | **True** |
+| [SELECTRA small](https://huggingface.co/Recognai/selectra_small) | 12 | 256 | 22M | 50k | 512 | True |
+| [**SELECTRA medium**](https://huggingface.co/Recognai/selectra_medium) | **12** | **384** | **41M** | **50k** | **512** | **True** |
 
 Selectra small (medium) is about 5 (3) times smaller than BETO but achieves comparable results (see Metrics section below).
 
@@ -27,8 +27,8 @@ The discriminator should therefore activate the logit corresponding to the fake
 ```python
 from transformers import ElectraForPreTraining, ElectraTokenizerFast
 
-discriminator = ElectraForPreTraining.from_pretrained("Recognai/selectra_medium")
-tokenizer = ElectraTokenizerFast.from_pretrained("Recognai/selectra_medium")
+discriminator = ElectraForPreTraining.from_pretrained("Recognai/selectra_small")
+tokenizer = ElectraTokenizerFast.from_pretrained("Recognai/selectra_small")
 
 sentence_with_fake_token = "Estamos desayunando pan rosa con tomate y aceite de oliva."
 
@@ -39,13 +39,15 @@ print("\t".join(tokenizer.tokenize(sentence_with_fake_token)))
 print("\t".join(map(lambda x: str(x)[:4], logits[1:-1])))
 """Output:
 Estamos desayun ##ando pan rosa con tomate y aceite de oliva .
--2.2 -1.9 -6.4 -2.0 -0.6 -4.3 -3.2 -4.9 -5.5 -7.2 -4.5 -4.0
+-3.1 -3.6 -6.9 -3.0 0.19 -4.5 -3.3 -5.1 -5.7 -7.7 -4.4 -4.2
 """
 ```
 
-However, you probably want to use this model to fine-tune it on a down-stream task.
+However, you probably want to use this model to fine-tune it on a downstream task.
+We provide models fine-tuned on the [XNLI dataset](https://huggingface.co/datasets/xnli), which can be used together with the zero-shot classification pipeline:
 
-- Links to our zero-shot-classifiers
+- [Zero-shot SELECTRA small](https://huggingface.co/Recognai/zeroshot_selectra_small)
+- [Zero-shot SELECTRA medium](https://huggingface.co/Recognai/zeroshot_selectra_medium)
 
 ## Metrics
 
@@ -59,7 +61,7 @@ We fine-tune our models on 4 different down-stream tasks:
 For each task, we conduct 5 trials and state the mean and standard deviation of the metrics in the table below.
 To compare our results to other Spanish language models, we provide the same metrics taken from [Table 4](https://huggingface.co/bertin-project/bertin-roberta-base-spanish#results) of the Bertin-project model card.
 
-| Model | CoNLL2002 - POS (acc) | CoNLL2002 - NER (f1) | PAWS-X (acc) | XNLI (acc) | Params |
+| Model | [CoNLL2002](https://huggingface.co/datasets/conll2002) - POS (acc) | [CoNLL2002](https://huggingface.co/datasets/conll2002) - NER (f1) | [PAWS-X](https://huggingface.co/datasets/paws-x) (acc) | [XNLI](https://huggingface.co/datasets/xnli) (acc) | Params |
 | --- | --- | --- | --- | --- | --- |
 | SELECTRA small | 0.9653 +- 0.0007 | 0.863 +- 0.004 | 0.896 +- 0.002 | 0.784 +- 0.002 | **22M** |
 | SELECTRA medium | 0.9677 +- 0.0004 | 0.870 +- 0.003 | 0.896 +- 0.002 | **0.804 +- 0.002** | 41M |
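
A quick way to sanity-check the configuration table in the first hunk is to read the numbers back from the published checkpoints. This is a minimal sketch, not part of the model card; it assumes the checkpoints expose the standard `transformers` `ElectraConfig` attributes:

```python
from transformers import ElectraConfig

# Print the architecture numbers from the table (layers, embedding/hidden
# size, vocab size, max sequence length) for both released checkpoints.
for name in ("Recognai/selectra_small", "Recognai/selectra_medium"):
    cfg = ElectraConfig.from_pretrained(name)
    print(name, cfg.num_hidden_layers, cfg.embedding_size, cfg.hidden_size,
          cfg.vocab_size, cfg.max_position_embeddings)
```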
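
The diff only touches the edited lines of the discriminator snippet, so the forward pass between the two hunks is not shown. For reference, a self-contained sketch of how the full example plausibly fits together; the `inputs` encoding and the `torch.no_grad()` block are assumptions, and only the lines visible in the hunks are taken from the card:

```python
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

discriminator = ElectraForPreTraining.from_pretrained("Recognai/selectra_small")
tokenizer = ElectraTokenizerFast.from_pretrained("Recognai/selectra_small")

# "rosa" is the replaced (fake) token the discriminator should flag.
sentence_with_fake_token = "Estamos desayunando pan rosa con tomate y aceite de oliva."

# Assumed middle part: encode the sentence and run one forward pass.
inputs = tokenizer(sentence_with_fake_token, return_tensors="pt")
with torch.no_grad():
    logits = discriminator(**inputs).logits.squeeze().tolist()

# As in the card: print tokens and logits side by side; [1:-1] drops the
# logits for the [CLS] and [SEP] special tokens.
print("\t".join(tokenizer.tokenize(sentence_with_fake_token)))
print("\t".join(map(lambda x: str(x)[:4], logits[1:-1])))
```

In the updated output, `rosa` gets the only positive logit (0.19), which is exactly the fake-token signal the snippet is meant to demonstrate.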
51
 
52
  ## Metrics
53
 
 
61
  For each task, we conduct 5 trials and state the mean and standard deviation of the metrics in the table below.
62
  To compare our results to other Spanish language models, we provide the same metrics taken from [Table 4](https://huggingface.co/bertin-project/bertin-roberta-base-spanish#results) of the Bertin-project model card.
63
 
64
+ | Model | [CoNLL2002](https://huggingface.co/datasets/conll2002) - POS (acc) | [CoNLL2002](https://huggingface.co/datasets/conll2002) - NER (f1) | [PAWS-X](https://huggingface.co/datasets/paws-x) (acc) | [XNLI](https://huggingface.co/datasets/xnli) (acc) | Params |
65
  | --- | --- | --- | --- | --- | --- |
66
  | SELECTRA small | 0.9653 +- 0.0007 | 0.863 +- 0.004 | 0.896 +- 0.002 | 0.784 +- 0.002 | **22M** |
67
  | SELECTRA medium | 0.9677 +- 0.0004 | 0.870 +- 0.003 | 0.896 +- 0.002 | **0.804 +- 0.002** | 41M |
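
The newly linked checkpoints are fine-tuned on XNLI and intended for the `transformers` zero-shot classification pipeline. A minimal usage sketch; the input sentence, candidate labels, and Spanish hypothesis template are illustrative assumptions, not taken from the card:

```python
from transformers import pipeline

# Either linked checkpoint works here; the small one is shown for speed.
classifier = pipeline("zero-shot-classification", model="Recognai/zeroshot_selectra_small")

result = classifier(
    "El equipo gana el torneo tras una remontada histórica.",
    candidate_labels=["deportes", "cultura", "economía", "salud"],
    hypothesis_template="Este ejemplo es {}.",  # Spanish template instead of the English default
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```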