Commit 403444f
Parent(s): 75d08f1
Update README.md

README.md CHANGED

@@ -9,7 +9,7 @@ library_name: transformers
 license: mit
 ---
 
-
+# Model description: deberta-v3-large-zeroshot-v1.1-all-33
 The model is designed for zero-shot classification with the Hugging Face pipeline.
 
 The model can do one universal task: determine whether a hypothesis is `true` or `not_true`
@@ -17,10 +17,12 @@ given a text (also called `entailment` vs. `not_entailment`).
 This task format is based on the Natural Language Inference task (NLI).
 The task is so universal that any classification task can be reformulated into the task.
 
+A detailed description of how the model was trained and how it can be used is available in this paper: [link to be added]
+
 ## Training data
-The model was trained on a mixture of
+The model was trained on a mixture of __33 datasets and 387 classes__ that have been reformatted into this universal format.
 1. Five NLI datasets with ~885k texts: "mnli", "anli", "fever", "wanli", "ling"
-2. 28 classification tasks
+2. 28 classification tasks reformatted into the universal NLI format. ~51k cleaned texts were used to avoid overfitting:
 'amazonpolarity', 'imdb', 'appreviews', 'yelpreviews', 'rottentomatoes',
 'emotiondair', 'emocontext', 'empathetic',
 'financialphrasebank', 'banking77', 'massive',
@@ -35,6 +37,10 @@ See details on each dataset here: https://github.com/MoritzLaurer/zeroshot-class
 Note that compared to other NLI models, this model predicts two classes (`entailment` vs. `not_entailment`)
 as opposed to three classes (entailment/neutral/contradiction)
 
+The model was only trained on English data. For __multilingual use-cases__,
+I recommend machine translating texts to English with libraries like [EasyNMT](https://github.com/UKPLab/EasyNMT).
+English-only models tend to perform better than multilingual models and
+validation with English data can be easier if you don't speak all languages in your corpus.
 
 ### How to use the model
 #### Simple zero-shot classification pipeline
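
The README text in this diff centers on zero-shot classification with the Hugging Face pipeline. A minimal sketch of that usage, assuming the hub ID `MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33` (inferred from the heading this commit adds; the example text and labels are illustrative):

```python
# Minimal sketch: zero-shot classification via the Hugging Face pipeline.
# The hub ID is an assumption inferred from the model name in the README heading.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33",
)

text = "The food at the new restaurant on Main Street was disappointing."
candidate_labels = ["positive", "negative", "neutral"]

# Returns the candidate labels ranked by score, highest first.
result = classifier(text, candidate_labels, multi_label=False)
print(result["labels"][0], result["scores"][0])
```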
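The "universal task" framing in the diff (hypothesis `true` vs. `not_true` given a text) can also be exercised directly against the model's two-class NLI head. A sketch under the same hub-ID assumption; the label names are read from `model.config.id2label` rather than hard-coded, since the README does not state the index order:

```python
# Sketch: reformulating a classification decision as NLI, as the README
# describes. Each candidate label becomes a hypothesis about the text.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33"  # assumed hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "I liked the first half, but overall the movie was a letdown."
hypothesis = "The movie was good."  # the class label, rephrased as a hypothesis

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Two-class head (entailment vs. not_entailment), per the note in the diff.
probs = torch.softmax(logits, dim=-1)[0]
print({model.config.id2label[i]: round(float(p), 3) for i, p in enumerate(probs)})
```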
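For the multilingual route the diff recommends (machine-translate first, then classify in English), a sketch using EasyNMT; `"opus-mt"` is one of the model names the EasyNMT project documents, and the sample sentences are illustrative:

```python
# Sketch: translate-then-classify preprocessing for multilingual corpora,
# using EasyNMT (pip install easynmt) as the README suggests.
from easynmt import EasyNMT

mt_model = EasyNMT("opus-mt")  # one of EasyNMT's documented model choices

texts = [
    "Der Film war leider ziemlich enttäuschend.",   # German
    "La comida del restaurante estaba deliciosa.",  # Spanish
]
english_texts = mt_model.translate(texts, target_lang="en")

# english_texts can now be fed to the English-only zero-shot classifier above.
print(english_texts)
```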