MoritzLaurer committed
Commit 403444f · 1 Parent(s): 75d08f1

Update README.md

Files changed (1):
  1. README.md +9 -3
README.md CHANGED
@@ -9,7 +9,7 @@ library_name: transformers
 license: mit
 ---
 
-## Model description: deberta-v3-large-zeroshot-v1.1-all-33
+# Model description: deberta-v3-large-zeroshot-v1.1-all-33
 The model is designed for zero-shot classification with the Hugging Face pipeline.
 
 The model can do one universal task: determine whether a hypothesis is `true` or `not_true`
@@ -17,10 +17,12 @@ given a text (also called `entailment` vs. `not_entailment`).
 This task format is based on the Natural Language Inference task (NLI).
 The task is so universal that any classification task can be reformulated into the task.
 
+A detailed description of how the model was trained and how it can be used is available in this paper: [link to be added]
+
 ## Training data
-The model was trained on a mixture of 33 datasets and 389 classes that have been reformatted into this universal format.
+The model was trained on a mixture of __33 datasets and 387 classes__ that have been reformatted into this universal format.
 1. Five NLI datasets with ~885k texts: "mnli", "anli", "fever", "wanli", "ling"
-2. 28 classification tasks with ~51k texts:
+2. 28 classification tasks reformatted into the universal NLI format. ~51k cleaned texts were used to avoid overfitting:
 'amazonpolarity', 'imdb', 'appreviews', 'yelpreviews', 'rottentomatoes',
 'emotiondair', 'emocontext', 'empathetic',
 'financialphrasebank', 'banking77', 'massive',
@@ -35,6 +37,10 @@ See details on each dataset here: https://github.com/MoritzLaurer/zeroshot-class
 Note that compared to other NLI models, this model predicts two classes (`entailment` vs. `not_entailment`)
 as opposed to three classes (entailment/neutral/contradiction)
 
+The model was only trained on English data. For __multilingual use-cases__,
+I recommend machine translating texts to English with libraries like [EasyNMT](https://github.com/UKPLab/EasyNMT).
+English-only models tend to perform better than multilingual models and
+validation with English data can be easier if you don't speak all languages in your corpus.
 
 ### How to use the model
 #### Simple zero-shot classification pipeline
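For readers landing on this commit, here is a minimal sketch of the zero-shot classification usage that the final `#### Simple zero-shot classification pipeline` heading introduces (the README's own snippet lies outside this diff's context). The Hub model ID, the example text/labels, and the `hypothesis_template` value are illustrative assumptions, not necessarily what the model card itself recommends:

```python
# Minimal sketch, assuming the model is published on the Hub as
# MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33 and that a generic
# hypothesis template is acceptable (both are assumptions).
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33",
)

text = "The new movie was absolutely fantastic."
candidate_labels = ["positive", "negative", "neutral"]

# Each candidate label is slotted into the template to form a hypothesis; the
# model then scores entailment vs. not_entailment for every (text, hypothesis)
# pair, which is the universal NLI task format described in the diff above.
output = classifier(text, candidate_labels, hypothesis_template="This example is {}.")
print(output["labels"], output["scores"])
```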
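Similarly, a hedged sketch of the translate-then-classify preprocessing recommended in the added multilingual note; the `opus-mt` backbone choice and the sample sentences are assumptions:

```python
# Sketch of machine-translating non-English texts to English before feeding
# them to the English-only classifier above. The 'opus-mt' backbone is an
# assumption; EasyNMT ships several translation models.
from easynmt import EasyNMT

translator = EasyNMT("opus-mt")

docs = [
    "Der neue Film war absolut fantastisch.",  # German
    "La película fue terrible.",               # Spanish
]
# translate() accepts a list of texts and a target language code.
docs_en = translator.translate(docs, target_lang="en")
# docs_en can now be passed to the zero-shot classifier sketched above.
```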