MoritzLaurer (HF staff) committed
Commit 63bbc62 · 1 Parent(s): e3f3825

Update README.md


readme adapted to specific model

Files changed (1)
  1. README.md +23 -25
README.md CHANGED
@@ -12,8 +12,6 @@ license: mit
  # deberta-v3-large-zeroshot-v1.1-all-33
  ## Model description
  The model is designed for zero-shot classification with the Hugging Face pipeline.
- The model should be substantially better at zero-shot classification than my other zero-shot models on the Hugging Face hub: https://huggingface.co/MoritzLaurer.

  The model can do one universal task: determine whether a hypothesis is `true` or `not_true`
  given a text (also called `entailment` vs. `not_entailment`).
@@ -21,17 +19,18 @@ This task format is based on the Natural Language Inference task (NLI).
  The task is so universal that any classification task can be reformulated into the task.

  ## Training data
- The model was trained on a mixture of 27 tasks and 310 classes that have been reformatted into this universal format.
- 1. 26 classification tasks with ~400k texts:
  'amazonpolarity', 'imdb', 'appreviews', 'yelpreviews', 'rottentomatoes',
  'emotiondair', 'emocontext', 'empathetic',
  'financialphrasebank', 'banking77', 'massive',
  'wikitoxic_toxicaggregated', 'wikitoxic_obscene', 'wikitoxic_threat', 'wikitoxic_insult', 'wikitoxic_identityhate',
  'hateoffensive', 'hatexplain', 'biasframes_offensive', 'biasframes_sex', 'biasframes_intent',
  'agnews', 'yahootopics',
- 'trueteacher', 'spam', 'wellformedquery'.
- See details on each dataset here: https://docs.google.com/spreadsheets/d/1Z18tMh02IiWgh6o8pfoMiI_LH4IXpr78wd_nmNd5FaE/edit?usp=sharing
- 3. Five NLI datasets with ~885k texts: "mnli", "anli", "fever", "wanli", "ling"

  Note that compared to other NLI models, this model predicts two classes (`entailment` vs. `not_entailment`)
  as opposed to three classes (entailment/neutral/contradiction)
@@ -41,10 +40,11 @@ as opposed to three classes (entailment/neutral/contradiction)
  #### Simple zero-shot classification pipeline
  ```python
  from transformers import pipeline
- classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v1")
- sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
- candidate_labels = ["politics", "economy", "entertainment", "environment"]
- output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
  print(output)
  ```
 
@@ -60,12 +60,10 @@ Please consult the original DeBERTa paper and the papers for the different datas
  The base model (DeBERTa-v3) is published under the MIT license.
  The datasets the model was fine-tuned on are published under a diverse set of licenses.
  The following spreadsheet provides an overview of the non-NLI datasets used for fine-tuning.
- The spreadsheets contains information on licenses, the underlying papers etc.: https://docs.google.com/spreadsheets/d/1Z18tMh02IiWgh6o8pfoMiI_LH4IXpr78wd_nmNd5FaE/edit?usp=sharing
-
- In addition, the model was also trained on the following NLI datasets: MNLI, ANLI, WANLI, LING-NLI, FEVER-NLI.

  ## Citation
- If you use this model, please cite:
  ```
  @article{laurer_less_2023,
  title = {Less {Annotating}, {More} {Classifying}: {Addressing} the {Data} {Scarcity} {Issue} of {Supervised} {Machine} {Learning} with {Deep} {Transfer} {Learning} and {BERT}-{NLI}},
@@ -87,25 +85,25 @@ If you use this model, please cite:
  If you have questions or ideas for cooperation, contact me at m{dot}laurer{at}vu{dot}nl or [LinkedIn](https://www.linkedin.com/in/moritz-laurer/)

  ### Debugging and issues
- Note that DeBERTa-v3 was released on 06.12.21 and older versions of HF Transformers seem to have issues running the model (e.g. resulting in an issue with the tokenizer). Using Transformers>=4.13 might solve some issues.

  ### Hypotheses used for classification
-
- These hypotheses were used to fine-tune the model.
- Inspecting them can help users get a feeling for which type of hypotheses and tasks the model was trained on.
- I recommend formulating your hypotheses in a similar format. For example:

  ```python
  from transformers import pipeline
  text = "Angela Merkel is a politician in Germany and leader of the CDU"
- classes_verbalized = ["politics", "economy", "entertainment", "environment"]
- hypothesis_template = "This example is about {}"
- model_name = "MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33"
- classifier = pipeline("zero-shot-classification", model=model_name)
- output = classifier(text, classes_verbalised, hypothesis_template=hypothesis_template, multi_label=False)
  print(output)
  ```

  #### wellformedquery
  | label | hypothesis |

  # deberta-v3-large-zeroshot-v1.1-all-33
  ## Model description
  The model is designed for zero-shot classification with the Hugging Face pipeline.

  The model can do one universal task: determine whether a hypothesis is `true` or `not_true`
  given a text (also called `entailment` vs. `not_entailment`).

  The task is so universal that any classification task can be reformulated into the task.

  ## Training data
+ The model was trained on a mixture of 33 datasets and 389 classes that have been reformatted into this universal format.
+ 1. Five NLI datasets with ~885k texts: "mnli", "anli", "fever", "wanli", "ling"
+ 2. 28 classification tasks with ~51k texts:
  'amazonpolarity', 'imdb', 'appreviews', 'yelpreviews', 'rottentomatoes',
  'emotiondair', 'emocontext', 'empathetic',
  'financialphrasebank', 'banking77', 'massive',
  'wikitoxic_toxicaggregated', 'wikitoxic_obscene', 'wikitoxic_threat', 'wikitoxic_insult', 'wikitoxic_identityhate',
  'hateoffensive', 'hatexplain', 'biasframes_offensive', 'biasframes_sex', 'biasframes_intent',
  'agnews', 'yahootopics',
+ 'trueteacher', 'spam', 'wellformedquery',
+ 'manifesto', 'capsotu'.
+ See details on each dataset here: https://github.com/MoritzLaurer/zeroshot-classifier/blob/main/datasets_overview.csv

  Note that compared to other NLI models, this model predicts two classes (`entailment` vs. `not_entailment`)
  as opposed to three classes (entailment/neutral/contradiction)
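
To make this reformulation concrete, here is a minimal sketch of how a single labelled classification example can be recast into the universal `entailment` vs. `not_entailment` format described above. This is an editor's illustration, not code from the commit; the hypothesis wording and the `class_verbalizations` mapping are assumptions in the spirit of the hypothesis templates shown further below.

```python
# Sketch: recasting one sentiment-classification example into the
# NLI-style format (text + hypothesis -> entailment / not_entailment).
text = "This movie was a complete waste of time."
true_class = "negative"
class_verbalizations = {"positive": "positive", "negative": "negative"}  # assumed verbalizations

nli_examples = []
for class_name, verbalization in class_verbalizations.items():
    nli_examples.append({
        "premise": text,
        "hypothesis": f"This example is {verbalization}",
        # Only the hypothesis matching the true class is entailed by the text.
        "label": "entailment" if class_name == true_class else "not_entailment",
    })

for example in nli_examples:
    print(example)
```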
 
  #### Simple zero-shot classification pipeline
  ```python
  from transformers import pipeline
+ text = "Angela Merkel is a politician in Germany and leader of the CDU"
+ hypothesis_template = "This example is about {}"
+ classes_verbalized = ["politics", "economy", "entertainment", "environment"]
+ zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33")
+ output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
  print(output)
  ```
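
The pipeline call above returns a dictionary with the input `sequence`, the candidate `labels` ranked from most to least likely, and the corresponding `scores`. The snippet below, meant to be run after the block above, only illustrates the shape of the result; the numbers are invented for illustration, not actual model outputs.

```python
# Illustrative output structure (scores are made up):
# {'sequence': 'Angela Merkel is a politician in Germany and leader of the CDU',
#  'labels': ['politics', 'environment', 'economy', 'entertainment'],
#  'scores': [0.98, 0.01, 0.007, 0.003]}
top_label = output["labels"][0]  # labels are sorted by descending score
print(top_label)
```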

  The base model (DeBERTa-v3) is published under the MIT license.
  The datasets the model was fine-tuned on are published under a diverse set of licenses.
  The following spreadsheet provides an overview of the non-NLI datasets used for fine-tuning.
+ The spreadsheet contains information on licenses, the underlying papers etc.: https://github.com/MoritzLaurer/zeroshot-classifier/blob/main/datasets_overview.csv

  ## Citation
+ If you use this model academically, please cite:
  ```
  @article{laurer_less_2023,
  title = {Less {Annotating}, {More} {Classifying}: {Addressing} the {Data} {Scarcity} {Issue} of {Supervised} {Machine} {Learning} with {Deep} {Transfer} {Learning} and {BERT}-{NLI}},

  If you have questions or ideas for cooperation, contact me at m{dot}laurer{at}vu{dot}nl or [LinkedIn](https://www.linkedin.com/in/moritz-laurer/)

  ### Debugging and issues
+ Note that DeBERTa-v3 was released on 06.12.21 and older versions of HF Transformers can have issues running the model (e.g. resulting in an issue with the tokenizer). Using Transformers>=4.13 might solve some issues.
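
If you run into such tokenizer or loading errors, one quick check is the installed Transformers version; this small snippet is an editor's suggestion based on the version threshold mentioned above, not part of the original card.

```python
# Check the installed Transformers version; if it is below 4.13, upgrade with:
#   pip install --upgrade "transformers>=4.13"
import transformers
print(transformers.__version__)
```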

  ### Hypotheses used for classification
+ The hypotheses in the tables below were used to fine-tune the model.
+ Inspecting them can help users get a feeling for which type of hypotheses and tasks the model was trained on.
+ You can formulate your own hypotheses by changing the `hypothesis_template` of the zero-shot pipeline. For example:

  ```python
  from transformers import pipeline
  text = "Angela Merkel is a politician in Germany and leader of the CDU"
+ hypothesis_template = "Merkel is the leader of the party: {}"
+ classes_verbalized = ["CDU", "SPD", "Greens"]
+ zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33")
+ output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
  print(output)
  ```
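
For users who want to go below the pipeline abstraction, the sketch below scores a single (text, hypothesis) pair with the underlying NLI-style model via the standard Transformers sequence-classification API. It is an editor's illustration rather than code from the model card; the label names are read from the model config instead of being hard-coded.

```python
# Sketch: scoring one (text, hypothesis) pair directly with the NLI-style model.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "Angela Merkel is a politician in Germany and leader of the CDU"
hypothesis = "This example is about politics"

inputs = tokenizer(text, hypothesis, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to probabilities and attach the label names from the config
# (for this model: entailment vs. not_entailment).
probs = torch.softmax(logits, dim=-1)[0]
prediction = {model.config.id2label[i]: round(p.item(), 3) for i, p in enumerate(probs)}
print(prediction)
```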

+ Note that a few rows in the `massive` and `banking77` datasets contain `nan` because some classes were so ambiguous/unclear that I excluded them from the data.
+

  #### wellformedquery
  | label | hypothesis |