sileod committed
Commit 5603a1a
1 Parent(s): caeff28

Update README.md

Files changed (1)
  1. README.md +8 -10
README.md CHANGED
@@ -229,15 +229,7 @@ library_name: transformers
 
 DeBERTa-v3-base fine-tuned with multi-task learning on 560 tasks of the [tasksource collection](https://github.com/sileod/tasksource/)
 This checkpoint has strong zero-shot validation performance on many tasks (e.g., 70% on WNLI) and can be used as a zero-shot NLI pipeline (similar to bart-mnli, but better).
-You can also further fine-tune this model to use it for any classification or multiple-choice task.
-
-This is the shared model with the MNLI classifier on top. Its encoder was trained on many datasets including bigbench, Anthropic rlhf, anli... alongside many NLI and classification tasks with one shared encoder.
-Each task had a specific CLS embedding, which is dropped 10% of the time to facilitate model use without it. All multiple-choice model used the same classification layers. For classification tasks, models shared weights if their labels matched.
-The number of examples per task was capped to 64k. The model was trained for 45k steps with a batch size of 384, and a peak learning rate of 2e-5.
-
-The list of tasks is available in model config.
-
-tasksource training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
+You can also load other tasks (see the next section) or further fine-tune the encoder for new classification, token-classification, or multiple-choice tasks.
 
 # Tasksource-adapters: 1 line access to 500 tasks
 
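As a usage note on the zero-shot NLI pipeline mentioned above: a minimal, illustrative sketch with the standard `transformers` pipeline API. The checkpoint id `sileod/deberta-v3-base-tasksource-nli` and the example inputs are assumptions, not taken from this commit:

```python
# Minimal sketch: zero-shot classification with this checkpoint.
# The model id is an assumption based on this repo; adjust as needed.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="sileod/deberta-v3-base-tasksource-nli",
)

result = classifier(
    "The CEO announced record profits this quarter.",
    candidate_labels=["business", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```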
 
@@ -265,11 +257,17 @@ Results:
 
 For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
 
-### Software
+### Software and training details
 https://github.com/sileod/tasksource/ \
 https://github.com/sileod/tasknet/ \
 Training took 7 days on an RTX 6000 (24GB) GPU.
+Training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
+
+This is the shared model with the MNLI classifier on top. Its encoder was trained on many datasets, including bigbench, Anthropic rlhf, and anli, alongside many NLI and classification tasks, all with one shared encoder.
+Each task had a specific CLS embedding, which was dropped 10% of the time to facilitate using the model without it. All multiple-choice models used the same classification layers. For classification tasks, models shared weights if their labels matched.
+The number of examples per task was capped to 64k. The model was trained for 45k steps with a batch size of 384 and a peak learning rate of 2e-5.
+
+The list of tasks is available in the model config.
 
 # Citation
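For the "further fine-tune the encoder" path noted in the updated card, an illustrative sketch using plain `transformers` rather than the repo's own `tasknet`. The dataset, label count, and hyperparameters below are placeholders, not values from this commit:

```python
# Sketch: fine-tuning the shared encoder on a new 2-label task.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_id = "sileod/deberta-v3-base-tasksource-nli"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
# A fresh 2-label head replaces the 3-label MNLI head, hence
# ignore_mismatched_sizes=True.
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, num_labels=2, ignore_mismatched_sizes=True
)

dataset = load_dataset("imdb")  # placeholder task
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"].shuffle(seed=0).select(range(2000)),
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```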
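The per-task CLS embedding scheme in the added training details can be pictured with a short from-scratch sketch. This only illustrates the idea as described; it is not the actual tasksource/tasknet implementation, and every name in it is hypothetical:

```python
# Sketch: each task gets its own [CLS] vector; with probability 0.1 it is
# swapped for a shared default so the encoder also works without a task id.
import torch
import torch.nn as nn

class TaskCLSEmbedding(nn.Module):
    def __init__(self, num_tasks: int, hidden_size: int, p_drop: float = 0.1):
        super().__init__()
        self.task_cls = nn.Embedding(num_tasks, hidden_size)  # one CLS per task
        self.default_cls = nn.Parameter(torch.zeros(hidden_size))
        self.p_drop = p_drop

    def forward(self, task_ids: torch.Tensor) -> torch.Tensor:
        emb = self.task_cls(task_ids)  # (batch, hidden)
        if self.training:
            # Drop the task-specific CLS with probability p_drop.
            drop = torch.rand(emb.shape[0], 1, device=emb.device) < self.p_drop
            emb = torch.where(drop, self.default_cls.expand_as(emb), emb)
        return emb

# Usage: the returned vectors would sit at the [CLS] position of the encoder input.
cls = TaskCLSEmbedding(num_tasks=560, hidden_size=768).train()
print(cls(torch.tensor([0, 42, 559])).shape)  # torch.Size([3, 768])
```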
 
 