Update README.md
README.md
CHANGED
@@ -269,11 +269,10 @@ pipeline_tag: zero-shot-classification

269   
270   # Model Card for DeBERTa-v3-base-tasksource-nli
271   
272 - This is [DeBERTa-v3-base](https://hf.co/microsoft/deberta-v3-base) fine-tuned with multi-task learning on 560 tasks of the [tasksource collection](https://github.com/sileod/tasksource/)
273   This checkpoint has strong zero-shot validation performance on many tasks (e.g. 70% on WNLI), and can be used for:
274 - - Natural language inference
275   - Zero-shot entailment-based classification pipeline (similar to bart-mnli), see [ZS].
276 - - Many other tasks with tasksource-adapters, see [TA]
277   - Further fine-tune for new task (classification, token classification or multiple-choice).
278   
279   # [ZS] Zero-shot classification pipeline

@@ -307,16 +306,15 @@ https://ibm.github.io/model-recycling/

307   https://github.com/sileod/tasksource/ \
308   https://github.com/sileod/tasknet/ \
309   Training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
310 - Training took 7 days on RTX6000 24GB gpu.
311   
312 - This is the shared model with the MNLI classifier on top. Each task had a specific CLS embedding, which is dropped 10% of the time to facilitate model use without it. All multiple-choice model used the same classification layers. For classification tasks, models shared weights if their labels matched.
313 - The number of examples per task was capped to 64k. The model was trained for 45k steps with a batch size of 384, and a peak learning rate of 2e-5.
314   
315   
316   # Citation
317   
318   More details on this [article:](https://arxiv.org/abs/2301.05948)
319 - ```
320   @article{sileo2023tasksource,
321   title={tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation},
322   author={Sileo, Damien},
269   
270   # Model Card for DeBERTa-v3-base-tasksource-nli
271   
272 + This is [DeBERTa-v3-base](https://hf.co/microsoft/deberta-v3-base) fine-tuned with multi-task learning on 560 tasks of the [tasksource collection](https://github.com/sileod/tasksource/).
273   This checkpoint has strong zero-shot validation performance on many tasks (e.g. 70% on WNLI), and can be used for:
274 + - Natural language inference, and many other tasks with tasksource-adapters, see [TA]
275   - Zero-shot entailment-based classification pipeline (similar to bart-mnli), see [ZS].
276   - Further fine-tune for new task (classification, token classification or multiple-choice).
277   
278   # [ZS] Zero-shot classification pipeline

306   https://github.com/sileod/tasksource/ \
307   https://github.com/sileod/tasknet/ \
308   Training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
309   
310   
311 + This is the shared model with the MNLI classifier on top. Each task had a specific CLS embedding, which is dropped 10% of the time to facilitate model use without it. All multiple-choice model used the same classification layers. For classification tasks, models shared weights if their labels matched.
312 + The number of examples per task was capped to 64k. The model was trained for 100k steps with a batch size of 384, and a peak learning rate of 2e-5. Training took 7 days on RTX6000 24GB gpu.
313   
314   # Citation
315   
316   More details on this [article:](https://arxiv.org/abs/2301.05948)
317 + ```
318   @article{sileo2023tasksource,
319   title={tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation},
320   author={Sileo, Damien},
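The [ZS] zero-shot pipeline this README refers to is the standard transformers `zero-shot-classification` pipeline, which scores each candidate label via NLI entailment. A minimal sketch, assuming the checkpoint is published under the hub id `sileod/deberta-v3-base-tasksource-nli` (the repository id is not stated in this excerpt):

```python
# Sketch of entailment-based zero-shot classification with this checkpoint.
# The hub id below is an assumption; substitute the actual repository id.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="sileod/deberta-v3-base-tasksource-nli",  # assumed repo id
)

result = classifier(
    "One day I will see the world.",
    candidate_labels=["travel", "cooking", "dancing"],
)

# result["labels"] is sorted by descending score; with the default
# single-label mode the scores are softmaxed, so they sum to ~1.
print(result["labels"][0], result["scores"][0])
```

Under the hood each label is turned into a hypothesis (by default "This example is {label}.") and the model's entailment probability ranks the labels, which is why any MNLI-style head works here.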
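The updated hyperparameters allow a quick sanity check of the training budget; a back-of-the-envelope sketch using only figures stated in the README (560 tasks, 64k cap per task, 100k steps, batch size 384):

```python
# Sanity-check the training budget implied by the README's hyperparameters.
tasks = 560          # tasks in the tasksource collection (from the README)
cap = 64_000         # max examples kept per task (from the README)
steps = 100_000      # training steps (the value updated in this commit)
batch_size = 384

# Upper bound on distinct examples: many tasks have fewer than 64k
# examples, so the real mixture is smaller than this.
max_dataset = tasks * cap

# Total examples processed over training.
examples_seen = steps * batch_size

print(f"dataset upper bound: {max_dataset:,}")    # 35,840,000
print(f"examples processed:  {examples_seen:,}")  # 38,400,000
```

So 100k steps at batch 384 is roughly one pass over the capped multi-task mixture, consistent with the reported 7-day run on a single 24GB GPU.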