gonglinyuan committed on
Commit
0ed5b88
·
1 Parent(s): b6b9b0a

Update README.md

  1. README.md +54 -1
README.md CHANGED
@@ -113,4 +113,57 @@ model-index:
113
  metrics:
114
  - type: accuracy
115
  value: 62.88401253918495
116
- ---
+ ---
117
+
118
+ Official repository: https://github.com/gonglinyuan/metro_t0
119
+
120
+ # METRO-T0
121
+
122
+ Paper: Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers (TODO) (ACL 2023)
123
+
124
+ METRO-T0 is a T5-style text-to-text Transformer that is pretrained using model-generated pretraining signals and then prompt-finetuned on the family of public NLP tasks proposed in [T0](https://arxiv.org/abs/2110.08207).
125
+ METRO-T0 is highly parameter-efficient. For example, METRO-T0-Large++ (775M parameters) outperforms GPT-3 (175B parameters) and T0-3B (3B parameters) on a wide range of NLP tasks.
126
+
127
+ ![The architecture of METRO-T0 during pretraining using BERT as the auxiliary model to generate signals](https://github.com/gonglinyuan/metro_t0/raw/main/assets/metro_t0_method.png)
128
+
129
+ ![Prompt learning results of METRO-T0 versus our T0 baseline and T0-3B by Sanh et al. (2022) on four tasks in the T0 Eval benchmark. Each point denotes the accuracy obtained with one prompt template; for T0-3B, only the median accuracy over all templates is shown (blue point). The plots for the other tasks are in our paper.](https://github.com/gonglinyuan/metro_t0/raw/main/assets/metro_t0_selected_results.png)
130
+
131
+ ## Use METRO-T0-Base
132
+
133
+ The snippet below runs METRO-T0-Base in PyTorch. It requires Python 3.7+, PyTorch 1.12+, and transformers 4.17+:
134
+
+ ```python
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+ # trust_remote_code=True lets transformers run the model code shipped in the Hub repository.
+ model = AutoModelForSeq2SeqLM.from_pretrained("gonglinyuan/metro_t0_base", trust_remote_code=True)
+ tokenizer = AutoTokenizer.from_pretrained("gonglinyuan/metro_t0_base", trust_remote_code=True)
+
+ # Tokenize a single prompt, truncating it to at most 512 tokens.
+ input_text = "Is this review positive or negative? Review: this is the best cast iron skillet you will ever buy"
+ inputs = tokenizer([input_text], max_length=512, truncation=True, add_special_tokens=True, return_tensors="pt").input_ids
+ # Greedy decoding (do_sample=False) keeps the output deterministic.
+ outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
+
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True)) # expected: positive
+ ```
147
+
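Because accuracy can vary from one prompt template to another, it may help to query the model with several phrasings of the same question. The sketch below is one possible way to do that and is not taken from the METRO-T0 repository: the three templates, the helper names `load_generate_fn` and `predict_with_templates`, and the majority vote are all assumptions made for illustration. The model-loading part mirrors the snippet above.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def load_generate_fn(model_name: str = "gonglinyuan/metro_t0_base") -> Callable[[str], str]:
    """Return a function that maps one prompt string to the decoded model output."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model = AutoModelForSeq2SeqLM.from_pretrained(model_name, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

    def generate(prompt: str) -> str:
        input_ids = tokenizer(
            [prompt], max_length=512, truncation=True,
            add_special_tokens=True, return_tensors="pt",
        ).input_ids
        output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return generate


# Hypothetical templates written for this example; they are not the T0 templates.
TEMPLATES = (
    "Is this review positive or negative? Review: {text}",
    "Review: {text} Is the sentiment of this review positive or negative?",
    "{text} Was the reviewer satisfied? Answer positive or negative.",
)


def predict_with_templates(
    generate: Callable[[str], str],
    text: str,
    templates: Sequence[str] = TEMPLATES,
) -> Tuple[str, List[str]]:
    """Query every template and return (most common answer, all answers)."""
    answers = [generate(t.format(text=text)).strip().lower() for t in templates]
    # Majority vote is an assumption of this sketch; ties go to the earliest answer.
    label, _count = Counter(answers).most_common(1)[0]
    return label, answers
```

To try it with the real model, call `predict_with_templates(load_generate_fn(), "this is the best cast iron skillet you will ever buy")`. Any other callable that maps a prompt to a string can be passed in place of `load_generate_fn()`.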
148
+ ## Other METRO-T0 Models
149
+
+ | Model | # Parameters | Pretraining Data | Prompt-Finetuning Data |
+ |--------------------|--------------|------------------|------------------------|
+ | [METRO-T0-Base](https://huggingface.co/gonglinyuan/metro_t0_base) | 226M | Wikibook (16G) | T0 Train |
+ | [METRO-T0+-Base](https://huggingface.co/gonglinyuan/metro_t0p_base) | 226M | Wikibook (16G) | T0+ Train |
+ | [METRO-T0++-Base](https://huggingface.co/gonglinyuan/metro_t0pp_base) | 226M | Wikibook (16G) | T0++ Train |
+ | [METRO-T0-Base++](https://huggingface.co/gonglinyuan/metro_t0_basepp) | 256M | 160G corpus | T0 Train |
+ | [METRO-T0+-Base++](https://huggingface.co/gonglinyuan/metro_t0p_basepp) | 256M | 160G corpus | T0+ Train |
+ | [METRO-T0++-Base++](https://huggingface.co/gonglinyuan/metro_t0pp_basepp) | 256M | 160G corpus | T0++ Train |
+ | [METRO-T0-Large++](https://huggingface.co/gonglinyuan/metro_t0_largepp) | 775M | 160G corpus | T0 Train |
+ | [METRO-T0+-Large++](https://huggingface.co/gonglinyuan/metro_t0p_largepp) | 775M | 160G corpus | T0+ Train |
+ | [METRO-T0++-Large++](https://huggingface.co/gonglinyuan/metro_t0pp_largepp) | 775M | 160G corpus | T0++ Train |
161
+
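The Hub links in the table follow a regular naming pattern, so a repository ID can be built from the prompt-finetuning variant and the model size. The helper below is a small sketch of that pattern as inferred from the nine URLs listed; `metro_t0_repo_id` is a hypothetical name, not part of any library.

```python
# Suffixes as they appear in the Hugging Face URLs of the table above.
FINETUNE_SUFFIX = {"T0": "t0", "T0+": "t0p", "T0++": "t0pp"}
SIZE_SUFFIX = {"Base": "base", "Base++": "basepp", "Large++": "largepp"}


def metro_t0_repo_id(finetune: str = "T0", size: str = "Base") -> str:
    """Build the Hub repository ID for one of the listed METRO-T0 models."""
    if finetune not in FINETUNE_SUFFIX:
        raise ValueError(f"unknown prompt-finetuning variant: {finetune!r}")
    if size not in SIZE_SUFFIX:
        raise ValueError(f"unknown model size: {size!r}")
    return f"gonglinyuan/metro_{FINETUNE_SUFFIX[finetune]}_{SIZE_SUFFIX[size]}"


print(metro_t0_repo_id("T0++", "Large++"))  # gonglinyuan/metro_t0pp_largepp
```

The returned string can be passed to `from_pretrained` in place of `"gonglinyuan/metro_t0_base"`.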
162
+
163
+ ## Citation
164
+
165
+ If you find the code and models useful for your research, please cite the following paper:
166
+
167
+ ```
168
+ TODO
169
+ ```