RichardErkhov committed 95b2eed (verified) · 1 parent: f898101

uploaded readme

Files changed (1): README.md (+268 −0)

Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

tinyroberta-squad2 - bnb 8bits
- Model creator: https://huggingface.co/deepset/
- Original model: https://huggingface.co/deepset/tinyroberta-squad2/
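
To load this 8-bit bitsandbytes quantization, something like the following sketch should work (assuming `transformers`, `accelerate`, and `bitsandbytes` are installed; the repo id below is a placeholder for this repository's actual id):

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, BitsAndBytesConfig

# Placeholder id -- substitute the id of this quantized repository.
repo_id = "RichardErkhov/deepset_-_tinyroberta-squad2-8bits"

model = AutoModelForQuestionAnswering.from_pretrained(
    repo_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",  # bitsandbytes 8-bit inference requires a CUDA device
)
tokenizer = AutoTokenizer.from_pretrained(repo_id)
```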

Original model description:
---
language: en
license: cc-by-4.0
datasets:
- squad_v2
model-index:
- name: deepset/tinyroberta-squad2
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 78.8627
      name: Exact Match
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNDNlZDU4ODAxMzY5NGFiMTMyZmQ1M2ZhZjMyODA1NmFlOGMxNzYxNTA4OGE5YTBkZWViZjBkNGQ2ZmMxZjVlMCIsInZlcnNpb24iOjF9.Wgu599r6TvgMLTrHlLMVAbUtKD_3b70iJ5QSeDQ-bRfUsVk6Sz9OsJCp47riHJVlmSYzcDj_z_3jTcUjCFFXBg
    - type: f1
      value: 82.0355
      name: F1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTFkMzEzMWNiZDRhMGZlODhkYzcwZTZiMDFjZDg2YjllZmUzYWM5NTgwNGQ2NGYyMDk2ZGQwN2JmMTE5NTc3YiIsInZlcnNpb24iOjF9.ChgaYpuRHd5WeDFjtiAHUyczxtoOD_M5WR8834jtbf7wXhdGOnZKdZ1KclmhoI5NuAGc1NptX-G0zQ5FTHEcBA
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - type: exact_match
      value: 83.860
      name: Exact Match
    - type: f1
      value: 90.752
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: adversarial_qa
      type: adversarial_qa
      config: adversarialQA
      split: validation
    metrics:
    - type: exact_match
      value: 25.967
      name: Exact Match
    - type: f1
      value: 37.006
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_adversarial
      type: squad_adversarial
      config: AddOneSent
      split: validation
    metrics:
    - type: exact_match
      value: 76.329
      name: Exact Match
    - type: f1
      value: 83.292
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts amazon
      type: squadshifts
      config: amazon
      split: test
    metrics:
    - type: exact_match
      value: 63.915
      name: Exact Match
    - type: f1
      value: 78.395
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts new_wiki
      type: squadshifts
      config: new_wiki
      split: test
    metrics:
    - type: exact_match
      value: 80.297
      name: Exact Match
    - type: f1
      value: 89.808
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts nyt
      type: squadshifts
      config: nyt
      split: test
    metrics:
    - type: exact_match
      value: 80.149
      name: Exact Match
    - type: f1
      value: 88.321
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts reddit
      type: squadshifts
      config: reddit
      split: test
    metrics:
    - type: exact_match
      value: 66.959
      name: Exact Match
    - type: f1
      value: 79.300
      name: F1
---

# tinyroberta-squad2

This is the *distilled* version of the [deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2) model. It offers comparable prediction quality and runs at twice the speed of the base model.

## Overview
**Language model:** tinyroberta-squad2
**Language:** English
**Downstream-task:** Extractive QA
**Training data:** SQuAD 2.0
**Eval data:** SQuAD 2.0
**Code:** See [an example QA pipeline on Haystack](https://haystack.deepset.ai/tutorials/first-qa-system)
**Infrastructure:** 4x Tesla V100

## Hyperparameters

```
batch_size = 96
n_epochs = 4
base_LM_model = "deepset/tinyroberta-squad2-step1"
max_seq_len = 384
learning_rate = 3e-5
lr_schedule = LinearWarmup
warmup_proportion = 0.2
doc_stride = 128
max_query_length = 64
distillation_loss_weight = 0.75
temperature = 1.5
teacher = "deepset/roberta-large-squad2"
```

## Distillation
This model was distilled using the TinyBERT approach described in [this paper](https://arxiv.org/pdf/1909.10351.pdf) and implemented in [haystack](https://github.com/deepset-ai/haystack).
First, we performed intermediate-layer distillation with roberta-base as the teacher, which resulted in [deepset/tinyroberta-6l-768d](https://huggingface.co/deepset/tinyroberta-6l-768d).
Second, we performed task-specific distillation: further intermediate-layer distillation on an augmented version of SQuAD 2.0 with [deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2) as the teacher, followed by prediction-layer distillation with [deepset/roberta-large-squad2](https://huggingface.co/deepset/roberta-large-squad2) as the teacher.
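
As a rough illustration of the prediction-layer step, here is a minimal PyTorch sketch of a distillation objective combining a temperature-softened KL term with ordinary cross-entropy, using the `temperature` and `distillation_loss_weight` values from the hyperparameters above. The actual implementation lives in Haystack's distillation code, so treat the exact form here as an assumption:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, gold_positions,
                      temperature=1.5, distillation_loss_weight=0.75):
    """Illustrative loss; in QA it would be applied to start and end logits separately."""
    # Soft targets: KL divergence between temperature-softened teacher and
    # student distributions over token positions; T^2 rescales the gradients,
    # following Hinton et al.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard targets: standard cross-entropy against the gold answer positions.
    hard = F.cross_entropy(student_logits, gold_positions)
    return distillation_loss_weight * soft + (1.0 - distillation_loss_weight) * hard
```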

## Usage

### In Haystack
Haystack is an NLP framework by deepset. You can use this model in a Haystack pipeline to do question answering at scale (over many documents). To load the model in [Haystack](https://github.com/deepset-ai/haystack/):

```python
# Reader nodes from Haystack v1.x
from haystack.nodes import FARMReader, TransformersReader

reader = FARMReader(model_name_or_path="deepset/tinyroberta-squad2")
# or
reader = TransformersReader(model_name_or_path="deepset/tinyroberta-squad2")
```
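
To run QA over documents, the reader is typically wired into a retrieval pipeline. A minimal end-to-end sketch, assuming Haystack v1.x (≥1.15 for the in-memory BM25 support); the document store and retriever choices here are illustrative, not part of the original card:

```python
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import BM25Retriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

# Index a toy document; in practice you would write your own corpus here.
document_store = InMemoryDocumentStore(use_bm25=True)
document_store.write_documents(
    [{"content": "Haystack is an NLP framework by deepset."}]
)

retriever = BM25Retriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/tinyroberta-squad2")
pipe = ExtractiveQAPipeline(reader, retriever)

prediction = pipe.run(
    query="Who makes Haystack?",
    params={"Retriever": {"top_k": 3}, "Reader": {"top_k": 1}},
)
print(prediction["answers"][0].answer)
```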

### In Transformers
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "deepset/tinyroberta-squad2"

# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and let people easily switch between frameworks.'
}
res = nlp(QA_input)

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
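
Since the model is trained on SQuAD 2.0, which contains unanswerable questions, you can additionally ask the pipeline to consider the no-answer option (assuming a reasonably recent `transformers` version):

```python
# Allow an empty answer to be returned when the question is unanswerable,
# matching the SQuAD 2.0 setup the model was trained on.
res = nlp(QA_input, handle_impossible_answer=True)
```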

## Performance
Evaluated on the SQuAD 2.0 dev set with the [official eval script](https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/).

```
"exact": 78.69114798281817,
"f1": 81.9198998536977,

"total": 11873,
"HasAns_exact": 76.19770580296895,
"HasAns_f1": 82.66446878592329,
"HasAns_total": 5928,
"NoAns_exact": 81.17746005046257,
"NoAns_f1": 81.17746005046257,
"NoAns_total": 5945
```

## Authors
**Branden Chan:** [email protected]
**Timo Möller:** [email protected]
**Malte Pietsch:** [email protected]
**Tanay Soni:** [email protected]
**Michel Bartels:** [email protected]

## About us

<div class="grid lg:grid-cols-2 gap-x-4 gap-y-3">
    <div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
        <img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/deepset-logo-colored.png" class="w-40"/>
    </div>
    <div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
        <img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/haystack-logo-colored.png" class="w-40"/>
    </div>
</div>

[deepset](http://deepset.ai/) is the company behind the open-source NLP framework [Haystack](https://haystack.deepset.ai/), which is designed to help you build production-ready NLP systems for question answering, summarization, ranking, and more.

Some of our other work:
- [roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2)
- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR datasets and models (aka "gelectra-base-germanquad", "gbert-base-germandpr")](https://deepset.ai/germanquad)

## Get in touch and join the Haystack community

<p>For more info on Haystack, visit our <strong><a href="https://github.com/deepset-ai/haystack">GitHub</a></strong> repo and <strong><a href="https://docs.haystack.deepset.ai">Documentation</a></strong>.

We also have a <strong><a class="h-7" href="https://haystack.deepset.ai/community/join">Discord community open to everyone!</a></strong></p>

[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Discord](https://haystack.deepset.ai/community) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](http://www.deepset.ai/jobs)