Commit c488321 (verified) by AmedeoBonatti
1 Parent(s): 6f8c496

AmedeoBonatti/nlp_te_mlm_scibert

README.md ADDED
@@ -0,0 +1,557 @@
+ ---
+ base_model: allenai/scibert_scivocab_uncased
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: mlm_scibert_uncased
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # mlm_scibert_uncased
+
+ This model is a fine-tuned version of [allenai/scibert_scivocab_uncased](https://huggingface.co/allenai/scibert_scivocab_uncased) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.2966
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0001
+ - train_batch_size: 16
+ - eval_batch_size: 8
+ - seed: 1234
+ - gradient_accumulation_steps: 16
+ - total_train_batch_size: 256
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 500
+ - mixed_precision_training: Native AMP
+
+ ### Training results
+
49
+ | Training Loss | Epoch | Step | Validation Loss |
50
+ |:-------------:|:--------:|:-----:|:---------------:|
51
+ | 1.3835 | 0.9963 | 152 | 1.2575 |
52
+ | 1.3098 | 1.9992 | 305 | 1.2320 |
53
+ | 1.2883 | 2.9955 | 457 | 1.2204 |
54
+ | 1.2612 | 3.9984 | 610 | 1.2113 |
55
+ | 1.2501 | 4.9947 | 762 | 1.2043 |
56
+ | 1.2292 | 5.9975 | 915 | 1.1910 |
57
+ | 1.2269 | 6.9939 | 1067 | 1.1859 |
58
+ | 1.2096 | 7.9967 | 1220 | 1.1841 |
59
+ | 1.1992 | 8.9996 | 1373 | 1.1772 |
60
+ | 1.1973 | 9.9959 | 1525 | 1.1806 |
61
+ | 1.1801 | 10.9988 | 1678 | 1.1720 |
62
+ | 1.1822 | 11.9951 | 1830 | 1.1699 |
63
+ | 1.1693 | 12.9980 | 1983 | 1.1674 |
64
+ | 1.1677 | 13.9943 | 2135 | 1.1641 |
65
+ | 1.1529 | 14.9971 | 2288 | 1.1597 |
66
+ | 1.1448 | 16.0 | 2441 | 1.1613 |
67
+ | 1.1499 | 16.9963 | 2593 | 1.1579 |
68
+ | 1.1363 | 17.9992 | 2746 | 1.1580 |
69
+ | 1.1369 | 18.9955 | 2898 | 1.1640 |
70
+ | 1.1247 | 19.9984 | 3051 | 1.1560 |
71
+ | 1.1283 | 20.9947 | 3203 | 1.1505 |
72
+ | 1.1173 | 21.9975 | 3356 | 1.1497 |
73
+ | 1.1171 | 22.9939 | 3508 | 1.1522 |
74
+ | 1.1064 | 23.9967 | 3661 | 1.1505 |
75
+ | 1.1034 | 24.9996 | 3814 | 1.1471 |
76
+ | 1.1051 | 25.9959 | 3966 | 1.1447 |
77
+ | 1.0949 | 26.9988 | 4119 | 1.1414 |
78
+ | 1.0975 | 27.9951 | 4271 | 1.1470 |
79
+ | 1.0852 | 28.9980 | 4424 | 1.1497 |
80
+ | 1.0888 | 29.9943 | 4576 | 1.1472 |
81
+ | 1.0779 | 30.9971 | 4729 | 1.1418 |
82
+ | 1.0741 | 32.0 | 4882 | 1.1430 |
83
+ | 1.0761 | 32.9963 | 5034 | 1.1448 |
84
+ | 1.0674 | 33.9992 | 5187 | 1.1447 |
85
+ | 1.0712 | 34.9955 | 5339 | 1.1451 |
86
+ | 1.0604 | 35.9984 | 5492 | 1.1440 |
87
+ | 1.0612 | 36.9947 | 5644 | 1.1423 |
88
+ | 1.0549 | 37.9975 | 5797 | 1.1460 |
89
+ | 1.0553 | 38.9939 | 5949 | 1.1456 |
90
+ | 1.0469 | 39.9967 | 6102 | 1.1436 |
91
+ | 1.0411 | 40.9996 | 6255 | 1.1401 |
92
+ | 1.0474 | 41.9959 | 6407 | 1.1395 |
93
+ | 1.0373 | 42.9988 | 6560 | 1.1423 |
94
+ | 1.0399 | 43.9951 | 6712 | 1.1442 |
95
+ | 1.0317 | 44.9980 | 6865 | 1.1443 |
96
+ | 1.0355 | 45.9943 | 7017 | 1.1427 |
97
+ | 1.0259 | 46.9971 | 7170 | 1.1424 |
98
+ | 1.0228 | 48.0 | 7323 | 1.1396 |
99
+ | 1.0285 | 48.9963 | 7475 | 1.1434 |
100
+ | 1.0179 | 49.9992 | 7628 | 1.1407 |
101
+ | 1.0209 | 50.9955 | 7780 | 1.1427 |
102
+ | 1.0132 | 51.9984 | 7933 | 1.1418 |
103
+ | 1.0159 | 52.9947 | 8085 | 1.1344 |
104
+ | 1.0058 | 53.9975 | 8238 | 1.1401 |
105
+ | 1.0113 | 54.9939 | 8390 | 1.1429 |
106
+ | 1.0021 | 55.9967 | 8543 | 1.1424 |
107
+ | 0.9995 | 56.9996 | 8696 | 1.1426 |
108
+ | 1.0048 | 57.9959 | 8848 | 1.1389 |
109
+ | 0.9951 | 58.9988 | 9001 | 1.1387 |
110
+ | 1.0011 | 59.9951 | 9153 | 1.1410 |
111
+ | 0.9901 | 60.9980 | 9306 | 1.1399 |
112
+ | 0.9925 | 61.9943 | 9458 | 1.1416 |
113
+ | 0.9835 | 62.9971 | 9611 | 1.1416 |
114
+ | 0.9846 | 64.0 | 9764 | 1.1458 |
115
+ | 0.9878 | 64.9963 | 9916 | 1.1452 |
116
+ | 0.9792 | 65.9992 | 10069 | 1.1459 |
117
+ | 0.9813 | 66.9955 | 10221 | 1.1415 |
118
+ | 0.9747 | 67.9984 | 10374 | 1.1476 |
119
+ | 0.9764 | 68.9947 | 10526 | 1.1474 |
120
+ | 0.971 | 69.9975 | 10679 | 1.1509 |
121
+ | 0.9728 | 70.9939 | 10831 | 1.1441 |
122
+ | 0.9672 | 71.9967 | 10984 | 1.1466 |
123
+ | 0.9627 | 72.9996 | 11137 | 1.1425 |
124
+ | 0.9678 | 73.9959 | 11289 | 1.1445 |
125
+ | 0.9609 | 74.9988 | 11442 | 1.1435 |
126
+ | 0.9636 | 75.9951 | 11594 | 1.1408 |
127
+ | 0.9553 | 76.9980 | 11747 | 1.1468 |
128
+ | 0.9608 | 77.9943 | 11899 | 1.1460 |
129
+ | 0.9506 | 78.9971 | 12052 | 1.1475 |
130
+ | 0.9505 | 80.0 | 12205 | 1.1460 |
131
+ | 0.9535 | 80.9963 | 12357 | 1.1467 |
132
+ | 0.9471 | 81.9992 | 12510 | 1.1495 |
133
+ | 0.9509 | 82.9955 | 12662 | 1.1484 |
134
+ | 0.9412 | 83.9984 | 12815 | 1.1486 |
135
+ | 0.9456 | 84.9947 | 12967 | 1.1468 |
136
+ | 0.9385 | 85.9975 | 13120 | 1.1497 |
137
+ | 0.945 | 86.9939 | 13272 | 1.1501 |
138
+ | 0.9351 | 87.9967 | 13425 | 1.1483 |
139
+ | 0.9324 | 88.9996 | 13578 | 1.1497 |
140
+ | 0.9376 | 89.9959 | 13730 | 1.1501 |
141
+ | 0.9295 | 90.9988 | 13883 | 1.1469 |
142
+ | 0.9345 | 91.9951 | 14035 | 1.1554 |
143
+ | 0.9267 | 92.9980 | 14188 | 1.1485 |
144
+ | 0.931 | 93.9943 | 14340 | 1.1508 |
145
+ | 0.9225 | 94.9971 | 14493 | 1.1536 |
146
+ | 0.9208 | 96.0 | 14646 | 1.1495 |
147
+ | 0.9254 | 96.9963 | 14798 | 1.1522 |
148
+ | 0.9177 | 97.9992 | 14951 | 1.1550 |
149
+ | 0.9199 | 98.9955 | 15103 | 1.1575 |
150
+ | 0.9144 | 99.9984 | 15256 | 1.1563 |
151
+ | 0.9174 | 100.9947 | 15408 | 1.1518 |
152
+ | 0.911 | 101.9975 | 15561 | 1.1560 |
153
+ | 0.9135 | 102.9939 | 15713 | 1.1543 |
154
+ | 0.9044 | 103.9967 | 15866 | 1.1549 |
155
+ | 0.905 | 104.9996 | 16019 | 1.1568 |
156
+ | 0.9106 | 105.9959 | 16171 | 1.1567 |
157
+ | 0.902 | 106.9988 | 16324 | 1.1555 |
158
+ | 0.9068 | 107.9951 | 16476 | 1.1580 |
159
+ | 0.8973 | 108.9980 | 16629 | 1.1562 |
160
+ | 0.9038 | 109.9943 | 16781 | 1.1612 |
161
+ | 0.8957 | 110.9971 | 16934 | 1.1514 |
162
+ | 0.8949 | 112.0 | 17087 | 1.1571 |
163
+ | 0.8989 | 112.9963 | 17239 | 1.1634 |
164
+ | 0.8927 | 113.9992 | 17392 | 1.1621 |
165
+ | 0.8954 | 114.9955 | 17544 | 1.1572 |
166
+ | 0.8876 | 115.9984 | 17697 | 1.1604 |
167
+ | 0.8917 | 116.9947 | 17849 | 1.1660 |
168
+ | 0.8841 | 117.9975 | 18002 | 1.1564 |
169
+ | 0.8893 | 118.9939 | 18154 | 1.1624 |
170
+ | 0.8808 | 119.9967 | 18307 | 1.1668 |
171
+ | 0.8825 | 120.9996 | 18460 | 1.1608 |
172
+ | 0.8848 | 121.9959 | 18612 | 1.1600 |
173
+ | 0.878 | 122.9988 | 18765 | 1.1650 |
174
+ | 0.8818 | 123.9951 | 18917 | 1.1671 |
175
+ | 0.8748 | 124.9980 | 19070 | 1.1668 |
176
+ | 0.8787 | 125.9943 | 19222 | 1.1605 |
177
+ | 0.8727 | 126.9971 | 19375 | 1.1649 |
178
+ | 0.8701 | 128.0 | 19528 | 1.1675 |
179
+ | 0.875 | 128.9963 | 19680 | 1.1639 |
180
+ | 0.8669 | 129.9992 | 19833 | 1.1698 |
181
+ | 0.8714 | 130.9955 | 19985 | 1.1726 |
182
+ | 0.8657 | 131.9984 | 20138 | 1.1680 |
183
+ | 0.8682 | 132.9947 | 20290 | 1.1695 |
184
+ | 0.8623 | 133.9975 | 20443 | 1.1774 |
185
+ | 0.8659 | 134.9939 | 20595 | 1.1718 |
186
+ | 0.8606 | 135.9967 | 20748 | 1.1691 |
187
+ | 0.8587 | 136.9996 | 20901 | 1.1668 |
188
+ | 0.8635 | 137.9959 | 21053 | 1.1742 |
189
+ | 0.8567 | 138.9988 | 21206 | 1.1707 |
190
+ | 0.8607 | 139.9951 | 21358 | 1.1756 |
191
+ | 0.8519 | 140.9980 | 21511 | 1.1742 |
192
+ | 0.8558 | 141.9943 | 21663 | 1.1733 |
193
+ | 0.8518 | 142.9971 | 21816 | 1.1761 |
194
+ | 0.85 | 144.0 | 21969 | 1.1734 |
195
+ | 0.8536 | 144.9963 | 22121 | 1.1788 |
196
+ | 0.8469 | 145.9992 | 22274 | 1.1782 |
197
+ | 0.85 | 146.9955 | 22426 | 1.1773 |
198
+ | 0.8416 | 147.9984 | 22579 | 1.1731 |
199
+ | 0.8496 | 148.9947 | 22731 | 1.1767 |
200
+ | 0.842 | 149.9975 | 22884 | 1.1743 |
201
+ | 0.8452 | 150.9939 | 23036 | 1.1778 |
202
+ | 0.8379 | 151.9967 | 23189 | 1.1843 |
203
+ | 0.8379 | 152.9996 | 23342 | 1.1804 |
204
+ | 0.8425 | 153.9959 | 23494 | 1.1803 |
205
+ | 0.8332 | 154.9988 | 23647 | 1.1818 |
206
+ | 0.8394 | 155.9951 | 23799 | 1.1805 |
207
+ | 0.8307 | 156.9980 | 23952 | 1.1841 |
208
+ | 0.836 | 157.9943 | 24104 | 1.1835 |
209
+ | 0.8305 | 158.9971 | 24257 | 1.1823 |
210
+ | 0.8298 | 160.0 | 24410 | 1.1768 |
211
+ | 0.8329 | 160.9963 | 24562 | 1.1836 |
212
+ | 0.8271 | 161.9992 | 24715 | 1.1841 |
213
+ | 0.8316 | 162.9955 | 24867 | 1.1848 |
214
+ | 0.825 | 163.9984 | 25020 | 1.1807 |
215
+ | 0.8287 | 164.9947 | 25172 | 1.1866 |
216
+ | 0.821 | 165.9975 | 25325 | 1.1866 |
217
+ | 0.8249 | 166.9939 | 25477 | 1.1887 |
218
+ | 0.8188 | 167.9967 | 25630 | 1.1882 |
219
+ | 0.8192 | 168.9996 | 25783 | 1.1891 |
220
+ | 0.8215 | 169.9959 | 25935 | 1.1921 |
221
+ | 0.8162 | 170.9988 | 26088 | 1.1891 |
222
+ | 0.8213 | 171.9951 | 26240 | 1.1929 |
223
+ | 0.8145 | 172.9980 | 26393 | 1.1881 |
224
+ | 0.8177 | 173.9943 | 26545 | 1.1878 |
225
+ | 0.8123 | 174.9971 | 26698 | 1.1919 |
226
+ | 0.8097 | 176.0 | 26851 | 1.1922 |
227
+ | 0.8156 | 176.9963 | 27003 | 1.1957 |
228
+ | 0.8077 | 177.9992 | 27156 | 1.1945 |
229
+ | 0.812 | 178.9955 | 27308 | 1.1942 |
230
+ | 0.8069 | 179.9984 | 27461 | 1.1913 |
231
+ | 0.8108 | 180.9947 | 27613 | 1.1962 |
232
+ | 0.8041 | 181.9975 | 27766 | 1.1992 |
233
+ | 0.8072 | 182.9939 | 27918 | 1.1976 |
234
+ | 0.8021 | 183.9967 | 28071 | 1.1981 |
235
+ | 0.8018 | 184.9996 | 28224 | 1.1958 |
236
+ | 0.8041 | 185.9959 | 28376 | 1.2022 |
237
+ | 0.7978 | 186.9988 | 28529 | 1.1981 |
238
+ | 0.8019 | 187.9951 | 28681 | 1.1957 |
239
+ | 0.7966 | 188.9980 | 28834 | 1.1995 |
240
+ | 0.7989 | 189.9943 | 28986 | 1.1947 |
241
+ | 0.7928 | 190.9971 | 29139 | 1.1966 |
242
+ | 0.7915 | 192.0 | 29292 | 1.2022 |
243
+ | 0.7975 | 192.9963 | 29444 | 1.2062 |
244
+ | 0.7918 | 193.9992 | 29597 | 1.2031 |
245
+ | 0.7952 | 194.9955 | 29749 | 1.2034 |
246
+ | 0.7894 | 195.9984 | 29902 | 1.2060 |
247
+ | 0.791 | 196.9947 | 30054 | 1.2040 |
248
+ | 0.7868 | 197.9975 | 30207 | 1.2054 |
249
+ | 0.7899 | 198.9939 | 30359 | 1.2046 |
250
+ | 0.7859 | 199.9967 | 30512 | 1.2023 |
251
+ | 0.7851 | 200.9996 | 30665 | 1.2075 |
252
+ | 0.7885 | 201.9959 | 30817 | 1.2074 |
253
+ | 0.7822 | 202.9988 | 30970 | 1.2052 |
254
+ | 0.7868 | 203.9951 | 31122 | 1.2048 |
255
+ | 0.7809 | 204.9980 | 31275 | 1.2070 |
256
+ | 0.7847 | 205.9943 | 31427 | 1.2096 |
257
+ | 0.7778 | 206.9971 | 31580 | 1.2082 |
258
+ | 0.7782 | 208.0 | 31733 | 1.2147 |
259
+ | 0.7813 | 208.9963 | 31885 | 1.2137 |
260
+ | 0.775 | 209.9992 | 32038 | 1.2115 |
261
+ | 0.7785 | 210.9955 | 32190 | 1.2203 |
262
+ | 0.7733 | 211.9984 | 32343 | 1.2108 |
263
+ | 0.7771 | 212.9947 | 32495 | 1.2173 |
264
+ | 0.7711 | 213.9975 | 32648 | 1.2123 |
265
+ | 0.7765 | 214.9939 | 32800 | 1.2156 |
266
+ | 0.77 | 215.9967 | 32953 | 1.2182 |
267
+ | 0.7673 | 216.9996 | 33106 | 1.2223 |
268
+ | 0.774 | 217.9959 | 33258 | 1.2144 |
269
+ | 0.7666 | 218.9988 | 33411 | 1.2144 |
270
+ | 0.7721 | 219.9951 | 33563 | 1.2165 |
271
+ | 0.7646 | 220.9980 | 33716 | 1.2195 |
272
+ | 0.769 | 221.9943 | 33868 | 1.2157 |
273
+ | 0.7625 | 222.9971 | 34021 | 1.2166 |
274
+ | 0.7619 | 224.0 | 34174 | 1.2171 |
275
+ | 0.7662 | 224.9963 | 34326 | 1.2183 |
276
+ | 0.7585 | 225.9992 | 34479 | 1.2243 |
277
+ | 0.764 | 226.9955 | 34631 | 1.2159 |
278
+ | 0.76 | 227.9984 | 34784 | 1.2215 |
279
+ | 0.7619 | 228.9947 | 34936 | 1.2161 |
280
+ | 0.758 | 229.9975 | 35089 | 1.2174 |
281
+ | 0.7613 | 230.9939 | 35241 | 1.2236 |
282
+ | 0.7547 | 231.9967 | 35394 | 1.2234 |
283
+ | 0.7562 | 232.9996 | 35547 | 1.2258 |
284
+ | 0.7572 | 233.9959 | 35699 | 1.2218 |
285
+ | 0.7514 | 234.9988 | 35852 | 1.2235 |
286
+ | 0.7559 | 235.9951 | 36004 | 1.2264 |
287
+ | 0.7515 | 236.9980 | 36157 | 1.2243 |
288
+ | 0.7555 | 237.9943 | 36309 | 1.2245 |
289
+ | 0.7497 | 238.9971 | 36462 | 1.2238 |
290
+ | 0.7467 | 240.0 | 36615 | 1.2260 |
291
+ | 0.7524 | 240.9963 | 36767 | 1.2251 |
292
+ | 0.7448 | 241.9992 | 36920 | 1.2267 |
293
+ | 0.7498 | 242.9955 | 37072 | 1.2293 |
294
+ | 0.7433 | 243.9984 | 37225 | 1.2358 |
295
+ | 0.7468 | 244.9947 | 37377 | 1.2337 |
296
+ | 0.7431 | 245.9975 | 37530 | 1.2285 |
297
+ | 0.7474 | 246.9939 | 37682 | 1.2304 |
298
+ | 0.7413 | 247.9967 | 37835 | 1.2341 |
299
+ | 0.7385 | 248.9996 | 37988 | 1.2318 |
300
+ | 0.7453 | 249.9959 | 38140 | 1.2336 |
301
+ | 0.7377 | 250.9988 | 38293 | 1.2301 |
302
+ | 0.7415 | 251.9951 | 38445 | 1.2303 |
303
+ | 0.7388 | 252.9980 | 38598 | 1.2327 |
304
+ | 0.7397 | 253.9943 | 38750 | 1.2364 |
305
+ | 0.7347 | 254.9971 | 38903 | 1.2324 |
306
+ | 0.7334 | 256.0 | 39056 | 1.2358 |
307
+ | 0.7407 | 256.9963 | 39208 | 1.2335 |
308
+ | 0.7322 | 257.9992 | 39361 | 1.2353 |
309
+ | 0.7354 | 258.9955 | 39513 | 1.2348 |
310
+ | 0.7287 | 259.9984 | 39666 | 1.2342 |
311
+ | 0.7351 | 260.9947 | 39818 | 1.2341 |
312
+ | 0.7294 | 261.9975 | 39971 | 1.2317 |
313
+ | 0.7321 | 262.9939 | 40123 | 1.2390 |
314
+ | 0.7278 | 263.9967 | 40276 | 1.2386 |
315
+ | 0.7264 | 264.9996 | 40429 | 1.2357 |
316
+ | 0.7303 | 265.9959 | 40581 | 1.2428 |
317
+ | 0.7254 | 266.9988 | 40734 | 1.2405 |
318
+ | 0.7273 | 267.9951 | 40886 | 1.2439 |
319
+ | 0.7248 | 268.9980 | 41039 | 1.2351 |
320
+ | 0.7293 | 269.9943 | 41191 | 1.2394 |
321
+ | 0.7217 | 270.9971 | 41344 | 1.2433 |
322
+ | 0.7212 | 272.0 | 41497 | 1.2461 |
323
+ | 0.7256 | 272.9963 | 41649 | 1.2419 |
324
+ | 0.7189 | 273.9992 | 41802 | 1.2393 |
325
+ | 0.7247 | 274.9955 | 41954 | 1.2442 |
326
+ | 0.7186 | 275.9984 | 42107 | 1.2400 |
327
+ | 0.7242 | 276.9947 | 42259 | 1.2433 |
328
+ | 0.7165 | 277.9975 | 42412 | 1.2464 |
329
+ | 0.7208 | 278.9939 | 42564 | 1.2397 |
330
+ | 0.7142 | 279.9967 | 42717 | 1.2488 |
331
+ | 0.7161 | 280.9996 | 42870 | 1.2467 |
332
+ | 0.7182 | 281.9959 | 43022 | 1.2499 |
333
+ | 0.7145 | 282.9988 | 43175 | 1.2444 |
334
+ | 0.7182 | 283.9951 | 43327 | 1.2507 |
335
+ | 0.7117 | 284.9980 | 43480 | 1.2477 |
336
+ | 0.715 | 285.9943 | 43632 | 1.2499 |
337
+ | 0.7122 | 286.9971 | 43785 | 1.2483 |
338
+ | 0.7101 | 288.0 | 43938 | 1.2442 |
339
+ | 0.7138 | 288.9963 | 44090 | 1.2497 |
340
+ | 0.7078 | 289.9992 | 44243 | 1.2477 |
341
+ | 0.7111 | 290.9955 | 44395 | 1.2485 |
342
+ | 0.7053 | 291.9984 | 44548 | 1.2483 |
343
+ | 0.7105 | 292.9947 | 44700 | 1.2529 |
344
+ | 0.7056 | 293.9975 | 44853 | 1.2566 |
345
+ | 0.7088 | 294.9939 | 45005 | 1.2476 |
346
+ | 0.7054 | 295.9967 | 45158 | 1.2536 |
347
+ | 0.704 | 296.9996 | 45311 | 1.2519 |
348
+ | 0.7082 | 297.9959 | 45463 | 1.2581 |
349
+ | 0.7009 | 298.9988 | 45616 | 1.2609 |
350
+ | 0.7052 | 299.9951 | 45768 | 1.2549 |
351
+ | 0.6984 | 300.9980 | 45921 | 1.2517 |
352
+ | 0.7056 | 301.9943 | 46073 | 1.2585 |
353
+ | 0.7002 | 302.9971 | 46226 | 1.2567 |
354
+ | 0.6981 | 304.0 | 46379 | 1.2573 |
355
+ | 0.7016 | 304.9963 | 46531 | 1.2585 |
356
+ | 0.6971 | 305.9992 | 46684 | 1.2632 |
357
+ | 0.7008 | 306.9955 | 46836 | 1.2587 |
358
+ | 0.6975 | 307.9984 | 46989 | 1.2580 |
359
+ | 0.6984 | 308.9947 | 47141 | 1.2535 |
360
+ | 0.6946 | 309.9975 | 47294 | 1.2576 |
361
+ | 0.6982 | 310.9939 | 47446 | 1.2610 |
362
+ | 0.6922 | 311.9967 | 47599 | 1.2632 |
363
+ | 0.694 | 312.9996 | 47752 | 1.2518 |
364
+ | 0.6967 | 313.9959 | 47904 | 1.2588 |
365
+ | 0.6895 | 314.9988 | 48057 | 1.2643 |
366
+ | 0.6954 | 315.9951 | 48209 | 1.2630 |
367
+ | 0.6899 | 316.9980 | 48362 | 1.2620 |
368
+ | 0.6932 | 317.9943 | 48514 | 1.2606 |
369
+ | 0.6878 | 318.9971 | 48667 | 1.2632 |
370
+ | 0.6895 | 320.0 | 48820 | 1.2623 |
371
+ | 0.6916 | 320.9963 | 48972 | 1.2665 |
372
+ | 0.6873 | 321.9992 | 49125 | 1.2636 |
373
+ | 0.6914 | 322.9955 | 49277 | 1.2631 |
374
+ | 0.6852 | 323.9984 | 49430 | 1.2631 |
375
+ | 0.6891 | 324.9947 | 49582 | 1.2628 |
376
+ | 0.6843 | 325.9975 | 49735 | 1.2654 |
377
+ | 0.6875 | 326.9939 | 49887 | 1.2656 |
378
+ | 0.6818 | 327.9967 | 50040 | 1.2660 |
379
+ | 0.683 | 328.9996 | 50193 | 1.2654 |
380
+ | 0.6866 | 329.9959 | 50345 | 1.2701 |
381
+ | 0.6803 | 330.9988 | 50498 | 1.2647 |
382
+ | 0.6843 | 331.9951 | 50650 | 1.2735 |
383
+ | 0.68 | 332.9980 | 50803 | 1.2663 |
384
+ | 0.6836 | 333.9943 | 50955 | 1.2659 |
385
+ | 0.6792 | 334.9971 | 51108 | 1.2723 |
386
+ | 0.6775 | 336.0 | 51261 | 1.2719 |
387
+ | 0.681 | 336.9963 | 51413 | 1.2684 |
388
+ | 0.6772 | 337.9992 | 51566 | 1.2722 |
389
+ | 0.6806 | 338.9955 | 51718 | 1.2745 |
390
+ | 0.6749 | 339.9984 | 51871 | 1.2762 |
391
+ | 0.6778 | 340.9947 | 52023 | 1.2767 |
392
+ | 0.6752 | 341.9975 | 52176 | 1.2727 |
393
+ | 0.6783 | 342.9939 | 52328 | 1.2757 |
394
+ | 0.6725 | 343.9967 | 52481 | 1.2732 |
395
+ | 0.6744 | 344.9996 | 52634 | 1.2728 |
396
+ | 0.6756 | 345.9959 | 52786 | 1.2736 |
397
+ | 0.6709 | 346.9988 | 52939 | 1.2731 |
398
+ | 0.6763 | 347.9951 | 53091 | 1.2749 |
399
+ | 0.6708 | 348.9980 | 53244 | 1.2774 |
400
+ | 0.673 | 349.9943 | 53396 | 1.2710 |
401
+ | 0.6685 | 350.9971 | 53549 | 1.2692 |
402
+ | 0.6677 | 352.0 | 53702 | 1.2675 |
403
+ | 0.6711 | 352.9963 | 53854 | 1.2767 |
404
+ | 0.6683 | 353.9992 | 54007 | 1.2760 |
405
+ | 0.6732 | 354.9955 | 54159 | 1.2743 |
406
+ | 0.6676 | 355.9984 | 54312 | 1.2797 |
407
+ | 0.6713 | 356.9947 | 54464 | 1.2764 |
408
+ | 0.6651 | 357.9975 | 54617 | 1.2807 |
409
+ | 0.6689 | 358.9939 | 54769 | 1.2758 |
410
+ | 0.6632 | 359.9967 | 54922 | 1.2839 |
411
+ | 0.6632 | 360.9996 | 55075 | 1.2807 |
412
+ | 0.6659 | 361.9959 | 55227 | 1.2760 |
413
+ | 0.6622 | 362.9988 | 55380 | 1.2812 |
414
+ | 0.6669 | 363.9951 | 55532 | 1.2761 |
415
+ | 0.6616 | 364.9980 | 55685 | 1.2868 |
416
+ | 0.6656 | 365.9943 | 55837 | 1.2766 |
417
+ | 0.6606 | 366.9971 | 55990 | 1.2851 |
418
+ | 0.659 | 368.0 | 56143 | 1.2815 |
419
+ | 0.665 | 368.9963 | 56295 | 1.2810 |
420
+ | 0.6585 | 369.9992 | 56448 | 1.2818 |
421
+ | 0.6636 | 370.9955 | 56600 | 1.2826 |
422
+ | 0.658 | 371.9984 | 56753 | 1.2799 |
423
+ | 0.6633 | 372.9947 | 56905 | 1.2915 |
424
+ | 0.657 | 373.9975 | 57058 | 1.2803 |
425
+ | 0.6623 | 374.9939 | 57210 | 1.2872 |
426
+ | 0.6561 | 375.9967 | 57363 | 1.2847 |
427
+ | 0.656 | 376.9996 | 57516 | 1.2834 |
428
+ | 0.6595 | 377.9959 | 57668 | 1.2858 |
429
+ | 0.6546 | 378.9988 | 57821 | 1.2834 |
430
+ | 0.6572 | 379.9951 | 57973 | 1.2869 |
431
+ | 0.653 | 380.9980 | 58126 | 1.2772 |
432
+ | 0.6566 | 381.9943 | 58278 | 1.2936 |
433
+ | 0.6533 | 382.9971 | 58431 | 1.2910 |
434
+ | 0.6543 | 384.0 | 58584 | 1.2846 |
435
+ | 0.6555 | 384.9963 | 58736 | 1.2881 |
436
+ | 0.6508 | 385.9992 | 58889 | 1.2898 |
437
+ | 0.6547 | 386.9955 | 59041 | 1.2879 |
438
+ | 0.6496 | 387.9984 | 59194 | 1.2865 |
439
+ | 0.6531 | 388.9947 | 59346 | 1.2861 |
440
+ | 0.6481 | 389.9975 | 59499 | 1.2832 |
441
+ | 0.6539 | 390.9939 | 59651 | 1.2895 |
442
+ | 0.6476 | 391.9967 | 59804 | 1.2838 |
443
+ | 0.6489 | 392.9996 | 59957 | 1.2923 |
444
+ | 0.6519 | 393.9959 | 60109 | 1.2871 |
445
+ | 0.647 | 394.9988 | 60262 | 1.2846 |
446
+ | 0.6491 | 395.9951 | 60414 | 1.2914 |
447
+ | 0.6459 | 396.9980 | 60567 | 1.2886 |
448
+ | 0.6496 | 397.9943 | 60719 | 1.2891 |
449
+ | 0.6452 | 398.9971 | 60872 | 1.2861 |
450
+ | 0.6439 | 400.0 | 61025 | 1.2917 |
451
+ | 0.6484 | 400.9963 | 61177 | 1.2934 |
452
+ | 0.6446 | 401.9992 | 61330 | 1.2872 |
453
+ | 0.6493 | 402.9955 | 61482 | 1.2900 |
454
+ | 0.6423 | 403.9984 | 61635 | 1.2940 |
455
+ | 0.6469 | 404.9947 | 61787 | 1.2867 |
456
+ | 0.6412 | 405.9975 | 61940 | 1.2958 |
457
+ | 0.6468 | 406.9939 | 62092 | 1.2906 |
458
+ | 0.6428 | 407.9967 | 62245 | 1.2904 |
459
+ | 0.6409 | 408.9996 | 62398 | 1.2924 |
460
+ | 0.6464 | 409.9959 | 62550 | 1.2953 |
461
+ | 0.6404 | 410.9988 | 62703 | 1.2918 |
462
+ | 0.6452 | 411.9951 | 62855 | 1.2894 |
463
+ | 0.6406 | 412.9980 | 63008 | 1.2975 |
464
+ | 0.6442 | 413.9943 | 63160 | 1.2928 |
465
+ | 0.638 | 414.9971 | 63313 | 1.2948 |
466
+ | 0.6379 | 416.0 | 63466 | 1.2936 |
467
+ | 0.6416 | 416.9963 | 63618 | 1.2892 |
468
+ | 0.639 | 417.9992 | 63771 | 1.2959 |
469
+ | 0.6414 | 418.9955 | 63923 | 1.2940 |
470
+ | 0.6363 | 419.9984 | 64076 | 1.2949 |
471
+ | 0.6409 | 420.9947 | 64228 | 1.2943 |
472
+ | 0.6346 | 421.9975 | 64381 | 1.2974 |
473
+ | 0.6393 | 422.9939 | 64533 | 1.3000 |
474
+ | 0.6331 | 423.9967 | 64686 | 1.2944 |
475
+ | 0.636 | 424.9996 | 64839 | 1.2915 |
476
+ | 0.6383 | 425.9959 | 64991 | 1.2986 |
477
+ | 0.6338 | 426.9988 | 65144 | 1.2981 |
478
+ | 0.6378 | 427.9951 | 65296 | 1.2980 |
479
+ | 0.634 | 428.9980 | 65449 | 1.2958 |
480
+ | 0.6374 | 429.9943 | 65601 | 1.2959 |
481
+ | 0.6312 | 430.9971 | 65754 | 1.2918 |
482
+ | 0.6317 | 432.0 | 65907 | 1.2972 |
483
+ | 0.6352 | 432.9963 | 66059 | 1.2970 |
484
+ | 0.6319 | 433.9992 | 66212 | 1.2969 |
485
+ | 0.6334 | 434.9955 | 66364 | 1.2997 |
486
+ | 0.6296 | 435.9984 | 66517 | 1.2967 |
487
+ | 0.6352 | 436.9947 | 66669 | 1.2979 |
488
+ | 0.6302 | 437.9975 | 66822 | 1.2999 |
489
+ | 0.6323 | 438.9939 | 66974 | 1.2989 |
490
+ | 0.6287 | 439.9967 | 67127 | 1.2933 |
491
+ | 0.6295 | 440.9996 | 67280 | 1.2979 |
492
+ | 0.6335 | 441.9959 | 67432 | 1.2979 |
493
+ | 0.6273 | 442.9988 | 67585 | 1.2917 |
494
+ | 0.6308 | 443.9951 | 67737 | 1.3001 |
495
+ | 0.6278 | 444.9980 | 67890 | 1.2948 |
496
+ | 0.6303 | 445.9943 | 68042 | 1.3005 |
497
+ | 0.6278 | 446.9971 | 68195 | 1.2962 |
498
+ | 0.6274 | 448.0 | 68348 | 1.2969 |
499
+ | 0.6287 | 448.9963 | 68500 | 1.2953 |
500
+ | 0.6276 | 449.9992 | 68653 | 1.2983 |
501
+ | 0.629 | 450.9955 | 68805 | 1.3040 |
502
+ | 0.6249 | 451.9984 | 68958 | 1.2992 |
503
+ | 0.6307 | 452.9947 | 69110 | 1.2992 |
504
+ | 0.626 | 453.9975 | 69263 | 1.2975 |
505
+ | 0.6283 | 454.9939 | 69415 | 1.2983 |
506
+ | 0.6262 | 455.9967 | 69568 | 1.3002 |
507
+ | 0.6217 | 456.9996 | 69721 | 1.3029 |
508
+ | 0.6284 | 457.9959 | 69873 | 1.3001 |
509
+ | 0.6238 | 458.9988 | 70026 | 1.3011 |
510
+ | 0.6258 | 459.9951 | 70178 | 1.2993 |
511
+ | 0.6217 | 460.9980 | 70331 | 1.2971 |
512
+ | 0.6265 | 461.9943 | 70483 | 1.2996 |
513
+ | 0.622 | 462.9971 | 70636 | 1.2977 |
514
+ | 0.6228 | 464.0 | 70789 | 1.2981 |
515
+ | 0.6274 | 464.9963 | 70941 | 1.3028 |
516
+ | 0.6218 | 465.9992 | 71094 | 1.2995 |
517
+ | 0.6245 | 466.9955 | 71246 | 1.2990 |
518
+ | 0.621 | 467.9984 | 71399 | 1.3032 |
519
+ | 0.6254 | 468.9947 | 71551 | 1.2992 |
520
+ | 0.6217 | 469.9975 | 71704 | 1.2964 |
521
+ | 0.6236 | 470.9939 | 71856 | 1.3012 |
522
+ | 0.6216 | 471.9967 | 72009 | 1.3004 |
523
+ | 0.6191 | 472.9996 | 72162 | 1.3032 |
524
+ | 0.6234 | 473.9959 | 72314 | 1.3043 |
525
+ | 0.6202 | 474.9988 | 72467 | 1.3015 |
526
+ | 0.6248 | 475.9951 | 72619 | 1.3018 |
527
+ | 0.6194 | 476.9980 | 72772 | 1.3030 |
528
+ | 0.6217 | 477.9943 | 72924 | 1.3040 |
529
+ | 0.6193 | 478.9971 | 73077 | 1.3058 |
530
+ | 0.6198 | 480.0 | 73230 | 1.2999 |
531
+ | 0.6219 | 480.9963 | 73382 | 1.3016 |
532
+ | 0.6165 | 481.9992 | 73535 | 1.3048 |
533
+ | 0.6223 | 482.9955 | 73687 | 1.3044 |
534
+ | 0.6165 | 483.9984 | 73840 | 1.3040 |
535
+ | 0.6223 | 484.9947 | 73992 | 1.3059 |
536
+ | 0.6179 | 485.9975 | 74145 | 1.2996 |
537
+ | 0.621 | 486.9939 | 74297 | 1.3052 |
538
+ | 0.6173 | 487.9967 | 74450 | 1.3019 |
539
+ | 0.6179 | 488.9996 | 74603 | 1.3009 |
540
+ | 0.6195 | 489.9959 | 74755 | 1.3023 |
541
+ | 0.6177 | 490.9988 | 74908 | 1.2976 |
542
+ | 0.6214 | 491.9951 | 75060 | 1.3044 |
543
+ | 0.6168 | 492.9980 | 75213 | 1.3022 |
544
+ | 0.6189 | 493.9943 | 75365 | 1.3029 |
545
+ | 0.6182 | 494.9971 | 75518 | 1.3043 |
546
+ | 0.6168 | 496.0 | 75671 | 1.3027 |
547
+ | 0.6222 | 496.9963 | 75823 | 1.3022 |
548
+ | 0.6155 | 497.9992 | 75976 | 1.3005 |
549
+ | 0.6144 | 498.1565 | 76000 | 1.2966 |
550
+
+
+ ### Framework versions
+
+ - Transformers 4.41.2
+ - Pytorch 2.2.1
+ - Datasets 2.19.2
+ - Tokenizers 0.19.1
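Several figures in the model card above follow from one another by simple arithmetic. The sketch below is not part of the commit; it recomputes them in Python under two assumptions: training ran on a single device (so `total_train_batch_size` is just the per-device batch size times the accumulation steps), and exp(loss) is used as a perplexity-style score, as is common for masked language models. All variable names are illustrative.

```python
import math

# Values copied from the model card in this commit.
train_batch_size = 16
gradient_accumulation_steps = 16
final_eval_loss = 1.2966    # step 76000, also quoted at the top of the card
lowest_eval_loss = 1.1344   # step 8085 (epoch 52.9947) in the results table

# Assumption: a single device, so no extra factor for data parallelism.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# The results table logs epoch 16.0 at optimizer step 2441.
optimizer_steps_per_epoch = 2441 / 16.0

# Rough training-set size implied by that rate (an estimate, not a reported figure).
approx_train_sequences = optimizer_steps_per_epoch * effective_batch_size


def exp_loss(loss: float) -> float:
    """Return exp(cross-entropy), a perplexity-style score for an MLM loss."""
    return math.exp(loss)


print(f"effective batch size:      {effective_batch_size}")
print(f"optimizer steps per epoch: {optimizer_steps_per_epoch:.2f}")
print(f"approx. training sequences: {approx_train_sequences:.0f}")
print(f"exp(final eval loss):  {exp_loss(final_eval_loss):.3f}")
print(f"exp(lowest eval loss): {exp_loss(lowest_eval_loss):.3f}")
```

Read this way, the final checkpoint scores worse on the evaluation set than the checkpoint near step 8085, while the training loss keeps falling throughout. That pattern is usually a sign of overfitting, although whether an earlier checkpoint is actually preferable depends on the downstream use.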
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "_name_or_path": "allenai/scibert_scivocab_uncased",
+   "architectures": [
+     "BertForMaskedLM"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 31090
+ }
generation_config.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "_from_model_config": true,
+   "pad_token_id": 0,
+   "transformers_version": "4.41.2"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4c5ce5beb5886fd00cf5def3a1f0bf54fdb7624dc9383c0e3c75181dcf982c1a
+ size 439828064
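The size recorded in this pointer can be sanity-checked against `config.json`. The sketch below is an estimate, not something the commit states: it assumes the usual BERT layout for a masked-LM checkpoint (no pooler, output decoder weight tied to the word embeddings and stored once, only persistent parameters saved) and float32 storage as given by `torch_dtype`. The helper names are hypothetical.

```python
# Values copied from config.json and the model.safetensors pointer in this commit.
config = {
    "hidden_size": 768,
    "intermediate_size": 3072,
    "num_hidden_layers": 12,
    "num_attention_heads": 12,
    "max_position_embeddings": 512,
    "type_vocab_size": 2,
    "vocab_size": 31090,
}
SAFETENSORS_SIZE_BYTES = 439828064
BYTES_PER_FLOAT32 = 4


def linear(n_in: int, n_out: int) -> int:
    """Parameters of a dense layer with bias."""
    return n_in * n_out + n_out


def layer_norm(n: int) -> int:
    """Parameters of a LayerNorm (scale and shift)."""
    return 2 * n


def estimate_bert_mlm_params(cfg: dict) -> int:
    """Estimate parameters under the assumed layout (tied decoder, no pooler)."""
    h, i = cfg["hidden_size"], cfg["intermediate_size"]
    embeddings = (
        cfg["vocab_size"] * h
        + cfg["max_position_embeddings"] * h
        + cfg["type_vocab_size"] * h
        + layer_norm(h)
    )
    per_layer = (
        4 * linear(h, h)      # query, key, value, attention output
        + layer_norm(h)
        + linear(h, i)        # feed-forward up-projection
        + linear(i, h)        # feed-forward down-projection
        + layer_norm(h)
    )
    # MLM head: transform + LayerNorm + output bias (decoder weight is tied).
    mlm_head = linear(h, h) + layer_norm(h) + cfg["vocab_size"]
    return embeddings + cfg["num_hidden_layers"] * per_layer + mlm_head


total_params = estimate_bert_mlm_params(config)
tensor_bytes = total_params * BYTES_PER_FLOAT32
remaining_bytes = SAFETENSORS_SIZE_BYTES - tensor_bytes
head_dim = config["hidden_size"] // config["num_attention_heads"]

print(f"estimated parameters: {total_params:,}")
print(f"estimated tensor bytes: {tensor_bytes:,}")
print(f"bytes not accounted for by tensors: {remaining_bytes:,}")
print(f"per-head dimension: {head_dim}")
```

Under these assumptions the tensors account for nearly all of the file, and the small remainder would be consistent with file metadata such as the safetensors header. That reading is an inference from the numbers, not a fact taken from the commit.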
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 1000000000000000019884624838656,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
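One detail in this file deserves care: `model_max_length` is `int(1e30)`, the very large placeholder that Transformers writes when no tokenizer limit is configured, whereas `config.json` allows only 512 positions. The sketch below shows one way a caller might derive a safe truncation length from the two files. The function name is hypothetical, and only the fields needed here are copied from the commit.

```python
import json

# Relevant fields copied from tokenizer_config.json and config.json in this commit.
tokenizer_config = json.loads(
    '{"model_max_length": 1000000000000000019884624838656, "do_lower_case": true}'
)
model_config = json.loads('{"max_position_embeddings": 512}')

# The placeholder value is what int(1e30) evaluates to in Python.
UNSET_MAX_LENGTH = int(1e30)


def safe_max_length(tok_cfg: dict, mdl_cfg: dict) -> int:
    """Return the smaller of the tokenizer limit and the model's position limit."""
    return min(tok_cfg["model_max_length"], mdl_cfg["max_position_embeddings"])


limit_is_unset = tokenizer_config["model_max_length"] == UNSET_MAX_LENGTH
max_length = safe_max_length(tokenizer_config, model_config)

print(f"tokenizer limit is the unset placeholder: {limit_is_unset}")
print(f"truncation length to request explicitly: {max_length}")
```

In practice this suggests passing an explicit `max_length` together with truncation when tokenizing, since relying on the tokenizer's own limit would not shorten inputs that exceed the model's position embeddings.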
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ddf36eb533cffa8be6a589f758e4cc056956990477d0512d4abda0ba2739e9ad
+ size 5048
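Both binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, a SHA-256 object id, and the size in bytes. A minimal sketch of parsing such a pointer and checking a downloaded file against it follows; the function names are hypothetical and the local path is left for the caller to supply.

```python
import hashlib
from pathlib import Path

# Pointer text copied from the training_args.bin entry in this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:ddf36eb533cffa8be6a589f758e4cc056956990477d0512d4abda0ba2739e9ad
size 5048
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Check that a local file has the size and digest the pointer records."""
    data = Path(path).read_bytes()
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algorithm"], data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algorithm"], pointer["size"])
```

Calling `matches_pointer` with the path of a downloaded `training_args.bin` and `pointer` would then confirm that the file on disk is the object this commit refers to.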
vocab.txt ADDED
The diff for this file is too large to render. See raw diff