artfawl committed on
Commit 0301b27 · verified · 1 Parent(s): 723f1be

Update README.md

Files changed (1)
  1. README.md +1326 -0
README.md CHANGED
@@ -53,6 +53,1332 @@ model-index:
53
  - type: iqm_normalized_95
54
  value: 0.99
55
  name: Normalized Score IQM (95% CI)
56
+ - task:
57
+ type: in-context-reinforcement-learning
58
+ name: In-Context Reinforcement Learning
59
+ dataset:
60
+ name: MuJoCo
61
+ type: ant_v4
62
+ metrics:
63
+ - type: total_reward
64
+ value: 6315.00 +/- 675.00
65
+ name: Total reward
66
+ - type: normalized_total_reward
67
+ value: 0.98 +/- 0.10
68
+ name: Expert normalized total reward
69
+ - task:
70
+ type: in-context-reinforcement-learning
71
+ name: In-Context Reinforcement Learning
72
+ dataset:
73
+ name: MuJoCo
74
+ type: halfcheetah_v4
75
+ metrics:
76
+ - type: total_reward
77
+ value: 7226.50 +/- 241.50
78
+ name: Total reward
79
+ - type: normalized_total_reward
80
+ value: 0.93 +/- 0.03
81
+ name: Expert normalized total reward
82
+ - task:
83
+ type: in-context-reinforcement-learning
84
+ name: In-Context Reinforcement Learning
85
+ dataset:
86
+ name: MuJoCo
87
+ type: hopper_v4
88
+ metrics:
89
+ - type: total_reward
90
+ value: 2794.60 +/- 612.62
91
+ name: Total reward
92
+ - type: normalized_total_reward
93
+ value: 0.86 +/- 0.19
94
+ name: Expert normalized total reward
95
+ - task:
96
+ type: in-context-reinforcement-learning
97
+ name: In-Context Reinforcement Learning
98
+ dataset:
99
+ name: MuJoCo
100
+ type: humanoid_v4
101
+ metrics:
102
+ - type: total_reward
103
+ value: 7376.26 +/- 0.00
104
+ name: Total reward
105
+ - type: normalized_total_reward
106
+ value: 0.97 +/- 0.00
107
+ name: Expert normalized total reward
108
+ - task:
109
+ type: in-context-reinforcement-learning
110
+ name: In-Context Reinforcement Learning
111
+ dataset:
112
+ name: MuJoCo
113
+ type: humanoidstandup_v4
114
+ metrics:
115
+ - type: total_reward
116
+ value: 320567.82 +/- 58462.11
117
+ name: Total reward
118
+ - type: normalized_total_reward
119
+ value: 1.02 +/- 0.21
120
+ name: Expert normalized total reward
121
+ - task:
122
+ type: in-context-reinforcement-learning
123
+ name: In-Context Reinforcement Learning
124
+ dataset:
125
+ name: MuJoCo
126
+ type: inverteddoublependulum_v4
127
+ metrics:
128
+ - type: total_reward
129
+ value: 6105.75 +/- 4368.65
130
+ name: Total reward
131
+ - type: normalized_total_reward
132
+ value: 0.65 +/- 0.47
133
+ name: Expert normalized total reward
134
+ - task:
135
+ type: in-context-reinforcement-learning
136
+ name: In-Context Reinforcement Learning
137
+ dataset:
138
+ name: MuJoCo
139
+ type: invertedpendulum_v4
140
+ metrics:
141
+ - type: total_reward
142
+ value: 1000.00 +/- 0.00
143
+ name: Total reward
144
+ - type: normalized_total_reward
145
+ value: 1.00 +/- 0.00
146
+ name: Expert normalized total reward
147
+ - task:
148
+ type: in-context-reinforcement-learning
149
+ name: In-Context Reinforcement Learning
150
+ dataset:
151
+ name: MuJoCo
152
+ type: pusher_v4
153
+ metrics:
154
+ - type: total_reward
155
+ value: -37.82 +/- 8.72
156
+ name: Total reward
157
+ - type: normalized_total_reward
158
+ value: 1.02 +/- 0.08
159
+ name: Expert normalized total reward
160
+ - task:
161
+ type: in-context-reinforcement-learning
162
+ name: In-Context Reinforcement Learning
163
+ dataset:
164
+ name: MuJoCo
165
+ type: reacher_v4
166
+ metrics:
167
+ - type: total_reward
168
+ value: -6.25 +/- 2.63
169
+ name: Total reward
170
+ - type: normalized_total_reward
171
+ value: 0.98 +/- 0.07
172
+ name: Expert normalized total reward
173
+ - task:
174
+ type: in-context-reinforcement-learning
175
+ name: In-Context Reinforcement Learning
176
+ dataset:
177
+ name: MuJoCo
178
+ type: swimmer_v4
179
+ metrics:
180
+ - type: total_reward
181
+ value: 93.20 +/- 5.40
182
+ name: Total reward
183
+ - type: normalized_total_reward
184
+ value: 0.98 +/- 0.06
185
+ name: Expert normalized total reward
186
+ - task:
187
+ type: in-context-reinforcement-learning
188
+ name: In-Context Reinforcement Learning
189
+ dataset:
190
+ name: MuJoCo
191
+ type: walker2d_v4
192
+ metrics:
193
+ - type: total_reward
194
+ value: 5400.00 +/- 107.95
195
+ name: Total reward
196
+ - type: normalized_total_reward
197
+ value: 1.00 +/- 0.02
198
+ name: Expert normalized total reward
199
+ - task:
200
+ type: in-context-reinforcement-learning
201
+ name: In-Context Reinforcement Learning
202
+ dataset:
203
+ name: Meta-World
204
+ type: assembly-v2
205
+ metrics:
206
+ - type: total_reward
207
+ value: 307.08 +/- 25.20
208
+ name: Total reward
209
+ - type: normalized_total_reward
210
+ value: 1.04 +/- 0.10
211
+ name: Expert normalized total reward
212
+ - task:
213
+ type: in-context-reinforcement-learning
214
+ name: In-Context Reinforcement Learning
215
+ dataset:
216
+ name: Meta-World
217
+ type: basketball-v2
218
+ metrics:
219
+ - type: total_reward
220
+ value: 568.04 +/- 60.72
221
+ name: Total reward
222
+ - type: normalized_total_reward
223
+ value: 1.02 +/- 0.11
224
+ name: Expert normalized total reward
225
+ - task:
226
+ type: in-context-reinforcement-learning
227
+ name: In-Context Reinforcement Learning
228
+ dataset:
229
+ name: Meta-World
230
+ type: bin-picking-v2
231
+ metrics:
232
+ - type: total_reward
233
+ value: 7.88 +/- 4.28
234
+ name: Total reward
235
+ - type: normalized_total_reward
236
+ value: 0.01 +/- 0.01
237
+ name: Expert normalized total reward
238
+ - task:
239
+ type: in-context-reinforcement-learning
240
+ name: In-Context Reinforcement Learning
241
+ dataset:
242
+ name: Meta-World
243
+ type: box-close-v2
244
+ metrics:
245
+ - type: total_reward
246
+ value: 61.75 +/- 13.54
247
+ name: Total reward
248
+ - type: normalized_total_reward
249
+ value: -0.04 +/- 0.03
250
+ name: Expert normalized total reward
251
+ - task:
252
+ type: in-context-reinforcement-learning
253
+ name: In-Context Reinforcement Learning
254
+ dataset:
255
+ name: Meta-World
256
+ type: button-press-v2
257
+ metrics:
258
+ - type: total_reward
259
+ value: 624.67 +/- 42.77
260
+ name: Total reward
261
+ - type: normalized_total_reward
262
+ value: 0.97 +/- 0.07
263
+ name: Expert normalized total reward
264
+ - task:
265
+ type: in-context-reinforcement-learning
266
+ name: In-Context Reinforcement Learning
267
+ dataset:
268
+ name: Meta-World
269
+ type: button-press-topdown-v2
270
+ metrics:
271
+ - type: total_reward
272
+ value: 449.36 +/- 62.16
273
+ name: Total reward
274
+ - type: normalized_total_reward
275
+ value: 0.94 +/- 0.14
276
+ name: Expert normalized total reward
277
+ - task:
278
+ type: in-context-reinforcement-learning
279
+ name: In-Context Reinforcement Learning
280
+ dataset:
281
+ name: Meta-World
282
+ type: button-press-topdown-wall-v2
283
+ metrics:
284
+ - type: total_reward
285
+ value: 482.08 +/- 32.48
286
+ name: Total reward
287
+ - type: normalized_total_reward
288
+ value: 0.97 +/- 0.07
289
+ name: Expert normalized total reward
290
+ - task:
291
+ type: in-context-reinforcement-learning
292
+ name: In-Context Reinforcement Learning
293
+ dataset:
294
+ name: Meta-World
295
+ type: button-press-wall-v2
296
+ metrics:
297
+ - type: total_reward
298
+ value: 672.00 +/- 26.48
299
+ name: Total reward
300
+ - type: normalized_total_reward
301
+ value: 1.00 +/- 0.04
302
+ name: Expert normalized total reward
303
+ - task:
304
+ type: in-context-reinforcement-learning
305
+ name: In-Context Reinforcement Learning
306
+ dataset:
307
+ name: Meta-World
308
+ type: coffee-button-v2
309
+ metrics:
310
+ - type: total_reward
311
+ value: 719.00 +/- 41.10
312
+ name: Total reward
313
+ - type: normalized_total_reward
314
+ value: 1.00 +/- 0.06
315
+ name: Expert normalized total reward
316
+ - task:
317
+ type: in-context-reinforcement-learning
318
+ name: In-Context Reinforcement Learning
319
+ dataset:
320
+ name: Meta-World
321
+ type: coffee-pull-v2
322
+ metrics:
323
+ - type: total_reward
324
+ value: 26.04 +/- 56.12
325
+ name: Total reward
326
+ - type: normalized_total_reward
327
+ value: 0.07 +/- 0.20
328
+ name: Expert normalized total reward
329
+ - task:
330
+ type: in-context-reinforcement-learning
331
+ name: In-Context Reinforcement Learning
332
+ dataset:
333
+ name: Meta-World
334
+ type: coffee-push-v2
335
+ metrics:
336
+ - type: total_reward
337
+ value: 571.01 +/- 112.28
338
+ name: Total reward
339
+ - type: normalized_total_reward
340
+ value: 1.01 +/- 0.20
341
+ name: Expert normalized total reward
342
+ - task:
343
+ type: in-context-reinforcement-learning
344
+ name: In-Context Reinforcement Learning
345
+ dataset:
346
+ name: Meta-World
347
+ type: dial-turn-v2
348
+ metrics:
349
+ - type: total_reward
350
+ value: 783.90 +/- 53.17
351
+ name: Total reward
352
+ - type: normalized_total_reward
353
+ value: 0.99 +/- 0.07
354
+ name: Expert normalized total reward
355
+ - task:
356
+ type: in-context-reinforcement-learning
357
+ name: In-Context Reinforcement Learning
358
+ dataset:
359
+ name: Meta-World
360
+ type: disassemble-v2
361
+ metrics:
362
+ - type: total_reward
363
+ value: 523.60 +/- 58.15
364
+ name: Total reward
365
+ - type: normalized_total_reward
366
+ value: 1.00 +/- 0.12
367
+ name: Expert normalized total reward
368
+ - task:
369
+ type: in-context-reinforcement-learning
370
+ name: In-Context Reinforcement Learning
371
+ dataset:
372
+ name: Meta-World
373
+ type: door-close-v2
374
+ metrics:
375
+ - type: total_reward
376
+ value: 538.10 +/- 25.76
377
+ name: Total reward
378
+ - type: normalized_total_reward
379
+ value: 1.02 +/- 0.05
380
+ name: Expert normalized total reward
381
+ - task:
382
+ type: in-context-reinforcement-learning
383
+ name: In-Context Reinforcement Learning
384
+ dataset:
385
+ name: Meta-World
386
+ type: door-lock-v2
387
+ metrics:
388
+ - type: total_reward
389
+ value: 356.51 +/- 249.44
390
+ name: Total reward
391
+ - type: normalized_total_reward
392
+ value: 0.35 +/- 0.36
393
+ name: Expert normalized total reward
394
+ - task:
395
+ type: in-context-reinforcement-learning
396
+ name: In-Context Reinforcement Learning
397
+ dataset:
398
+ name: Meta-World
399
+ type: door-open-v2
400
+ metrics:
401
+ - type: total_reward
402
+ value: 581.33 +/- 26.33
403
+ name: Total reward
404
+ - type: normalized_total_reward
405
+ value: 0.99 +/- 0.05
406
+ name: Expert normalized total reward
407
+ - task:
408
+ type: in-context-reinforcement-learning
409
+ name: In-Context Reinforcement Learning
410
+ dataset:
411
+ name: Meta-World
412
+ type: door-unlock-v2
413
+ metrics:
414
+ - type: total_reward
415
+ value: 352.86 +/- 147.78
416
+ name: Total reward
417
+ - type: normalized_total_reward
418
+ value: 0.21 +/- 0.26
419
+ name: Expert normalized total reward
420
+ - task:
421
+ type: in-context-reinforcement-learning
422
+ name: In-Context Reinforcement Learning
423
+ dataset:
424
+ name: Meta-World
425
+ type: drawer-close-v2
426
+ metrics:
427
+ - type: total_reward
428
+ value: 838.88 +/- 7.41
429
+ name: Total reward
430
+ - type: normalized_total_reward
431
+ value: 0.96 +/- 0.01
432
+ name: Expert normalized total reward
433
+ - task:
434
+ type: in-context-reinforcement-learning
435
+ name: In-Context Reinforcement Learning
436
+ dataset:
437
+ name: Meta-World
438
+ type: drawer-open-v2
439
+ metrics:
440
+ - type: total_reward
441
+ value: 493.00 +/- 3.57
442
+ name: Total reward
443
+ - type: normalized_total_reward
444
+ value: 1.00 +/- 0.01
445
+ name: Expert normalized total reward
446
+ - task:
447
+ type: in-context-reinforcement-learning
448
+ name: In-Context Reinforcement Learning
449
+ dataset:
450
+ name: Meta-World
451
+ type: faucet-close-v2
452
+ metrics:
453
+ - type: total_reward
454
+ value: 749.46 +/- 14.83
455
+ name: Total reward
456
+ - type: normalized_total_reward
457
+ value: 0.99 +/- 0.03
458
+ name: Expert normalized total reward
459
+ - task:
460
+ type: in-context-reinforcement-learning
461
+ name: In-Context Reinforcement Learning
462
+ dataset:
463
+ name: Meta-World
464
+ type: faucet-open-v2
465
+ metrics:
466
+ - type: total_reward
467
+ value: 732.47 +/- 15.23
468
+ name: Total reward
469
+ - type: normalized_total_reward
470
+ value: 0.97 +/- 0.03
471
+ name: Expert normalized total reward
472
+ - task:
473
+ type: in-context-reinforcement-learning
474
+ name: In-Context Reinforcement Learning
475
+ dataset:
476
+ name: Meta-World
477
+ type: hammer-v2
478
+ metrics:
479
+ - type: total_reward
480
+ value: 669.31 +/- 69.56
481
+ name: Total reward
482
+ - type: normalized_total_reward
483
+ value: 0.97 +/- 0.12
484
+ name: Expert normalized total reward
485
+ - task:
486
+ type: in-context-reinforcement-learning
487
+ name: In-Context Reinforcement Learning
488
+ dataset:
489
+ name: Meta-World
490
+ type: hand-insert-v2
491
+ metrics:
492
+ - type: total_reward
493
+ value: 142.81 +/- 146.64
494
+ name: Total reward
495
+ - type: normalized_total_reward
496
+ value: 0.19 +/- 0.20
497
+ name: Expert normalized total reward
498
+ - task:
499
+ type: in-context-reinforcement-learning
500
+ name: In-Context Reinforcement Learning
501
+ dataset:
502
+ name: Meta-World
503
+ type: handle-press-v2
504
+ metrics:
505
+ - type: total_reward
506
+ value: 835.30 +/- 114.19
507
+ name: Total reward
508
+ - type: normalized_total_reward
509
+ value: 1.00 +/- 0.15
510
+ name: Expert normalized total reward
511
+ - task:
512
+ type: in-context-reinforcement-learning
513
+ name: In-Context Reinforcement Learning
514
+ dataset:
515
+ name: Meta-World
516
+ type: handle-press-side-v2
517
+ metrics:
518
+ - type: total_reward
519
+ value: 852.96 +/- 16.08
520
+ name: Total reward
521
+ - type: normalized_total_reward
522
+ value: 0.99 +/- 0.02
523
+ name: Expert normalized total reward
524
+ - task:
525
+ type: in-context-reinforcement-learning
526
+ name: In-Context Reinforcement Learning
527
+ dataset:
528
+ name: Meta-World
529
+ type: handle-pull-v2
530
+ metrics:
531
+ - type: total_reward
532
+ value: 701.10 +/- 13.82
533
+ name: Total reward
534
+ - type: normalized_total_reward
535
+ value: 1.00 +/- 0.02
536
+ name: Expert normalized total reward
537
+ - task:
538
+ type: in-context-reinforcement-learning
539
+ name: In-Context Reinforcement Learning
540
+ dataset:
541
+ name: Meta-World
542
+ type: handle-pull-side-v2
543
+ metrics:
544
+ - type: total_reward
545
+ value: 493.10 +/- 53.65
546
+ name: Total reward
547
+ - type: normalized_total_reward
548
+ value: 1.00 +/- 0.11
549
+ name: Expert normalized total reward
550
+ - task:
551
+ type: in-context-reinforcement-learning
552
+ name: In-Context Reinforcement Learning
553
+ dataset:
554
+ name: Meta-World
555
+ type: lever-pull-v2
556
+ metrics:
557
+ - type: total_reward
558
+ value: 548.72 +/- 81.12
559
+ name: Total reward
560
+ - type: normalized_total_reward
561
+ value: 0.96 +/- 0.16
562
+ name: Expert normalized total reward
563
+ - task:
564
+ type: in-context-reinforcement-learning
565
+ name: In-Context Reinforcement Learning
566
+ dataset:
567
+ name: Meta-World
568
+ type: peg-insert-side-v2
569
+ metrics:
570
+ - type: total_reward
571
+ value: 352.43 +/- 137.24
572
+ name: Total reward
573
+ - type: normalized_total_reward
574
+ value: 1.01 +/- 0.40
575
+ name: Expert normalized total reward
576
+ - task:
577
+ type: in-context-reinforcement-learning
578
+ name: In-Context Reinforcement Learning
579
+ dataset:
580
+ name: Meta-World
581
+ type: peg-unplug-side-v2
582
+ metrics:
583
+ - type: total_reward
584
+ value: 401.52 +/- 175.27
585
+ name: Total reward
586
+ - type: normalized_total_reward
587
+ value: 0.75 +/- 0.34
588
+ name: Expert normalized total reward
589
+ - task:
590
+ type: in-context-reinforcement-learning
591
+ name: In-Context Reinforcement Learning
592
+ dataset:
593
+ name: Meta-World
594
+ type: pick-out-of-hole-v2
595
+ metrics:
596
+ - type: total_reward
597
+ value: 364.20 +/- 79.56
598
+ name: Total reward
599
+ - type: normalized_total_reward
600
+ value: 0.91 +/- 0.20
601
+ name: Expert normalized total reward
602
+ - task:
603
+ type: in-context-reinforcement-learning
604
+ name: In-Context Reinforcement Learning
605
+ dataset:
606
+ name: Meta-World
607
+ type: pick-place-v2
608
+ metrics:
609
+ - type: total_reward
610
+ value: 414.02 +/- 91.10
611
+ name: Total reward
612
+ - type: normalized_total_reward
613
+ value: 0.98 +/- 0.22
614
+ name: Expert normalized total reward
615
+ - task:
616
+ type: in-context-reinforcement-learning
617
+ name: In-Context Reinforcement Learning
618
+ dataset:
619
+ name: Meta-World
620
+ type: pick-place-wall-v2
621
+ metrics:
622
+ - type: total_reward
623
+ value: 553.18 +/- 84.72
624
+ name: Total reward
625
+ - type: normalized_total_reward
626
+ value: 1.04 +/- 0.16
627
+ name: Expert normalized total reward
628
+ - task:
629
+ type: in-context-reinforcement-learning
630
+ name: In-Context Reinforcement Learning
631
+ dataset:
632
+ name: Meta-World
633
+ type: plate-slide-v2
634
+ metrics:
635
+ - type: total_reward
636
+ value: 531.98 +/- 156.94
637
+ name: Total reward
638
+ - type: normalized_total_reward
639
+ value: 0.99 +/- 0.34
640
+ name: Expert normalized total reward
641
+ - task:
642
+ type: in-context-reinforcement-learning
643
+ name: In-Context Reinforcement Learning
644
+ dataset:
645
+ name: Meta-World
646
+ type: plate-slide-back-v2
647
+ metrics:
648
+ - type: total_reward
649
+ value: 703.93 +/- 108.27
650
+ name: Total reward
651
+ - type: normalized_total_reward
652
+ value: 0.99 +/- 0.16
653
+ name: Expert normalized total reward
654
+ - task:
655
+ type: in-context-reinforcement-learning
656
+ name: In-Context Reinforcement Learning
657
+ dataset:
658
+ name: Meta-World
659
+ type: plate-slide-back-side-v2
660
+ metrics:
661
+ - type: total_reward
662
+ value: 721.29 +/- 62.15
663
+ name: Total reward
664
+ - type: normalized_total_reward
665
+ value: 0.99 +/- 0.09
666
+ name: Expert normalized total reward
667
+ - task:
668
+ type: in-context-reinforcement-learning
669
+ name: In-Context Reinforcement Learning
670
+ dataset:
671
+ name: Meta-World
672
+ type: plate-slide-side-v2
673
+ metrics:
674
+ - type: total_reward
675
+ value: 578.24 +/- 143.73
676
+ name: Total reward
677
+ - type: normalized_total_reward
678
+ value: 0.83 +/- 0.22
679
+ name: Expert normalized total reward
680
+ - task:
681
+ type: in-context-reinforcement-learning
682
+ name: In-Context Reinforcement Learning
683
+ dataset:
684
+ name: Meta-World
685
+ type: push-v2
686
+ metrics:
687
+ - type: total_reward
688
+ value: 729.33 +/- 104.40
689
+ name: Total reward
690
+ - type: normalized_total_reward
691
+ value: 0.97 +/- 0.14
692
+ name: Expert normalized total reward
693
+ - task:
694
+ type: in-context-reinforcement-learning
695
+ name: In-Context Reinforcement Learning
696
+ dataset:
697
+ name: Meta-World
698
+ type: push-back-v2
699
+ metrics:
700
+ - type: total_reward
701
+ value: 372.16 +/- 112.75
702
+ name: Total reward
703
+ - type: normalized_total_reward
704
+ value: 0.95 +/- 0.29
705
+ name: Expert normalized total reward
706
+ - task:
707
+ type: in-context-reinforcement-learning
708
+ name: In-Context Reinforcement Learning
709
+ dataset:
710
+ name: Meta-World
711
+ type: push-wall-v2
712
+ metrics:
713
+ - type: total_reward
714
+ value: 741.68 +/- 14.84
715
+ name: Total reward
716
+ - type: normalized_total_reward
717
+ value: 0.99 +/- 0.02
718
+ name: Expert normalized total reward
719
+ - task:
720
+ type: in-context-reinforcement-learning
721
+ name: In-Context Reinforcement Learning
722
+ dataset:
723
+ name: Meta-World
724
+ type: reach-v2
725
+ metrics:
726
+ - type: total_reward
727
+ value: 684.45 +/- 136.55
728
+ name: Total reward
729
+ - type: normalized_total_reward
730
+ value: 1.01 +/- 0.26
731
+ name: Expert normalized total reward
732
+ - task:
733
+ type: in-context-reinforcement-learning
734
+ name: In-Context Reinforcement Learning
735
+ dataset:
736
+ name: Meta-World
737
+ type: reach-wall-v2
738
+ metrics:
739
+ - type: total_reward
740
+ value: 738.02 +/- 100.96
741
+ name: Total reward
742
+ - type: normalized_total_reward
743
+ value: 0.98 +/- 0.17
744
+ name: Expert normalized total reward
745
+ - task:
746
+ type: in-context-reinforcement-learning
747
+ name: In-Context Reinforcement Learning
748
+ dataset:
749
+ name: Meta-World
750
+ type: shelf-place-v2
751
+ metrics:
752
+ - type: total_reward
753
+ value: 268.34 +/- 29.07
754
+ name: Total reward
755
+ - type: normalized_total_reward
756
+ value: 1.01 +/- 0.11
757
+ name: Expert normalized total reward
758
+ - task:
759
+ type: in-context-reinforcement-learning
760
+ name: In-Context Reinforcement Learning
761
+ dataset:
762
+ name: Meta-World
763
+ type: soccer-v2
764
+ metrics:
765
+ - type: total_reward
766
+ value: 438.44 +/- 189.63
767
+ name: Total reward
768
+ - type: normalized_total_reward
769
+ value: 0.80 +/- 0.35
770
+ name: Expert normalized total reward
771
+ - task:
772
+ type: in-context-reinforcement-learning
773
+ name: In-Context Reinforcement Learning
774
+ dataset:
775
+ name: Meta-World
776
+ type: stick-pull-v2
777
+ metrics:
778
+ - type: total_reward
779
+ value: 483.98 +/- 83.25
780
+ name: Total reward
781
+ - type: normalized_total_reward
782
+ value: 0.92 +/- 0.16
783
+ name: Expert normalized total reward
784
+ - task:
785
+ type: in-context-reinforcement-learning
786
+ name: In-Context Reinforcement Learning
787
+ dataset:
788
+ name: Meta-World
789
+ type: stick-push-v2
790
+ metrics:
791
+ - type: total_reward
792
+ value: 563.07 +/- 173.40
793
+ name: Total reward
794
+ - type: normalized_total_reward
795
+ value: 0.90 +/- 0.28
796
+ name: Expert normalized total reward
797
+ - task:
798
+ type: in-context-reinforcement-learning
799
+ name: In-Context Reinforcement Learning
800
+ dataset:
801
+ name: Meta-World
802
+ type: sweep-v2
803
+ metrics:
804
+ - type: total_reward
805
+ value: 487.19 +/- 60.02
806
+ name: Total reward
807
+ - type: normalized_total_reward
808
+ value: 0.94 +/- 0.12
809
+ name: Expert normalized total reward
810
+ - task:
811
+ type: in-context-reinforcement-learning
812
+ name: In-Context Reinforcement Learning
813
+ dataset:
814
+ name: Meta-World
815
+ type: sweep-into-v2
816
+ metrics:
817
+ - type: total_reward
818
+ value: 798.80 +/- 15.62
819
+ name: Total reward
820
+ - type: normalized_total_reward
821
+ value: 1.00 +/- 0.02
822
+ name: Expert normalized total reward
823
+ - task:
824
+ type: in-context-reinforcement-learning
825
+ name: In-Context Reinforcement Learning
826
+ dataset:
827
+ name: Meta-World
828
+ type: window-close-v2
829
+ metrics:
830
+ - type: total_reward
831
+ value: 562.48 +/- 91.17
832
+ name: Total reward
833
+ - type: normalized_total_reward
834
+ value: 0.95 +/- 0.17
835
+ name: Expert normalized total reward
836
+ - task:
837
+ type: in-context-reinforcement-learning
838
+ name: In-Context Reinforcement Learning
839
+ dataset:
840
+ name: Meta-World
841
+ type: window-open-v2
842
+ metrics:
843
+ - type: total_reward
844
+ value: 573.69 +/- 93.98
845
+ name: Total reward
846
+ - type: normalized_total_reward
847
+ value: 0.96 +/- 0.17
848
+ name: Expert normalized total reward
849
+ - task:
850
+ type: in-context-reinforcement-learning
851
+ name: In-Context Reinforcement Learning
852
+ dataset:
853
+ name: Bi-DexHands
854
+ type: shadowhandblockstack
855
+ metrics:
856
+ - type: total_reward
857
+ value: 347.40 +/- 50.60
858
+ name: Total reward
859
+ - type: normalized_total_reward
860
+ value: 1.17 +/- 0.23
861
+ name: Expert normalized total reward
862
+ - task:
863
+ type: in-context-reinforcement-learning
864
+ name: In-Context Reinforcement Learning
865
+ dataset:
866
+ name: Bi-DexHands
867
+ type: shadowhandbottlecap
868
+ metrics:
869
+ - type: total_reward
870
+ value: 338.25 +/- 81.25
871
+ name: Total reward
872
+ - type: normalized_total_reward
873
+ value: 0.81 +/- 0.25
874
+ name: Expert normalized total reward
875
+ - task:
876
+ type: in-context-reinforcement-learning
877
+ name: In-Context Reinforcement Learning
878
+ dataset:
879
+ name: Bi-DexHands
880
+ type: shadowhandcatchabreast
881
+ metrics:
882
+ - type: total_reward
883
+ value: 11.81 +/- 21.28
884
+ name: Total reward
885
+ - type: normalized_total_reward
886
+ value: 0.17 +/- 0.32
887
+ name: Expert normalized total reward
888
+ - task:
889
+ type: in-context-reinforcement-learning
890
+ name: In-Context Reinforcement Learning
891
+ dataset:
892
+ name: Bi-DexHands
893
+ type: shadowhandcatchover2underarm
894
+ metrics:
895
+ - type: total_reward
896
+ value: 31.60 +/- 7.20
897
+ name: Total reward
898
+ - type: normalized_total_reward
899
+ value: 0.92 +/- 0.24
900
+ name: Expert normalized total reward
901
+ - task:
902
+ type: in-context-reinforcement-learning
903
+ name: In-Context Reinforcement Learning
904
+ dataset:
905
+ name: Bi-DexHands
906
+ type: shadowhandcatchunderarm
907
+ metrics:
908
+ - type: total_reward
909
+ value: 18.21 +/- 9.46
910
+ name: Total reward
911
+ - type: normalized_total_reward
912
+ value: 0.72 +/- 0.39
913
+ name: Expert normalized total reward
914
+ - task:
915
+ type: in-context-reinforcement-learning
916
+ name: In-Context Reinforcement Learning
917
+ dataset:
918
+ name: Bi-DexHands
919
+ type: shadowhanddoorcloseinward
920
+ metrics:
921
+ - type: total_reward
922
+ value: 3.97 +/- 0.15
923
+ name: Total reward
924
+ - type: normalized_total_reward
925
+ value: 0.36 +/- 0.02
926
+ name: Expert normalized total reward
927
+ - task:
928
+ type: in-context-reinforcement-learning
929
+ name: In-Context Reinforcement Learning
930
+ dataset:
931
+ name: Bi-DexHands
932
+ type: shadowhanddoorcloseoutward
933
+ metrics:
934
+ - type: total_reward
935
+ value: 358.50 +/- 4.50
936
+ name: Total reward
937
+ - type: normalized_total_reward
938
+ value: -1.27 +/- 0.01
939
+ name: Expert normalized total reward
940
+ - task:
941
+ type: in-context-reinforcement-learning
942
+ name: In-Context Reinforcement Learning
943
+ dataset:
944
+ name: Bi-DexHands
945
+ type: shadowhanddooropeninward
946
+ metrics:
947
+ - type: total_reward
948
+ value: 108.25 +/- 8.50
949
+ name: Total reward
950
+ - type: normalized_total_reward
951
+ value: 0.29 +/- 0.02
952
+ name: Expert normalized total reward
953
+ - task:
954
+ type: in-context-reinforcement-learning
955
+ name: In-Context Reinforcement Learning
956
+ dataset:
957
+ name: Bi-DexHands
958
+ type: shadowhanddooropenoutward
959
+ metrics:
960
+ - type: total_reward
961
+ value: 83.65 +/- 12.10
962
+ name: Total reward
963
+ - type: normalized_total_reward
964
+ value: 0.13 +/- 0.02
965
+ name: Expert normalized total reward
966
+ - task:
967
+ type: in-context-reinforcement-learning
968
+ name: In-Context Reinforcement Learning
969
+ dataset:
970
+ name: Bi-DexHands
971
+ type: shadowhandgraspandplace
972
+ metrics:
973
+ - type: total_reward
974
+ value: 485.15 +/- 89.10
975
+ name: Total reward
976
+ - type: normalized_total_reward
977
+ value: 0.97 +/- 0.18
978
+ name: Expert normalized total reward
979
+ - task:
980
+ type: in-context-reinforcement-learning
981
+ name: In-Context Reinforcement Learning
982
+ dataset:
983
+ name: Bi-DexHands
984
+ type: shadowhandkettle
+ metrics:
+ - type: total_reward
+ value: -450.47 +/- 0.00
+ name: Total reward
+ - type: normalized_total_reward
+ value: -0.99 +/- 0.00
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Bi-DexHands
+ type: shadowhandliftunderarm
+ metrics:
+ - type: total_reward
+ value: 377.92 +/- 13.24
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.95 +/- 0.03
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Bi-DexHands
+ type: shadowhandover
+ metrics:
+ - type: total_reward
+ value: 33.01 +/- 0.96
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.95 +/- 0.03
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Bi-DexHands
+ type: shadowhandpen
+ metrics:
+ - type: total_reward
+ value: 98.80 +/- 83.60
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.52 +/- 0.44
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Bi-DexHands
+ type: shadowhandpushblock
+ metrics:
+ - type: total_reward
+ value: 445.60 +/- 2.20
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.98 +/- 0.01
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Bi-DexHands
+ type: shadowhandreorientation
+ metrics:
+ - type: total_reward
+ value: 2798.00 +/- 2112.00
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.89 +/- 0.66
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Bi-DexHands
+ type: shadowhandscissors
+ metrics:
+ - type: total_reward
+ value: 747.95 +/- 7.65
+ name: Total reward
+ - type: normalized_total_reward
+ value: 1.03 +/- 0.01
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Bi-DexHands
+ type: shadowhandswingcup
+ metrics:
+ - type: total_reward
+ value: 3775.50 +/- 583.70
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.95 +/- 0.13
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Bi-DexHands
+ type: shadowhandswitch
+ metrics:
+ - type: total_reward
+ value: 268.25 +/- 2.35
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.95 +/- 0.01
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Bi-DexHands
+ type: shadowhandtwocatchunderarm
+ metrics:
+ - type: total_reward
+ value: 2.17 +/- 0.67
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.03 +/- 0.03
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-0-v1
+ metrics:
+ - type: total_reward
+ value: -191.39 +/- 22.96
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.94 +/- 0.13
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-5-v1
+ metrics:
+ - type: total_reward
+ value: -194.01 +/- 3.66
+ name: Total reward
+ - type: normalized_total_reward
+ value: 1.00 +/- 0.02
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-10-v1
+ metrics:
+ - type: total_reward
+ value: -213.28 +/- 2.01
+ name: Total reward
+ - type: normalized_total_reward
+ value: 1.01 +/- 0.01
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-15-v1
+ metrics:
+ - type: total_reward
+ value: -227.82 +/- 4.29
+ name: Total reward
+ - type: normalized_total_reward
+ value: 1.01 +/- 0.02
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-20-v1
+ metrics:
+ - type: total_reward
+ value: -259.99 +/- 22.70
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.95 +/- 0.11
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-25-v1
+ metrics:
+ - type: total_reward
+ value: -282.28 +/- 20.70
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.95 +/- 0.11
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-30-v1
+ metrics:
+ - type: total_reward
+ value: -307.02 +/- 19.23
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.90 +/- 0.10
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-35-v1
+ metrics:
+ - type: total_reward
+ value: -314.36 +/- 5.62
+ name: Total reward
+ - type: normalized_total_reward
+ value: 1.00 +/- 0.03
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-40-v1
+ metrics:
+ - type: total_reward
+ value: -339.34 +/- 9.57
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.99 +/- 0.05
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-45-v1
+ metrics:
+ - type: total_reward
+ value: -366.63 +/- 7.47
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.97 +/- 0.04
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-50-v1
+ metrics:
+ - type: total_reward
+ value: -395.94 +/- 17.65
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.91 +/- 0.09
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-55-v1
+ metrics:
+ - type: total_reward
+ value: -403.73 +/- 2.03
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.99 +/- 0.01
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-60-v1
+ metrics:
+ - type: total_reward
+ value: -434.25 +/- 4.12
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.98 +/- 0.02
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-65-v1
+ metrics:
+ - type: total_reward
+ value: -480.31 +/- 8.63
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.86 +/- 0.04
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-70-v1
+ metrics:
+ - type: total_reward
+ value: -480.76 +/- 5.98
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.95 +/- 0.03
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-75-v1
+ metrics:
+ - type: total_reward
+ value: -476.83 +/- 2.44
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.99 +/- 0.01
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-80-v1
+ metrics:
+ - type: total_reward
+ value: -497.13 +/- 2.95
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.96 +/- 0.01
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-85-v1
+ metrics:
+ - type: total_reward
+ value: -513.83 +/- 3.06
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.98 +/- 0.01
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-90-v1
+ metrics:
+ - type: total_reward
+ value: -532.70 +/- 3.61
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.97 +/- 0.01
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-95-v1
+ metrics:
+ - type: total_reward
+ value: -557.42 +/- 3.81
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.97 +/- 0.01
+ name: Expert normalized total reward
+ - task:
+ type: in-context-reinforcement-learning
+ name: In-Context Reinforcement Learning
+ dataset:
+ name: Industrial-Benchmark
+ type: industrial-benchmark-100-v1
+ metrics:
+ - type: total_reward
+ value: -574.57 +/- 4.37
+ name: Total reward
+ - type: normalized_total_reward
+ value: 0.97 +/- 0.01
+ name: Expert normalized total reward
  ---
  # Model Card for Vintix