---
language:
- en
tags:
- question-answering
- qa
license: "apache-2.0"
datasets:
- squad
metrics:
- squad
---

# Description

This model was trained on the SQuAD v1.1 dataset from the MRQA Shared Task. The public dev set was split into two parts: one used as the dev set and the other as the test set.
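
The card does not say how the public dev set was divided. As an illustration only, the following minimal sketch shows one way such a split could be made, assuming a seeded shuffle followed by a cut at the midpoint. The function name `split_dev_set`, the seed, and the toy question ids are hypothetical and are not taken from the original setup.

```python
import random


def split_dev_set(examples, seed=0):
    """Shuffle a copy of the examples and cut it into two near-equal parts.

    Assumption: a seeded random split. The actual procedure used for this
    model is not documented in the card.
    """
    shuffled = list(examples)              # copy, so the input is not modified
    random.Random(seed).shuffle(shuffled)  # deterministic for a given seed
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# Toy stand-in for the public dev set (hypothetical question ids).
toy_dev = [f"q{i}" for i in range(11)]
dev_part, test_part = split_dev_set(toy_dev)
print(len(dev_part), len(test_part))
```

With an odd number of examples the two parts differ in size by one, and no example appears in both parts.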

# Dev results:

"eval_exact_match": 88.15914715400723,

"eval_f1": 93.91715796563734,

"eval_samples": 5291

# Test results:

"test_exact_match": 86.52455272173582,

"test_f1": 92.92134442432088,

"predict_samples": 5294
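
For readers unfamiliar with the two metrics, here is a minimal sketch of how exact match and F1 are usually computed for SQuAD v1.1, assuming the reported figures follow the standard definitions (lowercasing, stripping punctuation and the articles "a", "an", "the", then comparing tokens). The card does not include the evaluation code, so the helper names and the toy predictions below are illustrative only.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction, reference):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def score(predictions, references):
    """Average both metrics as percentages.

    predictions: list of strings.
    references: list of lists of acceptable answers; the best-scoring
    reference is used for each prediction.
    """
    n = len(predictions)
    em = sum(max(exact_match(p, r) for r in refs)
             for p, refs in zip(predictions, references))
    f1 = sum(max(f1_score(p, r) for r in refs)
             for p, refs in zip(predictions, references))
    return {"exact_match": 100.0 * em / n, "f1": 100.0 * f1 / n}


# Toy data, not taken from the model's actual outputs.
toy_predictions = ["the Eiffel Tower", "in 1889"]
toy_references = [["Eiffel Tower"], ["1889"]]
result = score(toy_predictions, toy_references)
print(result)
```

In the toy data the first prediction matches exactly after normalization, while the second only overlaps with its reference on one token, so it earns partial F1 credit and no exact-match credit.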

More information is available in the paper:

**MetaQA: Combining Expert Agents for Multi-Skill Question Answering**

https://arxiv.org/abs/2112.01922