---
license: mit
tags:
- generated_from_trainer
datasets:
- squad_v2
model-index:
- name: muppet-roberta-base-finetuned-squad
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# muppet-roberta-base-finetuned-squad

This model is a fine-tuned version of [facebook/muppet-roberta-base](https://huggingface.co/facebook/muppet-roberta-base) on the squad_v2 dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9017

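
For readers who want to try the checkpoint, here is a minimal sketch of extractive question answering with the Transformers `pipeline` API. The repository id `janeel/muppet-roberta-base-finetuned-squad` is taken from the Model Recycling section of this card; the helper name `answer_question` is a hypothetical illustration and not part of the original training code. Because squad_v2 includes unanswerable questions, the sketch passes `handle_impossible_answer=True` so that the pipeline is allowed to return an empty answer.

```python
# Hypothetical helper; the repo id comes from the Model Recycling section of this card.
MODEL_ID = "janeel/muppet-roberta-base-finetuned-squad"


def answer_question(question: str, context: str) -> dict:
    """Run extractive QA over `context` and return the pipeline's result dict."""
    # Imported lazily so this module loads even where transformers is not installed.
    from transformers import pipeline

    qa = pipeline("question-answering", model=MODEL_ID, tokenizer=MODEL_ID)
    # squad_v2 contains unanswerable questions, so allow an empty answer.
    return qa(question=question, context=context, handle_impossible_answer=True)
```

Calling `answer_question("Who wrote the report?", "The report was written by Ada.")` downloads the weights on first use and returns a dictionary with `answer`, `score`, `start`, and `end` keys. An application that answers many questions would normally build the pipeline once and reuse it.
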
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2

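
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step. It uses the `learning_rate` listed above and the final step count (16478) reported in the training results table. The zero warmup is an assumption, since this card lists no warmup setting, and the names `BASE_LR`, `TOTAL_STEPS`, `WARMUP_STEPS`, and `linear_lr` are illustrative.

```python
BASE_LR = 2e-05       # learning_rate from the hyperparameter list
TOTAL_STEPS = 16478   # final step reported in the training results table
WARMUP_STEPS = 0      # assumption: the card lists no warmup setting


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Start of training, end of epoch 1, end of epoch 2.
for step in (0, 8239, 16478):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, is roughly halved by the end of the first epoch, and reaches zero at the last step.
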
### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.7007        | 1.0   | 8239  | 0.7905          |
| 0.4719        | 2.0   | 16478 | 0.9017          |

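
The training results table shows the training loss falling while the validation loss rises between epochs, and the evaluation loss reported at the top of this card (0.9017) is the final-epoch value. The sketch below picks the epoch with the lowest validation loss and derives a rough upper bound on the number of training features per epoch. It assumes a single device and no gradient accumulation, neither of which is stated in this card, and loss alone says nothing about exact-match or F1 scores, which the card does not report.

```python
# (epoch, step, training_loss, validation_loss), copied from the training results table
history = [
    (1.0, 8239, 0.7007, 0.7905),
    (2.0, 16478, 0.4719, 0.9017),
]
TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameters

# Epoch with the lowest validation loss.
best = min(history, key=lambda row: row[3])

# Steps in the first epoch times the batch size gives an upper bound,
# assuming one device and no gradient accumulation (the last batch may be partial).
steps_per_epoch = history[0][1]
max_features_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

print(f"lowest validation loss: {best[3]} at epoch {best[0]}")
print(f"at most {max_features_per_epoch} training features per epoch")
```
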
### Framework versions

- Transformers 4.20.0
- Pytorch 1.11.0+cu113
- Datasets 2.3.2
- Tokenizers 0.12.1

## Model Recycling

[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=1.81&mnli_lp=nan&20_newsgroup=-0.39&ag_news=-0.10&amazon_reviews_multi=0.58&anli=3.25&boolq=3.69&cb=14.38&cola=-1.65&copa=13.30&dbpedia=0.47&esnli=0.34&financial_phrasebank=0.49&imdb=0.22&isear=0.48&mnli=-0.43&mrpc=1.59&multirc=3.04&poem_sentiment=3.56&qnli=0.29&qqp=0.29&rotten_tomatoes=2.29&rte=11.35&sst2=1.87&sst_5bins=1.47&stsb=1.38&trec_coarse=-0.11&trec_fine=2.84&tweet_ev_emoji=0.16&tweet_ev_emotion=0.37&tweet_ev_hate=1.48&tweet_ev_irony=8.54&tweet_ev_offensive=0.33&tweet_ev_sentiment=0.82&wic=4.74&wnli=-15.35&wsc=0.19&yahoo_answers=-0.47&model_name=janeel%2Fmuppet-roberta-base-finetuned-squad&base_name=roberta-base) using janeel/muppet-roberta-base-finetuned-squad as a base model yields an average score of 78.04, compared with 76.22 for roberta-base.

The model is ranked 2nd among all tested models for the roberta-base architecture as of 21/12/2022.

Results:

|   20_newsgroup |   ag_news |   amazon_reviews_multi |    anli |   boolq |      cb |    cola |   copa |   dbpedia |   esnli |   financial_phrasebank |   imdb |   isear |    mnli |    mrpc |   multirc |   poem_sentiment |    qnli |     qqp |   rotten_tomatoes |     rte |    sst2 |   sst_5bins |    stsb |   trec_coarse |   trec_fine |   tweet_ev_emoji |   tweet_ev_emotion |   tweet_ev_hate |   tweet_ev_irony |   tweet_ev_offensive |   tweet_ev_sentiment |     wic |    wnli |     wsc |   yahoo_answers |
|---------------:|----------:|-----------------------:|--------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|--------:|----------------:|
|        84.8911 |   89.6667 |                  67.16 | 53.5937 | 82.3853 | 82.1429 | 81.8792 |     62 |   77.7667 | 91.3375 |                   85.6 | 94.116 | 72.9465 | 86.5541 | 89.4608 |   64.2533 |             87.5 | 92.6963 | 91.0017 |           90.7129 | 83.7545 | 95.9862 |     58.1448 | 91.2944 |            97 |        90.6 |           46.464 |            82.1956 |         54.3771 |           80.102 |              84.8837 |              71.8496 | 70.2194 | 39.4366 | 63.4615 |         71.9333 |

For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
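
The average quoted in this section can be checked against the per-dataset results. The sketch below copies the 36 scores from the results table in column order and compares their mean with the roberta-base average of 76.22; the variable names are illustrative.

```python
from statistics import mean

# Per-dataset scores copied from the results table, in column order.
scores = [
    84.8911, 89.6667, 67.16, 53.5937, 82.3853, 82.1429,
    81.8792, 62, 77.7667, 91.3375, 85.6, 94.116,
    72.9465, 86.5541, 89.4608, 64.2533, 87.5, 92.6963,
    91.0017, 90.7129, 83.7545, 95.9862, 58.1448, 91.2944,
    97, 90.6, 46.464, 82.1956, 54.3771, 80.102,
    84.8837, 71.8496, 70.2194, 39.4366, 63.4615, 71.9333,
]
ROBERTA_BASE_AVG = 76.22  # baseline average quoted in this section

avg = mean(scores)
print(f"datasets: {len(scores)}")
print(f"average: {avg:.2f}")
print(f"gain over roberta-base: {avg - ROBERTA_BASE_AVG:.2f}")
```

The mean of the table values rounds to the quoted 78.04. The gain computed from the two rounded averages can differ in the last digit from the `avg=1.81` value in the evaluation link, presumably because that value is an average of unrounded per-dataset gains.
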