eladven committed on
Commit
ebf8984
1 Parent(s): 07ff0b3

Evaluation results for janeel/muppet-roberta-base-finetuned-squad model as a base model for other tasks


As part of a research effort to identify high-quality models on Hugging Face that can serve as base models for further fine-tuning, we evaluated this model by fine-tuning it on 36 datasets. The model ranks 2nd among all tested models for the roberta-base architecture as of 21/12/2022.


To share this information with others in your model card, please add the following evaluation results to your README.md page.

For more information, please see https://ibm.github.io/model-recycling/ or contact me.

Best regards,
Elad Venezian
[email protected]
IBM Research AI

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -57,3 +57,17 @@ The following hyperparameters were used during training:
  - Pytorch 1.11.0+cu113
  - Datasets 2.3.2
  - Tokenizers 0.12.1
+
+ ## Model Recycling
+
+ [Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=1.81&mnli_lp=nan&20_newsgroup=-0.39&ag_news=-0.10&amazon_reviews_multi=0.58&anli=3.25&boolq=3.69&cb=14.38&cola=-1.65&copa=13.30&dbpedia=0.47&esnli=0.34&financial_phrasebank=0.49&imdb=0.22&isear=0.48&mnli=-0.43&mrpc=1.59&multirc=3.04&poem_sentiment=3.56&qnli=0.29&qqp=0.29&rotten_tomatoes=2.29&rte=11.35&sst2=1.87&sst_5bins=1.47&stsb=1.38&trec_coarse=-0.11&trec_fine=2.84&tweet_ev_emoji=0.16&tweet_ev_emotion=0.37&tweet_ev_hate=1.48&tweet_ev_irony=8.54&tweet_ev_offensive=0.33&tweet_ev_sentiment=0.82&wic=4.74&wnli=-15.35&wsc=0.19&yahoo_answers=-0.47&model_name=janeel%2Fmuppet-roberta-base-finetuned-squad&base_name=roberta-base) using janeel/muppet-roberta-base-finetuned-squad as a base model yields average score of 78.04 in comparison to 76.22 by roberta-base.
+
+ The model is ranked 2nd among all tested models for the roberta-base architecture as of 21/12/2022
+ Results:
+
+ | 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
+ |---------------:|----------:|-----------------------:|--------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|--------:|----------------:|
+ | 84.8911 | 89.6667 | 67.16 | 53.5937 | 82.3853 | 82.1429 | 81.8792 | 62 | 77.7667 | 91.3375 | 85.6 | 94.116 | 72.9465 | 86.5541 | 89.4608 | 64.2533 | 87.5 | 92.6963 | 91.0017 | 90.7129 | 83.7545 | 95.9862 | 58.1448 | 91.2944 | 97 | 90.6 | 46.464 | 82.1956 | 54.3771 | 80.102 | 84.8837 | 71.8496 | 70.2194 | 39.4366 | 63.4615 | 71.9333 |
+
+
+ For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
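
As a quick sanity check on the numbers in the added README section, the sketch below recomputes the average from the per-dataset scores in the results table. It assumes the reported average is a plain unweighted mean over the 36 datasets, which the commit does not state explicitly; the names `scores`, `average`, and `baseline_average` are illustrative and not part of the model-recycling tooling.

```python
# Per-dataset scores copied from the results table in the added README section.
scores = {
    "20_newsgroup": 84.8911, "ag_news": 89.6667, "amazon_reviews_multi": 67.16,
    "anli": 53.5937, "boolq": 82.3853, "cb": 82.1429, "cola": 81.8792,
    "copa": 62, "dbpedia": 77.7667, "esnli": 91.3375,
    "financial_phrasebank": 85.6, "imdb": 94.116, "isear": 72.9465,
    "mnli": 86.5541, "mrpc": 89.4608, "multirc": 64.2533,
    "poem_sentiment": 87.5, "qnli": 92.6963, "qqp": 91.0017,
    "rotten_tomatoes": 90.7129, "rte": 83.7545, "sst2": 95.9862,
    "sst_5bins": 58.1448, "stsb": 91.2944, "trec_coarse": 97,
    "trec_fine": 90.6, "tweet_ev_emoji": 46.464, "tweet_ev_emotion": 82.1956,
    "tweet_ev_hate": 54.3771, "tweet_ev_irony": 80.102,
    "tweet_ev_offensive": 84.8837, "tweet_ev_sentiment": 71.8496,
    "wic": 70.2194, "wnli": 39.4366, "wsc": 63.4615, "yahoo_answers": 71.9333,
}

# Assumption: the headline figure is an unweighted mean over all datasets.
average = sum(scores.values()) / len(scores)

# Baseline average for roberta-base, as quoted in the README text (already rounded).
baseline_average = 76.22

print(f"datasets: {len(scores)}")
print(f"average score: {average:.2f}")
print(f"approximate gain over roberta-base: {average - baseline_average:.2f}")
```

Under that assumption the recomputed mean agrees with the 78.04 quoted in the README. The gain computed here starts from the rounded baseline, so it can differ in the last digit from the `avg` value in the chart URL.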