GPT-3.5 HumanEval_R CodeForces2305 contamination based on https://arxiv.org/abs/2402.15938 (#28)
- GPT-3.5 HumanEval_R CodeForces2305 contamination based on https://arxiv.org/abs/2402.15938 (42e416f06292d2cdca4d5374feb3737679a43426)
- Add PR number + postprocessing (1e54760087228b8b8102b4ef47180ea90571f032)
Co-authored-by: Suryansh Sharma <[email protected]>
- contamination_report.csv +6 -0
contamination_report.csv
@@ -6,6 +6,9 @@ Anagrams 1;;GPT-3;;model;;3.0;;data-based;https://arxiv.org/abs/2005.14165;13
 
 Anagrams 2;;GPT-3;;model;;7.0;;data-based;https://arxiv.org/abs/2005.14165;13
 
+CodeForces2305;;GPT-3.5-turbo;0613;model;;;0.0;model-based;https://arxiv.org/abs/2402.15938;28
+CodeForces2305;;GPT-3.5-turbo;1106;model;;;0.0;model-based;https://arxiv.org/abs/2402.15938;28
+
 Cycled Letters;;GPT-3;;model;;1.0;;data-based;https://arxiv.org/abs/2005.14165;13
 
 EdinburghNLP/xsum;;GPT-3.5;;model;0.0;;100.0;model-based;https://arxiv.org/abs/2308.08493;3
@@ -17,6 +20,9 @@ EdinburghNLP/xsum;;allenai/c4;;corpus;;;15.49;data-based;https://arxiv.org/abs/2
 
 EleutherAI/hendrycks_math;;GPT-4;;model;100.0;;;data-based;https://arxiv.org/abs/2303.08774;11
 
+HumanEval_R;;GPT-3.5-turbo;0613;model;;;9.76;model-based;https://arxiv.org/abs/2402.15938;28
+HumanEval_R;;GPT-3.5-turbo;1106;model;;;10.97;model-based;https://arxiv.org/abs/2402.15938;28
+
 RadNLI;;GPT-3.5;;model;0.0;0.0;0.0;model-based;https://arxiv.org/abs/2308.08493;8
 RadNLI;;GPT-4;;model;0.0;0.0;0.0;model-based;https://arxiv.org/abs/2308.08493;8
 
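For anyone checking the added rows by hand, here is a minimal Python sketch that parses them as semicolon-delimited records. The rows are copied from the diff above. The column names are an assumption, since the header row of contamination_report.csv is not part of this diff; in particular, reading the three numeric fields as train/dev/test percentages is a guess based on their position.

```python
import csv
import io

# The non-blank rows added by this PR, copied verbatim from the diff,
# plus the blank separator line that follows each group.
ADDED_ROWS = """\
CodeForces2305;;GPT-3.5-turbo;0613;model;;;0.0;model-based;https://arxiv.org/abs/2402.15938;28
CodeForces2305;;GPT-3.5-turbo;1106;model;;;0.0;model-based;https://arxiv.org/abs/2402.15938;28

HumanEval_R;;GPT-3.5-turbo;0613;model;;;9.76;model-based;https://arxiv.org/abs/2402.15938;28
HumanEval_R;;GPT-3.5-turbo;1106;model;;;10.97;model-based;https://arxiv.org/abs/2402.15938;28

"""

# ASSUMPTION: hypothetical column names. The real header is not shown in
# this diff, so these labels are for illustration only.
ASSUMED_COLUMNS = [
    "dataset", "subset", "source", "version", "source_type",
    "train_pct", "dev_pct", "test_pct", "approach", "reference", "pr",
]


def parse_rows(text):
    """Parse semicolon-delimited rows into dicts, skipping blank lines."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=";"):
        if not fields:
            # csv.reader yields an empty list for a blank separator line.
            continue
        if len(fields) != len(ASSUMED_COLUMNS):
            raise ValueError(f"expected {len(ASSUMED_COLUMNS)} fields, got {len(fields)}: {fields}")
        rows.append(dict(zip(ASSUMED_COLUMNS, fields)))
    return rows


rows = parse_rows(ADDED_ROWS)
for row in rows:
    print(row["dataset"], row["source"], row["version"], row["test_pct"])
```

Every added row has eleven fields and ends in the PR number 28, which is a quick consistency check for the "Add PR number + postprocessing" commit.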