Commit a0299eb
Parent(s): 1b9b03d

Adding Evaluation Results (#5)

- Adding Evaluation Results (c7721bc2f76131a9fa8c93a27186aad7a8da25da)

Co-authored-by: Open LLM Leaderboard PR Bot <[email protected]>
    	
        README.md
    CHANGED
    
@@ -128,3 +128,17 @@ The following hyperparameters were used during training:
 - Pytorch 1.10.0+cu113
 - Datasets 2.5.1
 - Tokenizers 0.12.1
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_postbot__gpt2-medium-emailgen)
+
+| Metric                | Value                     |
+|-----------------------|---------------------------|
+| Avg.                  | 25.97   |
+| ARC (25-shot)         | 26.45          |
+| HellaSwag (10-shot)   | 34.31    |
+| MMLU (5-shot)         | 24.1         |
+| TruthfulQA (0-shot)   | 43.96   |
+| Winogrande (5-shot)   | 50.43   |
+| GSM8K (5-shot)        | 0.0        |
+| DROP (3-shot)         | 2.53         |
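
The `Avg.` row in the added table appears to be the unweighted mean of the seven benchmark scores listed beneath it. A minimal sketch to check that arithmetic follows, assuming equal weighting and rounding to two decimals; the names `scores` and `leaderboard_average` are illustrative and do not come from the leaderboard's own code.

```python
# Benchmark scores copied from the table added in this commit.
scores = {
    "ARC (25-shot)": 26.45,
    "HellaSwag (10-shot)": 34.31,
    "MMLU (5-shot)": 24.1,
    "TruthfulQA (0-shot)": 43.96,
    "Winogrande (5-shot)": 50.43,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 2.53,
}


def leaderboard_average(benchmark_scores: dict[str, float]) -> float:
    """Return the unweighted mean of the scores, rounded to two decimals.

    Equal weighting is an assumption made for this check.
    """
    return round(sum(benchmark_scores.values()) / len(benchmark_scores), 2)


average = leaderboard_average(scores)
print(f"Avg. = {average}")
```

Under that assumption the scores sum to 181.78, and dividing by 7 gives roughly 25.969, which rounds to the 25.97 reported in the `Avg.` row.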

 
		