Jae-Won Chung committed
Commit 0787166 · 1 parent: 069d87a
Add a section in Limitations
Files changed: LEADERBOARD.md (+5 -0)
@@ -61,6 +61,11 @@ See [here](https://github.com/ml-energy/leaderboard/tree/master/sharegpt) for mo
 - `hellaswag`: [HellaSwag dataset](https://allenai.org/data/hellaswag), measuring grounded commonsense, 10 shot
 - `truthfulqa`: [TruthfulQA dataset](https://arxiv.org/abs/2109.07958), measuring truthfulness against questions that elicit common falsehoods, 0 shot
 
+## Limitations
+
+Currently, inference is run with basically bare PyTorch with batch size 1, which is unrealistic assuming a production serving scenario.
+Hence, absolute latency, throughput, and energy numbers should not be used to estimate figures in real production settings, while relative comparison makes some sense.
+
 ## Upcoming
 
 - Within the Summer, we'll add an LLM Arena for energy consumption!
