license: apache-2.0
WIP
This is an early experiment with LASER (Layer-Selective Rank Reduction) and its influence on language understanding.
Updates will follow.
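As a rough illustration of the idea behind LASER-style rank reduction (assuming that is what "laser" refers to here): a weight matrix is replaced by a low-rank approximation obtained from a truncated SVD. The sketch below uses NumPy on a random matrix; the function name `low_rank_approx`, the `keep_fraction` parameter, and the matrix shape are hypothetical and do not reflect the settings used for this model.

```python
import numpy as np


def low_rank_approx(weight: np.ndarray, keep_fraction: float) -> np.ndarray:
    """Return a rank-k approximation of `weight` via truncated SVD.

    `keep_fraction` (hypothetical parameter) is the share of singular
    values to keep; at least one is always kept.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    k = max(1, int(len(s) * keep_fraction))
    # Scale the first k left singular vectors by their singular values,
    # then project back with the first k right singular vectors.
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Toy stand-in for a single weight matrix (not real model weights).
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32))
w_reduced = low_rank_approx(w, keep_fraction=0.25)

print(w_reduced.shape, np.linalg.matrix_rank(w_reduced))
```

The reduced matrix keeps the original shape, so it can be swapped in for the original weight; only its rank changes. Which layers and matrices to reduce, and by how much, is the experimental part and is not shown here.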
German benchmarks
| German tasks: | MMLU-DE | Hellaswag-DE | ARC-DE | Average |
|---|---|---|---|---|
| Models / Few-shots: | (5 shots) | (10 shots) | (24 shots) | |
| 7B parameters | | | | |
| llama-2-7b | 0.400 | 0.513 | 0.381 | 0.431 |
| leo-hessianai-7b | 0.400 | 0.609 | 0.429 | 0.479 |
| bloom-6b4-clp-german | 0.274 | 0.550 | 0.351 | 0.392 |
| mistral-7b | 0.524 | 0.588 | 0.473 | 0.528 |
| leo-mistral-hessianai-7b | 0.481 | 0.663 | 0.485 | 0.543 |
| leo-mistral-hessianai-7b-chat | 0.458 | 0.617 | 0.465 | 0.513 |
| DPOpenHermes-7B-v2 | 0.517 | 0.603 | 0.515 | 0.545 |
| hermeo-7b | 0.511 | 0.668 | 0.528 | 0.569 |
| germeo-7b-laser (this model) | ? | ? | ? | ? |
| 13B parameters | | | | |
| llama-2-13b | 0.469 | 0.581 | 0.468 | 0.506 |
| leo-hessianai-13b | 0.486 | 0.658 | 0.509 | 0.551 |
| 70B parameters | | | | |
| llama-2-70b | 0.597 | 0.674 | 0.561 | 0.611 |
| leo-hessianai-70b | 0.653 | 0.721 | 0.600 | 0.658 |
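
The Average column appears to be the unweighted mean of the three task scores, rounded to three decimals. This is an assumption, but it matches the listed values. A minimal sketch that reproduces it for a few rows copied from the table (the `scores` dictionary and `average` helper are illustrative names only):

```python
# (MMLU-DE, Hellaswag-DE, ARC-DE) scores copied from the table above.
scores = {
    "llama-2-7b": (0.400, 0.513, 0.381),
    "hermeo-7b": (0.511, 0.668, 0.528),
    "leo-hessianai-70b": (0.653, 0.721, 0.600),
}


def average(task_scores):
    """Unweighted mean of the task scores, rounded to three decimals."""
    return round(sum(task_scores) / len(task_scores), 3)


averages = {model: average(s) for model, s in scores.items()}

for model, avg in averages.items():
    print(f"{model}: {avg:.3f}")
```

Once scores for germeo-7b-laser are available, the same helper can be used to fill in its Average cell.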