---
datasets:
- togethercomputer/RedPajama-Data-V2
language:
- de
pipeline_tag: text-generation
---
# German Tinyllama-120M

This is a German Tinyllama 120M language model trained from scratch with the Tinyllama codebase on the German portion of RedPajama V2.
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("LSX-UniWue/llamchen_120m")
tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/llamchen_120m")
```
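A minimal generation sketch building on the snippet above; the prompt and sampling settings are illustrative and not part of the original card:

```python
# Hypothetical usage example: generate a short German continuation.
prompt = "Die Hauptstadt von Deutschland ist"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=40,   # illustrative decoding settings
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```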
## Performance

We evaluated our model on the SuperGLEBer benchmark.
| Task Type | Task Name | Metric | Score |
|---|---|---|---|
| Classification | NLI | Accuracy | 0.629 |
| Classification | DB Aspect | micro F1 | 0.517 |
| Sequence Tagging | NER Europarl | micro F1 | 0.538 |
| Sentence Similarity | Pawsx | Pearson | 0.489 |
| Question Answering | MLQA | F1 | 0.846 |