---
datasets:
- togethercomputer/RedPajama-Data-V2
language:
- de
pipeline_tag: text-generation
---
# German TinyLlama-120M
This is a German TinyLlama 120M language model trained from scratch using the [TinyLlama](https://github.com/jzhang38/TinyLlama) codebase on the German portion of [RedPajama V2](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2).
### Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model weights and the matching tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("LSX-UniWue/llamchen_120m")
tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/llamchen_120m")
```
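Once loaded, the model can be used like any other `transformers` causal LM. The snippet below is a minimal generation sketch; the prompt and sampling parameters are illustrative assumptions, not settings recommended by the model authors.

```python
# Minimal generation sketch; prompt and sampling settings are assumptions
inputs = tokenizer("Die Hauptstadt von Deutschland ist", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```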
### Performance
We evaluated our model on the [SuperGLEBer](https://lsx-uniwue.github.io/SuperGLEBer-site/) benchmark.
| Task Type | Task Name | Metric | Score |
|---------------------|--------------|----------|-------|
| Classification | NLI | Accuracy | 0.629 |
| Classification | DB Aspect | micro F1 | 0.517 |
| Sequence Tagging | NER Europarl | micro F1 | 0.538 |
| Sentence Similarity | Pawsx | Pearson | 0.489 |
| Question Answering | MLQA | F1 | 0.846 |