---
title: README
emoji: 
colorFrom: pink
colorTo: red
sdk: static
pinned: false
---

# 🇩🇪 Welcome to GERTuraX

* GERTuraX is a series of pretrained encoder-only language models for German.

* The models are ELECTRA-based and pretrained with the [TEAMS](https://aclanthology.org/2021.findings-acl.219/) approach on the [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX) corpus.

* In total, three models were trained and released, with pretraining corpus sizes ranging from 147 GB to 1.1 TB.

# 📊 Models

The following models are available (a short usage sketch follows the list):

* [GERTuraX-1](https://huggingface.co/gerturax/gerturax-1)
* [GERTuraX-2](https://huggingface.co/gerturax/gerturax-2)
* [GERTuraX-3](https://huggingface.co/gerturax/gerturax-3)
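
As a minimal sketch, the checkpoints can be loaded with the 🤗 Transformers `Auto*` classes (assuming the repos ship a standard ELECTRA-style config and tokenizer; adjust the model name as needed):

```python
# Minimal usage sketch: load a GERTuraX checkpoint and extract token embeddings.
# Assumes `transformers` and `torch` are installed and the checkpoints expose a
# standard Auto* configuration.
from transformers import AutoTokenizer, AutoModel

model_name = "gerturax/gerturax-3"  # or gerturax/gerturax-1, gerturax/gerturax-2

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode a German sentence and inspect the contextual embeddings.
inputs = tokenizer("GERTuraX ist ein Encoder-Modell für Deutsch.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```

As encoder-only models, they are intended to be fine-tuned on downstream German NLP tasks (e.g. classification or tagging) rather than used for text generation.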


# ❤️ Acknowledgements

GERTuraX is the outcome of the last 12 months of working with TPUs from the awesome [TRC program](https://sites.research.google/trc/about/) and with the [TensorFlow Model Garden](https://github.com/tensorflow/models) library.

Many thanks for providing TPUs!

Made in the Bavarian Oberland with ❤️ and 🥨.