|
--- |
|
language: |
|
- de |
|
library_name: transformers |
|
base_model: google/mt5-base
|
datasets: |
|
- DEplain/DEplain-APA-sent |
|
metrics: |
|
- sari |
|
- bleu |
|
- bertscore |
|
pipeline_tag: text2text-generation |
|
--- |
|
# Model Card for mT5-base-trimmed_deplain-apa |
|
|
|
Fine-tuned mT5 model for German sentence-level text simplification.
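
A minimal usage sketch with the `transformers` pipeline. The repo id below is a placeholder; adjust it to wherever this checkpoint is actually hosted, and note that it assumes the adapter was merged into a full text2text-generation checkpoint:

```python
from transformers import pipeline

simplifier = pipeline(
    "text2text-generation",
    model="DEplain/mT5-base-trimmed_deplain-apa",  # hypothetical repo id
)

complex_sentence = "Die Novellierung des Gesetzes bedarf der Zustimmung des Bundesrates."
print(simplifier(complex_sentence, max_new_tokens=128)[0]["generated_text"])
```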
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
- **Model type:** Encoder-decoder transformer
|
- **Language(s) (NLP):** German |
|
- **Finetuned from model:** [google/mt5-base](https://huggingface.co/google/mt5-base)
|
- **Task:** Text simplification
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
[DEplain/DEplain-APA-sent](https://huggingface.co/datasets/DEplain/DEplain-APA-sent) \ |
|
Stodden et al. (2023): [arXiv:2305.18939](https://arxiv.org/abs/2305.18939)
|
|
|
### Training Procedure |
|
|
|
Parameter-efficient fine-tuning with LoRA. Before fine-tuning, the vocabulary was trimmed to the 32,000 most frequent German tokens; a sketch of the trimming step follows below.
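
A minimal sketch of the trimming step, assuming the ids of the kept tokens have already been computed (the `keep_ids` stand-in below is hypothetical, and the tokenizer's SentencePiece model must be trimmed consistently):

```python
import torch
from transformers import MT5ForConditionalGeneration

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")

# Stand-in: in practice these are the ids of mT5's special tokens plus the
# 32,000 tokens most frequent in a German corpus.
keep_ids = list(range(32_000))

# Trim the input embeddings to the kept rows.
old_in = model.get_input_embeddings().weight.data
new_in = torch.nn.Embedding(len(keep_ids), old_in.size(1))
new_in.weight.data.copy_(old_in[keep_ids])
model.set_input_embeddings(new_in)

# mT5's LM head is not tied to the input embeddings, so trim it as well.
old_out = model.lm_head.weight.data
new_out = torch.nn.Linear(old_out.size(1), len(keep_ids), bias=False)
new_out.weight.data.copy_(old_out[keep_ids])
model.lm_head = new_out

model.config.vocab_size = len(keep_ids)
```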
|
|
|
|
|
#### Training Hyperparameters |
|
* Batch Size: 16 |
|
* Epochs: 1 |
|
* Learning Rate: 0.001 |
|
* Optimizer: Adafactor |
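
A sketch of how these settings map onto Hugging Face `Seq2SeqTrainingArguments`; the output directory is a placeholder, and whether the batch size of 16 is per device or effective is an assumption:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-trimmed_deplain-apa",  # hypothetical path
    per_device_train_batch_size=16,             # assumed per-device
    num_train_epochs=1,
    learning_rate=1e-3,
    optim="adafactor",  # Adafactor via the built-in optimizer option
)
```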
|
|
|
#### LoRA Hyperparameters |
|
* r (rank): 32
|
* Alpha: 64 |
|
* Dropout: 0.1 |
|
* Target modules: all linear layers |
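
A minimal sketch of the corresponding `peft` configuration; the `"all-linear"` shortcut for targeting all linear layers requires a recent `peft` version:

```python
from peft import LoraConfig, TaskType, get_peft_model

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules="all-linear",  # adapt all linear layers
)

peft_model = get_peft_model(model, lora_config)  # `model`: the trimmed mT5
peft_model.print_trainable_parameters()
```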