Update README.md
README.md CHANGED
@@ -28,20 +28,22 @@ Finetuned mT5-Model for German sentence-level text-simplification.
 
 ### Training Data
 
-[DEplain/DEplain-APA-sent](https://huggingface.co/datasets/DEplain/DEplain-APA-sent)
+[DEplain/DEplain-APA-sent](https://huggingface.co/datasets/DEplain/DEplain-APA-sent) \
+Stodden et al. (2023): [arXiv:2305.18939](https://arxiv.org/abs/2305.18939)
 
 ### Training Procedure
 
-Parameter-efficient Fine-Tuning with LoRA
+Parameter-efficient Fine-Tuning with LoRA. Vocabulary trimmed to the 32,000 most frequent tokens for German.
+
 
 #### Training Hyperparameters
 * Batch Size: 16
 * Epochs: 1
-* Learning Rate: 0
+* Learning Rate: 0.001
 * Optimizer: Adafactor
 
 #### LoRA Hyperparameters
 * R: 32
 * Alpha: 64
-* Dropout:
+* Dropout: 0.1
 * Target modules: all linear layers
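For context, the training data can be pulled with the 🤗 `datasets` library. A minimal sketch; the split and column names are assumptions, so check the dataset card (access may also require accepting the dataset's terms on the Hub):

```python
from datasets import load_dataset

# DEplain-APA-sent: aligned complex/simplified German sentence pairs.
# Split and column names below are assumptions; see the dataset card.
dataset = load_dataset("DEplain/DEplain-APA-sent")
print(dataset["train"][0])  # expected keys along the lines of "original" / "simplification"
```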
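The hyperparameters in the new README translate almost one-to-one into a `peft` + `transformers` setup. A minimal sketch under stated assumptions, not the authors' training script: the card does not name the exact mT5 size, so `google/mt5-base` is a placeholder, and "all linear layers" is spelled out here as the linear modules of an mT5 block:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from transformers.optimization import Adafactor
from peft import LoraConfig, TaskType, get_peft_model

# Base checkpoint is an assumption; the card does not name the mT5 size.
model_name = "google/mt5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# LoRA hyperparameters from the card: R=32, Alpha=64, Dropout=0.1,
# applied to every linear layer of the mT5 blocks.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["q", "k", "v", "o", "wi_0", "wi_1", "wo"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity check: only LoRA weights train

# Training hyperparameters from the card: Adafactor at LR 0.001;
# the training loop itself (batch size 16, 1 epoch) is omitted here.
optimizer = Adafactor(
    model.parameters(),
    lr=1e-3,
    scale_parameter=False,
    relative_step=False,
)
```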
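The vocabulary trimming mentioned under Training Procedure is a common trick for mT5, whose ~250k-token vocabulary dominates the parameter count. The card does not describe the exact procedure; a rough sketch of one way to do it, reusing `dataset` from above and operating on the base model before LoRA is applied: count token frequencies over the German training text, keep the 32,000 most frequent ids, and slice the embedding matrix (and mT5's untied LM head) down to those rows:

```python
from collections import Counter

import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Same placeholder checkpoint as in the sketch above.
tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base")

# Count how often each token id occurs in the German training text.
counts = Counter()
for example in dataset["train"]:
    counts.update(tokenizer(example["original"]).input_ids)
    counts.update(tokenizer(example["simplification"]).input_ids)

# Keep the 32,000 most frequent ids (special tokens must be retained too).
keep = torch.tensor(sorted(i for i, _ in counts.most_common(32_000)))

# Slice the shared embedding matrix down to the kept rows.
d_model = base_model.config.d_model
new_emb = torch.nn.Embedding(len(keep), d_model)
new_emb.weight.data = base_model.get_input_embeddings().weight.data[keep].clone()
base_model.set_input_embeddings(new_emb)

# mT5's LM head is untied, so it needs the same row slicing.
new_head = torch.nn.Linear(d_model, len(keep), bias=False)
new_head.weight.data = base_model.lm_head.weight.data[keep].clone()
base_model.lm_head = new_head
base_model.config.vocab_size = len(keep)
# The SentencePiece tokenizer must also be rebuilt to map onto the
# reduced id space; that remapping is omitted here.
```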