Update README.md
README.md
CHANGED
@@ -51,10 +51,11 @@ classifier(sequence_to_classify, candidate_labels, hypothesis_template=hypothesi
 
 ## Training
 This model was pre-trained on 100 languages and then further trained on 198M multilingual tweets, as described in the original paper (https://arxiv.org/abs/2104.12250). It was then trained on the training set of the XNLI dataset in {insert target language}, which is a machine-translated version of the MNLI dataset. It was trained for 3 epochs with the following specifications:
-
-
-
-
+- learning rate: 5e-5
+- batch size: 32
+- max sequence length: 128
+
+on one GPU (NVIDIA GeForce RTX 3090), resulting in a training time of 1h 47 mins.
 
 
 ## Evaluation
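
The training settings added in this change (learning rate 5e-5, batch size 32, max sequence length 128, 3 epochs) could be collected in a small configuration object. The following is a minimal sketch, not the author's training script: the names `TrainingConfig` and `total_steps` are hypothetical, and the step count assumes no gradient accumulation, which the README does not state.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters listed in the README change (class name is illustrative)."""

    learning_rate: float = 5e-5
    batch_size: int = 32
    max_seq_length: int = 128
    num_epochs: int = 3

    def total_steps(self, num_examples: int) -> int:
        """Optimizer steps for a full run.

        Assumes one optimizer step per batch (no gradient accumulation) and
        that a final partial batch still counts as one step.
        """
        steps_per_epoch = math.ceil(num_examples / self.batch_size)
        return steps_per_epoch * self.num_epochs


config = TrainingConfig()
# Illustrative dataset size only; this is not the size of the XNLI training set.
print(config.total_steps(1_000))
```

Freezing the dataclass keeps the recorded settings from being changed by accident once a run has been configured.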