mongrz committed on
Commit bd00bcf · verified · 1 Parent(s): 8980e69

Add model, config, and tokenizer

Files changed (1)
  1. README.md +15 -16
README.md CHANGED
@@ -9,10 +9,6 @@ metrics:
 model-index:
 - name: model_output
   results: []
-datasets:
-- ArielUW/jobtitles
-language:
-- pl
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -20,11 +16,11 @@ should probably proofread and complete it, then remove this comment. -->
 
 # model_output
 
-This model is a fine-tuned version of [facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M) on [ArielUW/jobtitles dataset](https://huggingface.co/datasets/ArielUW/jobtitles).
+This model is a fine-tuned version of [facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0256
-- Bleu: 92.6792
-- Gen Len: 36.274
+- Loss: 3.2806
+- Bleu: 89.9171
+- Gen Len: 35.906
 
 ## Model description
 
@@ -44,20 +40,23 @@ The following hyperparameters were used during training:
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 8
-- eval_batch_size: 8
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 64
+- optimizer: Use OptimizerNames.ADAFACTOR and the args are:
+No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 2
 - mixed_precision_training: Native AMP
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
-|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
-| 0.0066        | 1.0   | 604  | 0.0320          | 91.2302 | 36.104  |
-| 0.0033        | 2.0   | 1208 | 0.0256          | 92.6792 | 36.274  |
+| Training Loss | Epoch  | Step | Validation Loss | Bleu    | Gen Len |
+|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|
+| 20.8291       | 1.0    | 76   | 3.8806          | 87.1845 | 35.44   |
+| 14.0408       | 1.9801 | 150  | 3.2806          | 89.9171 | 35.906  |
 
 
 ### Framework versions
@@ -65,4 +64,4 @@ The following hyperparameters were used during training:
 - Transformers 4.47.1
 - Pytorch 2.5.1+cu124
 - Datasets 3.2.0
-- Tokenizers 0.21.0
+- Tokenizers 0.21.0
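
For anyone who wants to try the checkpoint, here is a minimal inference sketch. It assumes the model is published as `mongrz/model_output` (a repo id inferred from the commit author and the model name, not stated in the card) and that both the source and target language are Polish, as the `pl` language tag removed from the YAML metadata suggests.

```python
# Minimal inference sketch for the fine-tuned M2M100 checkpoint.
# Assumptions: repo id "mongrz/model_output" and Polish-to-Polish generation.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_id = "mongrz/model_output"  # assumed repo id, not confirmed by the card
tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "pl"  # Polish, per the language tag removed from the metadata
# Hypothetical input: "The chairman called a meeting."
inputs = tokenizer("Prezes zwołał zebranie.", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("pl"),  # M2M100 needs an explicit target language
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```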
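
The hyperparameter list in the new revision maps directly onto `Seq2SeqTrainingArguments` in Transformers 4.47, and `total_train_batch_size: 64` is simply `train_batch_size` 16 times `gradient_accumulation_steps` 4 on a single device. A sketch under that mapping, with `output_dir` and `predict_with_generate` as illustrative additions rather than values stated in the card:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of training arguments matching the card's hyperparameter list.
training_args = Seq2SeqTrainingArguments(
    output_dir="model_output",        # illustrative; echoes the model-index name
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,    # 16 * 4 = total train batch size of 64 on one device
    seed=42,
    optim="adafactor",                # OptimizerNames.ADAFACTOR, no additional arguments
    lr_scheduler_type="linear",
    num_train_epochs=2,
    fp16=True,                        # "Native AMP" mixed-precision training
    predict_with_generate=True,       # assumed; required for the Bleu and Gen Len columns
)
```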
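
The Bleu and Gen Len columns in Trainer-generated cards like this one usually come from a sacreBLEU-based `compute_metrics` function passed to `Seq2SeqTrainer`; the actual function used for this model is not part of the commit. A representative sketch, reusing the `tokenizer` from the inference example above:

```python
import evaluate
import numpy as np

bleu = evaluate.load("sacrebleu")  # BLEU implementation from the evaluate library

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # -100 marks ignored label positions; replace it with the pad token before decoding
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    score = bleu.compute(predictions=decoded_preds,
                         references=[[ref] for ref in decoded_labels])["score"]
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": score, "gen_len": gen_len}
```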