mHossain committed on
Commit
ca5402d
·
1 Parent(s): 7b8b408

End of training

Files changed (3)
  1. README.md +23 -15
  2. generation_config.json +1 -1
  3. pytorch_model.bin +1 -1
README.md CHANGED
@@ -2,8 +2,6 @@
  base_model: csebuetnlp/mT5_m2m_crossSum
  tags:
  - generated_from_trainer
- metrics:
- - rouge
  model-index:
  - name: en_bn_summarize_v7
  results: []
@@ -16,12 +14,11 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [csebuetnlp/mT5_m2m_crossSum](https://huggingface.co/csebuetnlp/mT5_m2m_crossSum) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: nan
- - Rouge1: 0.0
- - Rouge2: 0.0
- - Rougel: 0.0
- - Rougelsum: 0.0
- - Gen Len: 28.8323
+ - Loss: 1.8058
+ - Rouge-1: 18.1261
+ - Rouge-2: 6.4386
+ - Rouge-l: 15.755
+ - Gen Len: 43.3354

  ## Model description

@@ -41,19 +38,30 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 2
- - eval_batch_size: 2
+ - train_batch_size: 4
+ - eval_batch_size: 4
  - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 8
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 5000
- - num_epochs: 1
+ - lr_scheduler_warmup_steps: 100
+ - num_epochs: 10

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
- |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
- | 0.0 | 1.0 | 615 | nan | 0.0 | 0.0 | 0.0 | 0.0 | 28.8323 |
+ | Training Loss | Epoch | Step | Validation Loss | Rouge-1 | Rouge-2 | Rouge-l | Gen Len |
+ |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:-------:|
+ | 1.8582 | 1.0 | 154 | 1.8089 | 17.2361 | 6.3031 | 15.1651 | 42.4348 |
+ | 1.6492 | 2.0 | 308 | 1.7993 | 16.9045 | 6.083 | 14.6343 | 41.472 |
+ | 1.6278 | 3.0 | 462 | 1.8006 | 16.909 | 6.1661 | 14.6043 | 43.4969 |
+ | 1.5656 | 4.0 | 616 | 1.8016 | 17.1664 | 6.3668 | 15.0702 | 42.1925 |
+ | 1.5456 | 5.0 | 770 | 1.7983 | 16.8696 | 5.9485 | 14.729 | 42.2298 |
+ | 1.5146 | 6.0 | 924 | 1.8060 | 17.2806 | 5.98 | 14.7861 | 43.3602 |
+ | 1.4575 | 7.0 | 1078 | 1.8024 | 17.6126 | 6.1446 | 15.1649 | 43.3665 |
+ | 1.4988 | 8.0 | 1232 | 1.8046 | 17.619 | 6.1422 | 15.1738 | 43.3913 |
+ | 1.4637 | 9.0 | 1386 | 1.8059 | 17.6713 | 6.2475 | 15.3152 | 44.0621 |
+ | 1.4593 | 10.0 | 1540 | 1.8058 | 18.1261 | 6.4386 | 15.755 | 43.3354 |


  ### Framework versions
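
The hyperparameter changes in this README diff can be checked with a little arithmetic. The sketch below is not taken from the training script: it assumes a single device, and it assumes that `lr_scheduler_type: linear` follows the usual linear warmup then linear decay shape. The step totals (615 for the old run, 1540 for the new one) are read from the training results tables; the helper names are hypothetical.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer update (num_devices=1 is an assumption)."""
    return per_device * accumulation_steps * num_devices


def linear_warmup_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# New run: train_batch_size 4 with gradient_accumulation_steps 2.
new_total_batch = effective_batch_size(4, 2)

# Old run: 5000 warmup steps but only 615 optimizer steps in total,
# so under these assumptions the schedule is still warming up at the last step.
old_final_lr = linear_warmup_lr(615, 2e-05, 5000, 615)

# New run: 100 warmup steps out of 1540, so the peak is reached early.
new_peak_lr = linear_warmup_lr(100, 2e-05, 100, 1540)
new_final_lr = linear_warmup_lr(1540, 2e-05, 100, 1540)

print(new_total_batch, old_final_lr, new_peak_lr, new_final_lr)
```

Under these assumptions the new settings reproduce the reported `total_train_batch_size: 8`, and the old run would have ended at roughly 2.46e-06, about an eighth of the configured `learning_rate`. Whether this is related to the `nan` validation loss in the old README cannot be determined from the diff.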
generation_config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "decoder_start_token_id": 250030,
+ "decoder_start_token_id": 250042,
  "eos_token_id": 1,
  "length_penalty": 0.6,
  "max_length": 84,
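
To confirm that only `decoder_start_token_id` differs between the two versions of this file, the visible fields can be compared programmatically. This is a minimal sketch using only the four keys shown in the hunk; the real file continues past `max_length`, so the closing brace here is an assumption, and `changed_keys` is a hypothetical helper. Which token each ID maps to is not stated in the diff, so the sketch does not guess.

```python
import json

# Only the keys visible in the diff hunk; the actual file has more entries.
OLD_FRAGMENT = """{
  "decoder_start_token_id": 250030,
  "eos_token_id": 1,
  "length_penalty": 0.6,
  "max_length": 84
}"""

NEW_FRAGMENT = """{
  "decoder_start_token_id": 250042,
  "eos_token_id": 1,
  "length_penalty": 0.6,
  "max_length": 84
}"""


def changed_keys(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    all_keys = sorted(old.keys() | new.keys())
    return {k: (old.get(k), new.get(k)) for k in all_keys if old.get(k) != new.get(k)}


changes = changed_keys(json.loads(OLD_FRAGMENT), json.loads(NEW_FRAGMENT))
print(changes)
```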
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2244e012cc34818a1338a7012480bb260d27cb3866f393819b5d1e69c56e1ba3
+ oid sha256:27104bd302f54e5ab14a2b3b8eca0af23f750bc667a411afb4ac92361e41125e
  size 2329702581
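
The `pytorch_model.bin` entry is a Git LFS pointer, not the weights themselves: the `oid` changed while the `size` stayed the same. A downloaded file could be checked against the new pointer along these lines. This is a sketch under assumptions: the function names are hypothetical, and `matches_pointer` expects a local path to the already-downloaded weights, which the diff does not provide.

```python
import hashlib

# The new pointer contents as shown in the diff.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:27104bd302f54e5ab14a2b3b8eca0af23f750bc667a411afb4ac92361e41125e
size 2329702581
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Stream the file at `path` and compare its size and hash with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Reading in chunks keeps memory use flat, which matters for a file of about 2.3 GB.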