Onely7 committed · commit 13f3c1f · verified · 1 parent: 1edc31c

Update README.md

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -90,7 +90,7 @@ For Whole Word Masking word segmentation, [vibrato](https://github.com/daac-too
  | Batch Size (tokens) | 1146880 | 2293760 |
  | Max Learning Rate | 1.0E-4 | 1.0E-4 |
  | Min Learning Rate | 1.0E-6 | N/A |
- | Learning Rate Warmup Steps | 10000 | 10000 |
+ | Learning Rate Warmup Steps | 10000 | N/A |
  | Scheduler | cosine | constant |
  | Optimizer | AdamW | AdamW |
  | Optimizer Config | beta_1 = 0.9, beta_2 = 0.999, eps = 1.0E-8 | beta_1 = 0.9, beta_2 = 0.999, eps = 1.0E-8 |
@@ -234,7 +234,7 @@ We only implemented Masked Language Modeling (MLM) during training, without Next
  | Batch Size (tokens) | 1146880 | 2293760 |
  | Max Learning Rate | 1.0E-4 | 1.0E-4 |
  | Min Learning Rate | 1.0E-6 | N/A |
- | Learning Rate Warmup Steps | 10000 | 10000 |
+ | Learning Rate Warmup Steps | 10000 | N/A |
  | Scheduler | cosine | constant |
  | Optimizer | AdamW | AdamW |
  | Optimizer Config | beta_1 = 0.9, beta_2 = 0.999, eps = 1.0E-8 | beta_1 = 0.9, beta_2 = 0.999, eps = 1.0E-8 |
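
The edit makes the two columns consistent: the second configuration uses a constant schedule, so a warmup-step count does not apply. Below is a minimal sketch of how the table's two schedules could be wired up in plain PyTorch; the model, total step count, and helper function are illustrative assumptions, not the repository's actual training code.

```python
import math
import torch

# Hyperparameters from the table; TOTAL_STEPS is an assumed placeholder,
# not a value stated in the README.
MAX_LR, MIN_LR, WARMUP_STEPS, TOTAL_STEPS = 1.0e-4, 1.0e-6, 10_000, 500_000

model = torch.nn.Linear(8, 8)  # stand-in for the actual model
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=MAX_LR,
    betas=(0.9, 0.999),  # beta_1, beta_2 from the Optimizer Config row
    eps=1.0e-8,
)

def cosine_with_warmup(step: int) -> float:
    """LR multiplier: linear warmup to MAX_LR, then cosine decay to MIN_LR."""
    if step < WARMUP_STEPS:
        return step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return (MIN_LR + (MAX_LR - MIN_LR) * cosine) / MAX_LR

# First column: cosine schedule with 10000 warmup steps, floored at MIN_LR.
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, cosine_with_warmup)

# Second column: a constant schedule simply keeps lr = MAX_LR at every step,
# so there is no warmup phase to configure, hence the "N/A" in the table.
```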