ZhangRC committed on
Commit 1a6d14f · verified
1 Parent(s): 27e0c1e

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -75,6 +75,7 @@ This model is trained on the World v2.8 with a total of 1.0 trillion tokens.
 
  - **Training regime:** bfloat16, lr 4e-4 to 1e-5 "delayed" cosine decay, wd 0.1 (with increasing batch sizes during the middle)
  - **Final Loss:** 1.9965
+ - **Token Count:** 3.119 trillion
 
  ## Evaluation