pszemraj committed
Commit a69dd7f · verified · 1 Parent(s): 398075c

End of training
Files changed (5)
  1. README.md +3 -3
  2. all_results.json +16 -0
  3. eval_results.json +10 -0
  4. train_results.json +9 -0
  5. trainer_state.json +0 -0
README.md CHANGED
@@ -15,10 +15,10 @@ should probably proofread and complete it, then remove this comment. -->
 
 # mega-ar-525m-v0.06-fw_longish-UltraTextbooks-2.1-fw_mix-v2
 
-This model is a fine-tuned version of [pszemraj/mega-ar-525m-v0.06-fw_longish](https://huggingface.co/pszemraj/mega-ar-525m-v0.06-fw_longish) on an unknown dataset.
+This model is a fine-tuned version of [pszemraj/mega-ar-525m-v0.06-fw_longish](https://huggingface.co/pszemraj/mega-ar-525m-v0.06-fw_longish) on the BEE-spoke-data/UltraTextbooks-2.1-fw_mix dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.9851
-- Accuracy: 0.5870
+- Loss: 1.9824
+- Accuracy: 0.5874
 
 ## Model description
 
all_results.json ADDED
@@ -0,0 +1,16 @@
+{
+    "epoch": 0.9999149532682873,
+    "eval_accuracy": 0.5874091441969519,
+    "eval_loss": 1.9824198484420776,
+    "eval_runtime": 86.7224,
+    "eval_samples": 400,
+    "eval_samples_per_second": 4.612,
+    "eval_steps_per_second": 1.153,
+    "perplexity": 7.26029054814878,
+    "total_flos": 6.861219031857234e+18,
+    "train_loss": 2.0199434388696393,
+    "train_runtime": 95478.6823,
+    "train_samples": 1363955,
+    "train_samples_per_second": 14.285,
+    "train_steps_per_second": 0.112
+}
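The reported perplexity is simply the exponential of the evaluation loss (mean token-level cross-entropy), so the two fields above are consistent with each other. A quick check, as an illustrative sketch rather than anything contained in the commit:

```python
import math

eval_loss = 1.9824198484420776
perplexity = math.exp(eval_loss)
print(perplexity)  # ≈ 7.2603, matching the "perplexity" field above
```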
eval_results.json ADDED
@@ -0,0 +1,10 @@
+{
+    "epoch": 0.9999149532682873,
+    "eval_accuracy": 0.5874091441969519,
+    "eval_loss": 1.9824198484420776,
+    "eval_runtime": 86.7224,
+    "eval_samples": 400,
+    "eval_samples_per_second": 4.612,
+    "eval_steps_per_second": 1.153,
+    "perplexity": 7.26029054814878
+}
train_results.json ADDED
@@ -0,0 +1,9 @@
+{
+    "epoch": 0.9999149532682873,
+    "total_flos": 6.861219031857234e+18,
+    "train_loss": 2.0199434388696393,
+    "train_runtime": 95478.6823,
+    "train_samples": 1363955,
+    "train_samples_per_second": 14.285,
+    "train_steps_per_second": 0.112
+}
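The throughput fields also look internally consistent: over this single epoch, train_samples_per_second appears to be train_samples divided by train_runtime. A small sanity-check sketch, not part of the commit:

```python
train_samples = 1_363_955
train_runtime_s = 95_478.6823

print(round(train_samples / train_runtime_s, 3))  # ≈ 14.285 samples/s, as logged
print(round(train_runtime_s / 3600, 1))           # ≈ 26.5 hours of wall-clock training
```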
trainer_state.json ADDED
The diff for this file is too large to render. See raw diff
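After this commit, the card describes a fully trained checkpoint. A minimal usage sketch, assuming the repository id matches the card title and that the checkpoint loads through the standard transformers AutoModelForCausalLM / AutoTokenizer API (nothing below is taken from the commit itself):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, inferred from the model card title in the diff above.
model_id = "pszemraj/mega-ar-525m-v0.06-fw_longish-UltraTextbooks-2.1-fw_mix-v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "The history of the calculus begins with"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```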