End of training

Files changed (5)

README.md CHANGED

@@ -15,10 +15,10 @@ should probably proofread and complete it, then remove this comment. -->
 # mega-ar-525m-v0.06-fw_longish-UltraTextbooks-2.1-fw_mix-v2
-This model is a fine-tuned version of [pszemraj/mega-ar-525m-v0.06-fw_longish](https://huggingface.co/pszemraj/mega-ar-525m-v0.06-fw_longish) on an unknown dataset.
+This model is a fine-tuned version of [pszemraj/mega-ar-525m-v0.06-fw_longish](https://huggingface.co/pszemraj/mega-ar-525m-v0.06-fw_longish) on the BEE-spoke-data/UltraTextbooks-2.1-fw_mix dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.9851
-- Accuracy: 0.5870
+- Loss: 1.9824
+- Accuracy: 0.5874
 ## Model description

all_results.json ADDED

+{
+    "epoch": 0.9999149532682873,
+    "eval_accuracy": 0.5874091441969519,
+    "eval_loss": 1.9824198484420776,
+    "eval_runtime": 86.7224,
+    "eval_samples": 400,
+    "eval_samples_per_second": 4.612,
+    "eval_steps_per_second": 1.153,
+    "perplexity": 7.26029054814878,
+    "total_flos": 6.861219031857234e+18,
+    "train_loss": 2.0199434388696393,
+    "train_runtime": 95478.6823,
+    "train_samples": 1363955,
+    "train_samples_per_second": 14.285,
+    "train_steps_per_second": 0.112
+}

eval_results.json ADDED

+{
+    "epoch": 0.9999149532682873,
+    "eval_accuracy": 0.5874091441969519,
+    "eval_loss": 1.9824198484420776,
+    "eval_runtime": 86.7224,
+    "eval_samples": 400,
+    "eval_samples_per_second": 4.612,
+    "eval_steps_per_second": 1.153,
+    "perplexity": 7.26029054814878
+}
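
As a quick sanity check on the evaluation metrics, the reported `perplexity` should equal the exponential of `eval_loss`, and `eval_samples_per_second` should equal `eval_samples / eval_runtime`. The sketch below recomputes both from the values in this file. The helper name `perplexity_from_loss` is illustrative only, and the check assumes the loss is a mean per-token cross-entropy in nats.

```python
import math

# Values copied from eval_results.json in this commit.
eval_results = {
    "eval_loss": 1.9824198484420776,
    "eval_runtime": 86.7224,
    "eval_samples": 400,
    "perplexity": 7.26029054814878,
}


def perplexity_from_loss(loss: float) -> float:
    """Hypothetical helper: exp of the mean cross-entropy loss (in nats)."""
    return math.exp(loss)


ppl = perplexity_from_loss(eval_results["eval_loss"])
samples_per_second = eval_results["eval_samples"] / eval_results["eval_runtime"]

print(f"recomputed perplexity: {ppl:.6f}")
print(f"reported perplexity:   {eval_results['perplexity']:.6f}")
print(f"eval samples/s:        {samples_per_second:.3f}")
```

If the recomputed perplexity drifts from the reported one by more than floating-point noise, the loss and perplexity were probably taken from different evaluation runs.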

train_results.json ADDED

+{
+    "epoch": 0.9999149532682873,
+    "total_flos": 6.861219031857234e+18,
+    "train_loss": 2.0199434388696393,
+    "train_runtime": 95478.6823,
+    "train_samples": 1363955,
+    "train_samples_per_second": 14.285,
+    "train_steps_per_second": 0.112
+}
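
The training figures can be cross-checked the same way. The sketch below converts `train_runtime` to hours, recomputes `train_samples_per_second`, and estimates the number of optimizer steps and samples per step. The last two are rough estimates only, because `train_steps_per_second` is rounded to three decimals in the file, and the effective batch size is not stated anywhere in this commit.

```python
# Values copied from train_results.json in this commit.
train_results = {
    "train_runtime": 95478.6823,        # seconds
    "train_samples": 1363955,
    "train_samples_per_second": 14.285,
    "train_steps_per_second": 0.112,    # rounded, so derived values are approximate
}

runtime_hours = train_results["train_runtime"] / 3600
samples_per_second = train_results["train_samples"] / train_results["train_runtime"]

# Rough estimates derived from the rounded steps/s figure (assumption, not reported).
approx_total_steps = train_results["train_steps_per_second"] * train_results["train_runtime"]
approx_samples_per_step = samples_per_second / train_results["train_steps_per_second"]

print(f"runtime:              {runtime_hours:.2f} h")
print(f"train samples/s:      {samples_per_second:.3f}")
print(f"approx. steps:        {approx_total_steps:.0f}")
print(f"approx. samples/step: {approx_samples_per_step:.1f}")
```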

trainer_state.json ADDED

The diff for this file is too large to render.