Update README.md
README.md CHANGED
@@ -7,4 +7,11 @@ sdk: static
 pinned: false
 ---
 
-Unusable models, compute optimally 🔥. We hope that buy open-sourcing our compute-optimal trained models, that others can replicate our results and also make no use out of our unusable models. These models are not useful in the slightest, and don't benefit research.
+Unusable models, compute optimally 🔥. We hope that by open-sourcing our compute-optimally trained models, others can replicate our results and also make no use of our unusable models. These models are not useful in the slightest and don't benefit research.
+
+- A-Class Models (Chinchilla-optimal): 20 training tokens per parameter.
+- B-Class Models: 42 training tokens per parameter.
+- C-Class Models: 76 training tokens per parameter.
+- D-Class Models: 142 training tokens per parameter.
+
+The B, C, and D classes are derived from the tokens-per-parameter ratios of the LLaMA models: LLaMA 65B is nearly Chinchilla-optimal at roughly 21 tokens per parameter, and descending through the smaller LLaMA model sizes and their training-set sizes gives the remaining class ratios.
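
For illustration only, here is a minimal Python sketch of the arithmetic behind these classes, assuming the published LLaMA training-set sizes (roughly 1.4T tokens for the 33B and 65B models, roughly 1T tokens for the 7B and 13B models); the function name and the 125M-parameter example are hypothetical and not part of this repository.

```python
# Sketch: where the class ratios come from and how they turn into token budgets.
# Assumed LLaMA figures (rounded): 7B and 13B trained on ~1T tokens,
# 33B and 65B on ~1.4T tokens.
LLAMA_TOKENS = {7e9: 1.0e12, 13e9: 1.0e12, 33e9: 1.4e12, 65e9: 1.4e12}

# Training tokens per model parameter for each class.
CLASS_RATIOS = {
    "A": 20,   # Chinchilla-optimal (~20 tokens per parameter)
    "B": 42,   # ~ LLaMA 33B: 1.4e12 / 33e9 ≈ 42
    "C": 76,   # ~ LLaMA 13B: 1.0e12 / 13e9 ≈ 77
    "D": 142,  # ~ LLaMA 7B:  1.0e12 / 7e9  ≈ 143
}

def training_tokens(params_millions: float, model_class: str) -> float:
    """Training-set size in tokens for a model with `params_millions` million params."""
    return CLASS_RATIOS[model_class] * params_millions * 1e6

if __name__ == "__main__":
    # Derived LLaMA ratios, to show where 42 / 76 / 142 come from.
    for params, tokens in LLAMA_TOKENS.items():
        print(f"LLaMA {params / 1e9:.0f}B: {tokens / params:.1f} tokens per parameter")

    # Hypothetical example: token budget for a 125M-parameter model in each class.
    for cls in CLASS_RATIOS:
        print(f"{cls}-Class, 125M params: {training_tokens(125, cls) / 1e9:.1f}B tokens")
```

Under these ratios, a D-Class model of a given size sees roughly seven times as much training data as an A-Class model of the same size.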