Update README.md
README.md
CHANGED
@@ -1,8 +1,8 @@
 ---
 license: apache-2.0
 ---
-| Model Name | Parameters | Class | Ratio | Tokens | Batch Size (Tokens) | Training Loss ↓ |
-| --- | --- | --- | --- | --- | --- | --- |
+| Model Name | Parameters | Class | Ratio | Tokens | Batch Size (Tokens) | Training Loss ↓ |
+| --- | --- | --- | --- | --- | --- | --- |
 | [GerbilLab/GerbilBlender-A-15m](https://hf.co/GerbilLab/GerbilBlender-A-15m) | 15m | A-Class | 20 | 280M | 131k | 4.9642 |
 
 "Blender" models, inspired by UL2 pretraining, are trained equally in fill-in-the-middle, causal modelling, and masked language modelling tasks. Special tokens for these models include: