Update README.md
README.md CHANGED
@@ -1,3 +1,13 @@
+---
+license: apache-2.0
+datasets:
+- ToastyPigeon/roselily-v0
+- PocketDoc/Dans-Systemmaxx
+- allenai/tulu-3-sft-personas-instruction-following
+- ZeusLabs/WizardLM_evol_instruct_fuzzy_dedup_sharegpt
+base_model:
+- mistralai/Mistral-Small-24B-Base-2501
+---
This is a double fine-tuned version of Mistral Small 24B Base 2501.

Stage 1 was shoving 30M tokens of human-written story content into it using completion training ([ToastyPigeon/ms3-base-roselily](https://huggingface.co/ToastyPigeon/ms3-base-roselily)), which is about half of my WIP Roselily dataset (~60M tokens total).

@@ -33,4 +43,4 @@ This model should accept (in theory) any of the following instruct formats:

The Tekken tokens were already in the tokenizer. Unused special tokens #20 and #21 were repurposed for the ChatML tokens. Fizzpaca did not add any new tokens.

-You may need to add both `</s>` and `<|im_end|>` as stop tokens for it to work properly with all formats.
+You may need to add both `</s>` and `<|im_end|>` as stop tokens for it to work properly with all formats.
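
Since the ChatML markers live in repurposed special-token slots, it's worth confirming they encode as single token ids before relying on them. A minimal sketch using `transformers`; the repo id below is a placeholder, not the actual model path:

```python
from transformers import AutoTokenizer

# Placeholder repo id -- substitute the actual model path.
tok = AutoTokenizer.from_pretrained("ToastyPigeon/model-name")

# Each marker should encode to exactly one token id; a multi-id
# result would mean it was not registered as a special token.
for text in ["<|im_start|>", "<|im_end|>", "</s>"]:
    ids = tok.encode(text, add_special_tokens=False)
    print(f"{text!r} -> {ids}")
```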
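And one way to honor the stop-token note above is to pass both end-of-turn markers as EOS ids at generation time. A sketch assuming a plain `transformers` setup; again, the repo id is a placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ToastyPigeon/model-name"  # placeholder repo id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Treat both end-of-turn markers as EOS so generation stops cleanly
# no matter which instruct format the prompt used.
stop_ids = [tok.convert_tokens_to_ids(t) for t in ["</s>", "<|im_end|>"]]

prompt = "<|im_start|>user\nSay hello.<|im_end|>\n<|im_start|>assistant\n"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64, eos_token_id=stop_ids)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Inference frontends that take stop strings rather than token ids should get both `</s>` and `<|im_end|>` in their stop list for the same reason.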