ToastyPigeon committed 0c3f464 (verified) · Parent: e562cc0

Update README.md

Files changed (1): README.md (+36 −1)
README.md CHANGED
This is a double fine-tuned version of Mistral Small 24B Base 2501.

Stage 1 was shoving 30M tokens of human-written story content into it using completion training ([ToastyPigeon/ms3-base-roselily](https://huggingface.co/ToastyPigeon/ms3-base-roselily)), which is about half of my WIP Roselily dataset (~60M tokens total).

Stage 2 was teaching it instruct (this model).

This model should accept (in theory) any of the following instruct formats:

**Tekken v7**
```
[SYSTEM_PROMPT]{system prompt}[/SYSTEM_PROMPT][INST]{user message}[/INST]{assistant response}</s>
```
**ChatML**
```
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
{assistant response}<|im_end|>
```
**Fizzpaca**
```
### System:
{system prompt}

### Instruction:
{user message}

### Response:
{assistant response}</s>
```
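If you're templating prompts by hand, the three formats above boil down to simple string assembly. Here's an illustrative sketch (the helper names are made up for this example, not part of this repo) that renders a generation prompt ending at the assistant turn, so the model writes the response and its own EOS:

```python
# Render a (system, user) pair in each of the three instruct formats
# described above. These build *generation* prompts: the assistant slot
# is left open, so no trailing </s> or <|im_end|> is appended.

def tekken_v7(system: str, user: str, assistant: str = "") -> str:
    return (f"[SYSTEM_PROMPT]{system}[/SYSTEM_PROMPT]"
            f"[INST]{user}[/INST]{assistant}")

def chatml(system: str, user: str, assistant: str = "") -> str:
    return (f"<|im_start|>system\n{system}<|im_end|>\n"
            f"<|im_start|>user\n{user}<|im_end|>\n"
            f"<|im_start|>assistant\n{assistant}")

def fizzpaca(system: str, user: str, assistant: str = "") -> str:
    return (f"### System:\n{system}\n\n"
            f"### Instruction:\n{user}\n\n"
            f"### Response:\n{assistant}")
```

For multi-turn use you'd close each completed assistant turn with the format's own terminator (`</s>` for Tekken and Fizzpaca, `<|im_end|>` for ChatML) before appending the next user turn.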
The Tekken tokens were already in the tokenizer. Unused special tokens #20 and #21 were repurposed for the ChatML tokens; Fizzpaca did not add any new tokens.

You may need to add both `</s>` and `<|im_end|>` as stop tokens for the model to work properly with all formats.
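If your frontend only supports a single stop string, you can get the same effect by trimming the output after the fact. This is a generic sketch, not code from this repo:

```python
# Stop sequences this model may emit, depending on the instruct format used.
STOP_SEQUENCES = ["</s>", "<|im_end|>"]

def trim_at_stop(text: str, stops=STOP_SEQUENCES) -> str:
    """Cut generated text at the earliest stop sequence, if any occurs."""
    hits = [p for p in (text.find(s) for s in stops) if p != -1]
    return text[:min(hits)] if hits else text
```

Passing both strings to your backend's stop-sequence parameter (where supported) is still preferable, since it halts generation early instead of discarding tokens.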