Update README.md
README.md
CHANGED
@@ -10,3 +10,39 @@ tags:
# todo

make a model card and put a cute girl on it

# some info

Making this public so it can be tried and possibly merged if desired while I work on getting the energy to write a proper card.

Short list of things to know:
- Instruct format: ChatML or Alpaca preferred, Tekken v7 possible
- ChatML tokens were assigned to unused tokens 20 and 21. This leaves all the Tekken tokens intact, so merges w/ Tekken models are feasible
- The instruct-tuning phase did include Tekken v7, so its tokens are initialized and recognized, but I did not continue with it in the creative step because I do not like it for creative stuff (too restrictive with turn order)
- Feels a little less sensitive to samplers than Instruct-based MS3 models, but should probably still be used with conservative samplers

# chat templates

You may need to set `<|im_end|>` and/or `</s>` as stopping strings, depending on which format you're using. The model generates both properly, but tokenizers can be finicky about what they stop on by default.
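
If your backend does not stop on these tokens by itself, a similar effect can be had by trimming the output after generation. A minimal Python sketch, assuming you have the raw generated text as a string; `truncate_at_stop` is a hypothetical helper name, not part of any library:

```python
def truncate_at_stop(text, stop_strings=("<|im_end|>", "</s>")):
    """Cut generated text at the earliest stop string, if any is present."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            # Keep the smallest index so the first stop string wins.
            cut = min(cut, idx)
    return text[:cut]
```

Most frontends expose a "stopping strings" or "stop sequences" setting that does the same thing during generation, which is preferable when available because it also saves tokens.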

Alpaca w/ System
```
### System:
{system prompt}

### Instruction:
{user message}

### Response:
{model answer}</s>
```
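
For scripted use, the Alpaca layout above could be assembled roughly like this. This is only a sketch: `format_alpaca` is a hypothetical helper, and the exact whitespace is read off the template above rather than taken from any training code.

```python
def format_alpaca(system_prompt, user_message):
    """Build an Alpaca-with-system prompt that ends where the model should answer."""
    return (
        f"### System:\n{system_prompt}\n\n"
        f"### Instruction:\n{user_message}\n\n"
        "### Response:\n"
    )
```

The prompt stops right after `### Response:` so the model writes the answer and, ideally, closes it with `</s>`.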

ChatML
```
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
{model answer}<|im_end|>
```
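
The ChatML layout above can be built from a list of messages in a similar way. Again a sketch under assumptions: `format_chatml` and its `add_generation_prompt` flag are hypothetical names chosen for illustration, and the newline after each `<|im_end|>` is inferred from the template above.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the ChatML layout."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its answer.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)
```

If the tokenizer shipped with the model already carries a chat template, using that is likely safer than hand-formatting, since it guarantees the special tokens are encoded as single tokens.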

The model also saw some completion training in chat mode and adventure mode.