allura-org/MS3-24B-Roselily-Creative


ToastyPigeon committed 9 days ago

Commit a1252d4 · verified · 1 Parent(s): 6e4c0f0

Update README.md

Files changed (1)

README.md +36 -0

@@ -10,3 +10,39 @@ tags:
 # todo
 make a model card and put a cute girl on it

+# some info
+Making this public so it can be tried and possibly merged if desired while I work on getting the energy to write a proper card.
+Short list of things to know:
+- Instruct format: ChatML or Alpaca preferred, Tekken v7 possible
+- ChatML tokens were assigned to unused tokens 20 and 21; this leaves all the Tekken tokens intact, so merges with Tekken models are feasible
+- The instruct-tuning phase did include Tekken v7, so its tokens are initialized and recognized, but I did not continue with it in the creative step because I do not like it for creative stuff (too restrictive with turn order)
+- Feels a little less sensitive to samplers than Instruct-based MS3 models, but should probably still be used with conservative samplers
+# chat templates
+You may need to set `<|im_end|>` and/or `</s>` as stopping strings depending on which format you're using. The model generates both properly, but tokenizers can be finicky about what they stop on by default.
+Alpaca w/ System
+```
+### System:
+{system prompt}
+### Instruction:
+{user message}
+### Response:
+{model answer}</s>
+```
+ChatML
+```
+<|im_start|>system
+{system prompt}<|im_end|>
+<|im_start|>user
+{user message}<|im_end|>
+<|im_start|>assistant
+{model answer}<|im_end|>
+```
+The model also saw some completion training in chat mode and adventure mode.
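
For anyone wiring these templates up by hand, here is a minimal Python sketch of building a single-turn prompt in either format and trimming the output at the stopping strings mentioned above. It is not part of the commit. The function names (`build_chatml_prompt`, `build_alpaca_prompt`, `truncate_at_stop`) are hypothetical, and the exact newline placement and the choice to omit the system block when no system prompt is given are assumptions read off the templates as displayed.

```python
from typing import Optional, Sequence

# Stopping strings named in the card; which one matters depends on the format.
STOP_STRINGS = ("<|im_end|>", "</s>")


def build_chatml_prompt(user_message: str, system_prompt: Optional[str] = None) -> str:
    """Render one ChatML turn and leave the assistant turn open for generation."""
    parts = []
    if system_prompt is not None:  # assumption: system block is optional
        parts.append(f"<|im_start|>system\n{system_prompt}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_message}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def build_alpaca_prompt(user_message: str, system_prompt: Optional[str] = None) -> str:
    """Render one Alpaca turn and leave the response open for generation."""
    parts = []
    if system_prompt is not None:  # assumption: system block is optional
        parts.append(f"### System:\n{system_prompt}\n")
    parts.append(f"### Instruction:\n{user_message}\n")
    parts.append("### Response:\n")
    return "".join(parts)


def truncate_at_stop(text: str, stop_strings: Sequence[str] = STOP_STRINGS) -> str:
    """Cut generated text at the earliest stopping string, if any is present."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```

Trimming on the client side like this is only a fallback for backends that do not stop on these strings themselves; if your backend accepts a list of stop strings, passing `STOP_STRINGS` there is usually the simpler route.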