Update README.md

README.md +86 -26

@@ -1,40 +1,100 @@
 ---
-base_model: []
 library_name: transformers
 tags:
-- mergekit
-- merge
 ---
-# MiS-Firefly-v0.2-22B
-This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-## Merge Details
-### Merge Method
-This model was merged using the SLERP merge method.
-### Models Merged
-The following models were included in the merge:
-* /mnt/models/checkpoint-404
-* /mnt/models/checkpoint-808
-### Configuration
-The following YAML configuration was used to produce this model:
-```yaml
-dtype: bfloat16
-models:
-  - model: /mnt/models/checkpoint-808
-  - model: /mnt/models/checkpoint-404
-merge_method: slerp
-base_model: /mnt/models/checkpoint-808
-parameters:
-  t:
-    - value: [0, 0, 0.25, 0.35, 0.4, 0.45, 0.4, 0.35, 0.25, 0, 0]
-  embed_slerp: true
 ```

 ---
 library_name: transformers
 tags:
+- not-for-all-audiences
+- axolotl
+- qlora
+language:
+- en
+license: other
 ---
+<div align="center">
+  <b style="font-size: 36px;">MiS-Firefly-v0.2-22B (GGUF)</b>
+  <img src="https://huggingface.co/invisietch/MiS-Firefly-v0.2-22B-GGUF/resolve/main/header.png" style="width:60%">
+<b>HF</b> :
+<a href="https://huggingface.co/invisietch/MiS-Firefly-v0.2-22B">FP16</a>
+  &vert;
+<b>GGUF</b> :
+<a href="https://huggingface.co/invisietch/MiS-Firefly-v0.2-22B-GGUF">Static GGUF</a>
+</div>
+# Model Details
+**This is a fix for the quantization issue in Firefly v0.1.**
+Firefly is a Mistral Small 22B finetune designed for creative writing and roleplay. The model is largely uncensored and should support
+context up to 32,768 tokens.
+The model has been tested in various roleplay scenarios up to 16k context, as well as in an assistant role. It shows broad
+competency &amp; coherence across various scenarios.
+Special thanks to <a href="https://huggingface.co/SicariusSicariiStuff">SicariusSicariiStuff</a> for bouncing ideas back &amp; forth on
+training, and <a href="https://huggingface.co/SytanSD">SytanSD</a> for quants.
+## KNOWN QUANTIZATION ISSUE
+Some quants seem to misspell complicated names.
+This doesn't happen at fp16 or q8_0, even with very unusual names and multiple swipes, which suggests something is getting lost in quantization.
+Suggested workarounds:
+- If you can, run q8_0 (I'm told this fits on a 4090 with flash attention). I haven't seen the issue in ~900 messages on q8_0.
+- If not, try some lower quants (ideally imatrix). I haven't tested them all, but the issue appears most often on Q6_K and less often on the 6.5bpw EXL2. If you find one where this doesn't happen, tell me.
+- If none of that works, use a simpler name.
+I'll try to resolve it with a light merge ASAP. It seems like the wrong weight is getting truncated in quantization, causing these issues.
+# Feedback
+I appreciate all feedback on any of my models. You can use:
+* [My Discord server](https://discord.gg/AJwZuu7Ncx) - requires Discord.
+* [The Community tab](https://huggingface.co/invisietch/MiS-Firefly-v0.1-22B/discussions) - requires HF login.
+* Discord DMs to **invisietch**.
+Your feedback is how I improve these models for future versions.
+# Disclaimer
+This model is extensively uncensored. It can generate explicit, disturbing or offensive responses. Use responsibly. I am not responsible for
+your use of this model.
+This model is a finetune of Mistral Small 22B (2409) and usage must follow the terms of Mistral's license. By downloading this model, you
+agree not to use it for commercial purposes unless you have a valid Mistral commercial license. See [the base model card](https://huggingface.co/mistralai/Mistral-Small-Instruct-2409)
+for more details.
+# Prompting Format
+I'd recommend the Mistral v2v3 prompting format:
 ```
+<s>[INST] User message here.[/INST] Bot response here</s>[INST] User message 2 here.
+```
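As a minimal sketch, the layout in the example above can be assembled programmatically. The function name `build_prompt` and the `add_bos` flag are hypothetical. Closing the final open turn with `[/INST]` is an assumption, since the card's example stops before that point. Some backends prepend `<s>` themselves, in which case `add_bos=False` may be the right choice.

```python
from typing import List, Optional, Tuple

BOS = "<s>"
EOS = "</s>"


def build_prompt(turns: List[Tuple[str, Optional[str]]], add_bos: bool = True) -> str:
    """Render (user, bot) turns in the layout shown in the card's example.

    A turn whose bot reply is None is left open so the model can complete it.
    """
    parts = [BOS] if add_bos else []
    for user, bot in turns:
        # Layout copied from the example: "[INST] " + user text + "[/INST]".
        parts.append(f"[INST] {user}[/INST]")
        if bot is not None:
            # Completed replies get a leading space and a closing </s>.
            parts.append(f" {bot}{EOS}")
    return "".join(parts)


prompt = build_prompt([
    ("User message here.", "Bot response here"),
    ("User message 2 here.", None),  # open turn (assumption: ends with [/INST])
])
print(prompt)
```

Frontends that ship a Mistral preset will normally handle this for you. The sketch is mainly useful when sending raw text to a completion endpoint.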
+# Sampler Settings
+I'm running the following sampler settings, but this is an RC and they may not be optimal.
+- **Temperature:** Dynamic 0.7-1.1
+- **Min-P:** 0.07
+- **Rep Pen:** 1.08
+- **Rep Pen Range:** 1536
+- **XTC:** 0.1/0.15
+If you get completely incoherent responses, feel free to use these as a starting point.
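For readers unfamiliar with Min-P, here is a rough sketch of what the 0.07 value does, assuming the common definition: a token is kept only if its probability is at least `min_p` times the probability of the most likely token. The function `min_p_filter` and the toy distribution are hypothetical and for illustration only. Real backends apply this to logits and may order it differently relative to temperature.

```python
from typing import Dict


def min_p_filter(probs: Dict[str, float], min_p: float = 0.07) -> Dict[str, float]:
    """Keep tokens with probability >= min_p * (top probability), then renormalize."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


# Hypothetical next-token distribution, not taken from the model.
probs = {"the": 0.60, "a": 0.25, "one": 0.11, "zebra": 0.04}
filtered = min_p_filter(probs)
print(sorted(filtered))
```

With this toy distribution the cutoff is 0.07 × 0.60 = 0.042, so only the 0.04 token is dropped. The cutoff scales with how confident the model is about its top choice.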
+# Training Strategy
+I started with a finetune of Mistral Small 22B which had been trained on the Gutenberg dataset: [nbeerbower/Mistral-Small-Gutenberg-Doppel-22B](https://huggingface.co/nbeerbower/Mistral-Small-Gutenberg-Doppel-22B).
+The first stage of my training was a single epoch at low LR over a 474 million token text completion dataset.
+I followed this up with a coherence, decensorship & roleplay finetune for two epochs over a 172 million token instruct dataset.
+Total training time was about 32hrs on 4x Nvidia A100 80GB.
+<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>