Text Generation
GGUF
English
Chinese
MOE
Qwen 2.5 MOE
Mixture of Experts
Uncensored
2X7B
deepseek
reasoning
thinking
creative
128k context
general usage
problem solving
brainstorming
solve riddles
story generation
plot generation
storytelling
fiction story
story
writing
fiction
Qwen 2.5
mergekit
Inference Endpoints
conversational
Update README.md
README.md CHANGED
@@ -41,7 +41,9 @@ creating a 19B model with the "Abliterated" (Uncensored) version of Deepseek Qwe
 
 The model is just over 19B because of the unique "shared expert" (roughly 2.5 models here) used in Qwen MOEs.
 
-
+This "oddball" configuration yields interesting "thinking/reasoning" which is stronger than either 7B model on its own.
+
+And you can use any temp settings you want (rather than a narrow range of .4 to .8), and the model will still "think/reason".
 
 Five example generations at the bottom of this page.
 
@@ -112,6 +114,17 @@ SOFTWARE patch (by me) for Silly Tavern (front end to connect to multiple AI app
 
 ---
 
+Known Issues:
+
+---
+
+From time to time the model will generate some Chinese symbols/characters, especially at higher temps. This is normal
+for DeepSeek Distill models.
+
+Reasoning/Thinking may be a little "odd" at temps 1.5+; you may need to regen to get a better response.
+
+---
+
 <h2>Example Generation:</h2>
 
 IQ4XS Quant, Temp 1.5, rep pen 1.06, topp: .95, minp: .05, topk: 40
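For reference, the sampler settings quoted above (Temp 1.5, rep pen 1.06, top_p .95, min_p .05, top_k 40) could be reproduced outside of Silly Tavern. Below is a minimal sketch assuming llama-cpp-python with a locally downloaded IQ4_XS GGUF; the file name and prompt are placeholders (not from the model card), and min_p support assumes a reasonably recent llama-cpp-python build.

```python
# Sketch only: placeholder GGUF path and prompt; assumes llama-cpp-python is installed.
from llama_cpp import Llama

llm = Llama(
    model_path="model-IQ4_XS.gguf",  # placeholder path to the IQ4_XS quant
    n_ctx=8192,                      # context window; raise if your RAM/VRAM allows
)

# Sampler settings from the example generation header:
# Temp 1.5, rep pen 1.06, top_p .95, min_p .05, top_k 40
out = llm(
    "Write the opening scene of a thriller set on a night train.",  # placeholder prompt
    max_tokens=512,
    temperature=1.5,
    repeat_penalty=1.06,
    top_p=0.95,
    min_p=0.05,
    top_k=40,
)
print(out["choices"][0]["text"])
```

Per the note above, temperatures well outside the usual .4 to .8 range should still produce "thinking/reasoning" output, though at 1.5+ a regen may occasionally be needed.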