Text Generation
GGUF
English
Chinese
MOE
Qwen 2.5 MOE
Mixture of Experts
6X1.5B
deepseek
reasoning
thinking
creative
128k context
general usage
problem solving
brainstorming
solve riddles
story generation
plot generation
storytelling
fiction story
story
writing
fiction
Qwen 2.5
mergekit
Inference Endpoints
conversational
Update README.md
In LM Studio the "Jinja Template" should load by default.

In other apps - use the Deepseek Tokenizer.

Sometimes this model will output/think in Chinese characters/symbols (with an English prompt) - regenerate to clear.

Sometimes it will work great, other times it will give "so-so" answers, and then sometimes it will bat it out of the park, and past the "state line."

It is all over the map.

Four examples below so you have some idea of what this model can do.

Keep in mind this model is six 1.5B-parameter models working together, and will not have the power of a 14B or 32B reasoning/thinking model.

Also, this model has 4/6 experts activated by default. You may want to set 6/6 experts for best results.

This model is also mastered in Float 32, which helped overall model generation and addressed some generation issues - and oddly seemed to add some new ones (? - Chinese character/symbol thinking).

A temp of .4 to .8 is suggested; however, it will still operate at much higher temps like 1.8, 2.6, etc.
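As a minimal sketch of the experts/temperature settings above, here is one way to run the GGUF with llama.cpp's `llama-cli`. The model filename and the GGUF metadata key `qwen2moe.expert_used_count` are assumptions - inspect your quant's metadata (e.g. with the `gguf` dump tools) to confirm the exact key for this merge before relying on it.

```shell
# Sketch: force all 6 experts active (default is 4/6) and use a
# mid-range temp from the suggested .4-.8 band.
# Filename and metadata key below are assumptions, not confirmed values.
./llama-cli \
  -m Qwen2.5-MOE-6x1.5B-Q8_0.gguf \
  --override-kv qwen2moe.expert_used_count=int:6 \
  --temp 0.6 \
  -p "Write the opening paragraph of a mystery story."
```

In LM Studio the equivalent settings live in the model's load/inference panels rather than on a command line.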