Text Generation
GGUF
English
Chinese
MOE
Qwen 2.5 MOE
Mixture of Experts
6X1.5B
deepseek
reasoning
thinking
creative
128k context
general usage
problem solving
brainstorming
solve riddles
story generation
plot generation
storytelling
fiction story
story
writing
fiction
Qwen 2.5
mergekit
Inference Endpoints
conversational
Update README.md
README.md
CHANGED
@@ -31,7 +31,9 @@ pipeline_tag: text-generation
 
 <H2>Qwen2.5-MOE-6x1.5B-DeepSeek-Reasoning-e32-gguf</H2>
 
-This is a highly experimental Qwen2.5 MOE (Mixture of Experts) model comprised of SIX Qwen 2.5 1.5B models.
+This is a highly experimental Qwen2.5 MOE (Mixture of Experts) model comprised of SIX Qwen 2.5 1.5B models, creating an 8.71B model.
+
+This model can be used for all use cases, and is also (mostly) uncensored.
 
 It includes the following models:
 
@@ -55,7 +57,7 @@ if your AI/LLM app can not access the "Jinja Template".
 
 In LM Studio the "Jinja Template" should load by default.
 
-In other apps - use the Deepseek Tokenizer.
+In other apps - use the DeepSeek tokenizer and/or the "Jinja Template".
 
 Sometimes this model will output/think in Chinese characters/symbols (with an English prompt) - regenerate to clear.
 
@@ -82,6 +84,10 @@ Depending on your prompt change temp SLOWLY: IE: .41,.42,.43 ... etc etc.
 
 Likewise, because these are small models, it may do a ton of "thinking"/"reasoning" and then "forget" to finish the task(s). In
 this case, prompt the model to "Complete the task XYZ with the 'reasoning plan' above".
+
+Similarly, it may function better if you break the reasoning/thinking task(s) down into smaller pieces:
+
+E.g. instead of asking for 6 plots for theme XYZ, ask it for ONE plot for theme XYZ at a time.
 
 Also set the context limit to 4K minimum; 8K+ is suggested.
 
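To make the updated usage notes concrete (8K+ context, small temperature steps around .4, one plot per request, regenerate on Chinese-character drift), here is a minimal sketch assuming the llama-cpp-python client and a hypothetical quant filename; the card itself does not prescribe a loader. Recent llama-cpp-python versions typically apply the Jinja chat template embedded in the GGUF metadata, matching the "Jinja Template" note above.

```python
# A minimal sketch, assuming llama-cpp-python and a hypothetical quant filename.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen2.5-MOE-6x1.5B-DeepSeek-Reasoning-e32-Q4_K_M.gguf",  # hypothetical quant name
    n_ctx=8192,  # 4K minimum, 8K+ suggested
    verbose=False,
)

def one_plot(n: int, temp: float = 0.41) -> str:
    # Ask for ONE plot per request rather than six at once, so the model's
    # "reasoning plan" stays small enough for it to actually finish the task.
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": f"Write ONE plot (#{n} of 6) for theme XYZ."}],
        temperature=temp,  # change temp slowly: .41, .42, .43 ...
        max_tokens=1024,
    )
    return out["choices"][0]["message"]["content"]

for n in range(1, 7):
    plot = one_plot(n)
    # If the reply drifts into Chinese characters (a known quirk of this model
    # with English prompts), regenerate to clear.
    if any("\u4e00" <= ch <= "\u9fff" for ch in plot):
        plot = one_plot(n)
    print(plot)
```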