Update README.md

README.md CHANGED

@@ -3,7 +3,7 @@ base_model:
 - deepseek-ai/DeepSeek-V3
 ---
 This is the first 4 layers of DeepSeek-V3 with GPTQ quantization style.
-- Layer 4's routed experts are quantized to 2-bit
+- All of Layer 4's routed experts (256 experts) are quantized to 2-bit
 - All other Linear layers are quantized to 4-bit (including MLA, dense-layer FFN, and the shared expert)
 
 To load and run this model:
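As a rough illustration of why the 2-bit/4-bit split above matters, here is a back-of-envelope estimate of the weight storage for one layer's routed experts. The dimensions are assumptions taken from DeepSeek-V3's published configuration (hidden size 7168, MoE intermediate size 2048, 256 routed experts), and the estimate ignores GPTQ scale/zero-point overhead; it is a sketch, not a measurement of this checkpoint.

```python
# Back-of-envelope memory estimate for the mixed 2-bit / 4-bit scheme.
# Dimensions below are assumptions from DeepSeek-V3's published config,
# used for illustration only.
HIDDEN = 7168       # hidden_size
MOE_INTER = 2048    # moe_intermediate_size
N_EXPERTS = 256     # routed experts per MoE layer

# Each routed expert is a gated MLP: gate, up, and down projections.
params_per_expert = 3 * HIDDEN * MOE_INTER
experts_params = N_EXPERTS * params_per_expert

def gib(n_params, bits):
    """Raw weight storage at the given bit width, ignoring scales/zeros."""
    return n_params * bits / 8 / 2**30

print(f"One layer's routed experts at 4-bit: {gib(experts_params, 4):.2f} GiB")
print(f"One layer's routed experts at 2-bit: {gib(experts_params, 2):.2f} GiB")
```

Under these assumptions, dropping the routed experts from 4-bit to 2-bit halves their footprint from about 5.25 GiB to about 2.63 GiB per MoE layer, which is why the experts (the bulk of the parameters) get the more aggressive width while MLA and the dense/shared paths stay at 4-bit.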