Update README.md

README.md CHANGED

@@ -3,7 +3,7 @@ base_model:
 - deepseek-ai/DeepSeek-V3
 ---
 This is the first 4 layers of DeepSeek-V3 with GPTQ quantization style.
-- Layer 4's routed experts are quantized to 2-bit
+- All of Layer 4's routed experts (256 experts) are quantized to 2-bit
 - All other Linear layers are quantized to 4-bit (including MLA, dense-layer FFN, and the shared expert)
 
 To load and run this model:
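As a rough illustration of why the 2-bit/4-bit split above matters, here is a back-of-envelope estimate of the weight storage for one layer's routed experts. The dimensions are assumptions taken from DeepSeek-V3's published configuration (hidden size 7168, MoE intermediate size 2048, 256 routed experts), and the estimate ignores GPTQ scale/zero-point overhead; it is a sketch, not a measurement of this checkpoint.

```python
# Back-of-envelope memory estimate for the mixed 2-bit / 4-bit scheme.
# Dimensions below are assumptions from DeepSeek-V3's published config,
# used for illustration only.
HIDDEN = 7168       # hidden_size
MOE_INTER = 2048    # moe_intermediate_size
N_EXPERTS = 256     # routed experts per MoE layer

# Each routed expert is a gated MLP: gate, up, and down projections.
params_per_expert = 3 * HIDDEN * MOE_INTER
experts_params = N_EXPERTS * params_per_expert

def gib(n_params, bits):
    """Raw weight storage at the given bit width, ignoring scales/zeros."""
    return n_params * bits / 8 / 2**30

print(f"One layer's routed experts at 4-bit: {gib(experts_params, 4):.2f} GiB")
print(f"One layer's routed experts at 2-bit: {gib(experts_params, 2):.2f} GiB")
```

Under these assumptions, dropping the routed experts from 4-bit to 2-bit halves their footprint from about 5.25 GiB to about 2.63 GiB per MoE layer, which is why the experts (the bulk of the parameters) get the more aggressive width while MLA and the dense/shared paths stay at 4-bit.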