Original model: https://huggingface.co/SakuraLLM/Sakura-13B-Qwen2beta-v0.9

4-bit AWQ quantization. Untested; not recommended for use.

GroupSize=64
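
As a rough illustration of what the group size means for storage, the sketch below estimates the effective bits per quantized weight. It assumes the common AWQ layout in which each group of weights shares one FP16 scale and one 4-bit zero point; that layout is an assumption and has not been checked against this checkpoint. The function name `effective_bits_per_weight` is made up for illustration.

```python
def effective_bits_per_weight(group_size: int,
                              weight_bits: int = 4,
                              scale_bits: int = 16,
                              zero_bits: int = 4) -> float:
    """Estimate storage cost per quantized weight, including per-group metadata.

    Assumption: every group of `group_size` weights shares one scale
    (`scale_bits`) and one zero point (`zero_bits`).
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    overhead = (scale_bits + zero_bits) / group_size
    return weight_bits + overhead


# Group size used by this quantization, compared with the often-used 128.
print(effective_bits_per_weight(64))
print(effective_bits_per_weight(128))
```

Under these assumptions, a smaller group size costs slightly more memory per weight in exchange for finer-grained scaling.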

Suitable for dual-GPU inference on Kaggle.
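
A minimal, untested sketch of how dual-GPU loading might look with vLLM is shown below. It assumes vLLM is installed, that the Kaggle session has two T4 GPUs (hence FP16, since T4 does not support BF16), and that `MODEL_PATH` points at a local copy of these weights. `MODEL_PATH`, `build_engine_args`, and `run_demo` are hypothetical names, and the prompt is a placeholder: the prompt template of the original model should be used instead.

```python
# Hypothetical location of the quantized weights; replace with the real path or repo id.
MODEL_PATH = "path/to/this-awq-model"


def build_engine_args(model_path: str, num_gpus: int = 2) -> dict:
    """Collect the vLLM engine arguments for AWQ weights split across GPUs."""
    return {
        "model": model_path,
        "quantization": "awq",             # weights are 4-bit AWQ
        "tensor_parallel_size": num_gpus,  # shard the model across both GPUs
        "dtype": "half",                   # FP16 activations (assumes T4 GPUs)
    }


def run_demo(prompt: str) -> str:
    """Load the model and generate one completion (requires vLLM and GPUs)."""
    # Imported lazily so the helper above can be used without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(**build_engine_args(MODEL_PATH))
    params = SamplingParams(temperature=0.1, max_tokens=256)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text
```

Calling `run_demo("<prompt in the original model's format>")` in a notebook with both GPUs enabled would then return the generated text, provided the assumptions above hold.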
