Original model: https://huggingface.co/SakuraLLM/Sakura-13B-Qwen2beta-v0.9
4-bit AWQ quantization. Untested; not recommended for use.
GroupSize=64
Intended for dual-GPU inference on Kaggle.
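
As a rough illustration of what the settings above imply for memory, the sketch below estimates the effective bits per weight and the weight footprint when split across two GPUs. It assumes each group of 64 weights stores one 16-bit scale and one 4-bit zero point, and it uses a nominal 13e9 parameter count taken from the model name; both are assumptions, not measured values. It also ignores unquantized layers, the KV cache, and activations, so real usage will be higher.

```python
def awq_bits_per_weight(w_bits=4, group_size=64, scale_bits=16, zero_bits=4):
    """Effective bits per weight, counting per-group scale and zero point.

    scale_bits and zero_bits are assumed storage sizes, not read from the checkpoint.
    """
    return w_bits + (scale_bits + zero_bits) / group_size


def weight_memory_gib(n_params, bits_per_weight):
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    n_params = 13e9  # nominal figure from the model name (assumption)
    n_gpus = 2       # dual-GPU setup mentioned above

    bpw = awq_bits_per_weight(w_bits=4, group_size=64)
    total = weight_memory_gib(n_params, bpw)

    print(f"bits per weight: {bpw}")
    print(f"weights, total: {total:.2f} GiB")
    print(f"weights, per GPU: {total / n_gpus:.2f} GiB")
```

A smaller group size stores more scales and zero points, so it raises this overhead slightly compared with the more common group size of 128.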