Midnight Rose 70B v2.0.3 - GPTQ
A second attempt at quantizing a pre-trained model with AutoGPTQ, targeting sophosympatheia/Midnight-Rose-70B-v2.0.3.
The base is a popular open-source model that scored highly on EQBench and was specifically designed for storytelling and roleplaying.
Quantization drops the model size from ~140 GB down to ~35 GB.
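For reference, the AutoGPTQ side of the job looks roughly like this. This is a minimal sketch, not the actual script from the PR linked below; the quantization parameters (bits, group_size, desc_act) and the inline calibration texts are illustrative assumptions:

```python
# Minimal AutoGPTQ sketch: 4-bit quantization of the base model.
# bits/group_size/desc_act are common defaults, not necessarily the
# exact settings used for this repo.
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

model_id = "sophosympatheia/Midnight-Rose-70B-v2.0.3"
out_dir = "Midnight-Rose-70B-v2.0.3-GPTQ"

quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit weights: roughly 140 GB fp16 -> 35 GB
    group_size=128,  # common GPTQ group size
    desc_act=True,   # activation-order quantization, usually better accuracy
)

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# Calibration samples; in practice these come from a real dataset
# (see the Notes section below), not a hardcoded list.
calibration_texts = [
    "Write a short story about a lighthouse keeper and a storm.",
]
examples = [
    {"input_ids": enc["input_ids"], "attention_mask": enc["attention_mask"]}
    for enc in (tokenizer(t, return_tensors="pt") for t in calibration_texts)
]

model.quantize(examples)

model.save_quantized(out_dir, use_safetensors=True)
tokenizer.save_pretrained(out_dir)
```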
Notes
First (naive) attempt, with notes on learnings, here: sambarnes/Midnight-Rose-70B-v2.0.3-GPTQ-naive
This time, I used the same VMware/open-instruct calibration dataset that I saw TheBloke use in some of his quantizations; a sketch of the dataset prep follows the quote below.
He describes its purpose here:
GPTQ dataset: The calibration dataset used during quantisation. Using a dataset more appropriate to the model's training can improve quantisation accuracy.
Note that the GPTQ calibration dataset is not the same as the dataset used to train the model - please refer to the original model repo for details of the training dataset(s).
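Preparing that calibration set looks roughly like this. A minimal sketch only: the column names ("instruction", "response"), the 128-sample count, and the 2048-token truncation are assumptions; check the dataset card for the actual schema:

```python
# Sketch: building GPTQ calibration examples from VMware/open-instruct.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "sophosympatheia/Midnight-Rose-70B-v2.0.3", use_fast=True
)

# A small random slice is enough for calibration; ~128 samples is typical.
dataset = load_dataset("VMware/open-instruct", split="train")
dataset = dataset.shuffle(seed=42).select(range(128))

examples = []
for row in dataset:
    # Field names are an assumption about the dataset schema.
    text = row["instruction"] + "\n" + row["response"]
    enc = tokenizer(text, truncation=True, max_length=2048, return_tensors="pt")
    examples.append(
        {"input_ids": enc["input_ids"], "attention_mask": enc["attention_mask"]}
    )
```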
Running on an H100 on modal.com, quantization took ~1.5 hours and cost roughly $10.
Code to perform the quantization is here: https://github.com/OpenRouterTeam/openrouter-runner/pull/79 (a rough sketch of the Modal setup is below).
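The Modal scaffolding is roughly this shape. A minimal sketch using Modal's current API; the image contents, timeout, and function body are assumptions, not the actual code from the PR:

```python
# Minimal Modal sketch: run the quantization job on a rented H100.
import modal

# Package list is an illustrative assumption.
image = modal.Image.debian_slim().pip_install(
    "auto-gptq", "transformers", "datasets", "accelerate"
)
app = modal.App("midnight-rose-gptq", image=image)


@app.function(gpu="H100", timeout=3 * 60 * 60)  # the run took ~1.5 hrs
def quantize():
    # Load the base model, build calibration examples, quantize, and
    # save/upload the result -- see the AutoGPTQ sketch above.
    ...


@app.local_entrypoint()
def main():
    quantize.remote()
```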
Comparison to naive version
I couldn't really tell a difference between the two, though that was just a brief vibe check. Decided to just ship this version on openrouter.ai.