This is Qwen/Qwen2.5-Coder-1.5B-Instruct finetuned on 100 system_prompt/question/answer samples generated by Qwen/QwQ-32B-Preview, taken from the gghfez/QwQ-LongCoT-130K-cleaned dataset.
It is intended to be used as a draft model for QwQ in speculative decoding.
Please use the following prompt format for the best results:

```
<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant. You should think step-by-step.<|im_end|>
<|im_start|>user
A unit has 200 employees. Now, 40 employees need to be selected as a sample using the systematic sampling method. All employees are randomly numbered from 1 to 200 and evenly divided into 40 groups according to their numbers in order (1-5, 6-10, ..., 196-200). If the number drawn from the 5th group is 23, then the number drawn from the 10th group should be.<|im_end|>
<|im_start|>assistant
```
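Incidentally, the sample question embedded in the prompt above has a simple closed-form answer under systematic sampling, which is handy for spot-checking the model's reasoning. A minimal sketch (the function name is illustrative, not part of the model or dataset):

```python
def drawn_number(known_group: int, known_draw: int, target_group: int,
                 interval: int = 5) -> int:
    """Systematic sampling: draws from consecutive groups differ by the
    sampling interval (here 200 employees / 40 groups = 5)."""
    return known_draw + (target_group - known_group) * interval

# Group 5's draw is 23, so group 10's draw is 23 + (10 - 5) * 5:
print(drawn_number(5, 23, 10))  # 48
```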
I recommend you do not change the system prompt.
Tested as a draft model for QwQ using the "EXTREME Logic Test: The Mysteries of the Seven Artifacts" from @code4AI (Discover AI on YouTube): https://www.youtube.com/watch?v=EWTWS1nghY0
| Model Name | Speculative Decoding | Predicted Tokens | Cached Tokens | Time per Token (ms) | Tokens per Second | Solutions Found | Time to Solution (s) |
|---|---|---|---|---|---|---|---|
| Baseline | No | 9120 | 9713 | 73 | 13.63 | 1 | 669.13 |
| qwen2.5-coder-7b-instruct-q4_0.gguf | Yes | 15349 | 15942 | 78 | 12.89 | 2 | 1190.46 |
| qwen2.5-coder-7b-instruct-q2_k.gguf | Yes | 9344 | 9937 | 80 | 12.44 | 1 | 750.96 |
| qwen2.5-coder-1.5b-instruct-q4_0.gguf | Yes | 8882 | 9475 | 69 | 14.49 | 2 | 613.01 |
| qwen2.5-coder-0.5b-instruct-q4_0.gguf | Yes | 9730 | 10323 | 71 | 14.04 | 2 | 692.95 |
| qwen-2.5-1.5b-finetuned-qwq-25.13.01.Q4_0.gguf | Yes | 5553 | 6146 | 65 | 15.36 | 1 | 361.52 |
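As a quick sanity check, the timing columns in the table are internally consistent: time to solution is roughly predicted tokens divided by tokens per second, and time per token is the inverse of tokens per second. The figures below are copied from the table; the check itself is illustrative, not part of the original benchmark:

```python
# Consistency check for the benchmark table (values copied from above).
rows = [
    # (model, predicted_tokens, ms_per_token, tokens_per_s, time_to_solution_s)
    ("Baseline", 9120, 73, 13.63, 669.13),
    ("coder-7b q4_0", 15349, 78, 12.89, 1190.46),
    ("coder-7b q2_k", 9344, 80, 12.44, 750.96),
    ("coder-1.5b q4_0", 8882, 69, 14.49, 613.01),
    ("coder-0.5b q4_0", 9730, 71, 14.04, 692.95),
    ("finetuned-qwq 1.5b", 5553, 65, 15.36, 361.52),
]

for name, predicted, ms_per_tok, tok_per_s, total_s in rows:
    # time to solution ~= tokens generated / generation speed
    assert abs(predicted / tok_per_s - total_s) < 1.0, name
    # ms per token is the inverse of tokens per second
    assert round(1000 / tok_per_s) == ms_per_tok, name

# Wall-clock gain of the finetuned draft over the no-draft baseline:
speedup = rows[0][-1] / rows[-1][-1]
print(f"speedup over baseline: {speedup:.2f}x")  # ~1.85x
```

Note that the finetuned draft run also produced far fewer predicted tokens (5553 vs. 9120 for the baseline), so the shorter time to solution comes from both higher throughput and a shorter generation.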
The test system was an M1 MacBook Pro with 64 GB of RAM, running llama.cpp build b4481.
Settings:
- Predictions: -1
- Temperature: 0.7
- Penalize repeat sequence: 1
- Consider N tokens for penalize: 256
- Top-K sampling: 40
- Top-P sampling: 0.95
- Min-P sampling: 0.05
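For reference, these settings map onto llama.cpp command-line flags roughly as follows. This is a sketch, not the exact command used for the benchmark, and the GGUF filenames are illustrative placeholders; substitute your local paths:

```shell
# Sketch: llama-server with this model as the draft for speculative decoding.
# -m  = main (target) model, -md = draft model; filenames are assumptions.
./llama-server \
  -m QwQ-32B-Preview-Q4_K_M.gguf \
  -md qwen-2.5-1.5b-finetuned-qwq-25.13.01.Q4_0.gguf \
  -n -1 --temp 0.7 \
  --repeat-penalty 1.0 --repeat-last-n 256 \
  --top-k 40 --top-p 0.95 --min-p 0.05
```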
Model tree for gbueno86/Qwen2.5-Coder-1.5B-Instruct-finetuned-qwq-Q4_0.gguf:
- Base model: Qwen/Qwen2.5-1.5B
- Finetuned: Qwen/Qwen2.5-Coder-1.5B
- Finetuned: Qwen/Qwen2.5-Coder-1.5B-Instruct