This is a great model
#1 opened by FrenzyBiscuit
It works great once you start using Qwenception.
Can you do a 1.5B version for speculative decoding?
Actually, it would be cool to see a 14B version as well.
Thanks! I'll probably switch gears to 1.5B first (since experiments will be cheaper), then come back to 14B. :)