please Q3 or Q2 under 10GB

#1
by ryg81 - opened

Could you please upload a Q3 or Q2 quant under 10 GB?
Also, does this work with Ollama?

QuantStack org

I'm not sure about Ollama, and LM Studio seems to have issues. For now, the only thing confirmed to work is llama.cpp.

QuantStack org

Also, I will upload Q3, no issues with that. Just as a tip: if you have spare system RAM, you can offload parts of the model to the CPU, and it will still be really fast (;
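To make the offloading tip a bit more concrete, here is a rough sketch of how one might guess a starting value for llama.cpp's `-ngl` (`--n-gpu-layers`) option, which sets how many layers go to the GPU while the rest stay in system RAM. The helper names, the 13 GB file size, the 40-layer count, the 1.5 GB reserve, and `model.gguf` are all made-up placeholders, not values for this model, and the equal-sized-layers assumption is only an approximation, so treat the result as a first guess to tune by hand.

```python
import math


def layers_that_fit(model_size_gb, n_layers, vram_gb, reserve_gb=1.5):
    """Estimate how many layers fit in VRAM.

    Assumption: every layer is roughly the same size, and reserve_gb
    is set aside for the KV cache and scratch buffers (a guess).
    """
    per_layer_gb = model_size_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, math.floor(usable_gb / per_layer_gb))


def llama_cli_command(model_path, n_gpu_layers):
    """Build a llama.cpp command line using the -m and -ngl options."""
    return ["llama-cli", "-m", model_path, "-ngl", str(n_gpu_layers)]


if __name__ == "__main__":
    # Hypothetical numbers: a 13 GB GGUF with 40 layers on a 10 GB card.
    n = layers_that_fit(model_size_gb=13.0, n_layers=40, vram_gb=10.0)
    print(" ".join(llama_cli_command("model.gguf", n)))
```

If generation runs out of VRAM with the estimated value, lowering `-ngl` by a few layers is usually the simplest adjustment.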
