please Q3 or Q2 under 10GB

by ryg81 - opened 5 days ago

Discussion

ryg81

5 days ago

Please upload a Q3 or Q2 quant under 10GB.
Does this work with Ollama?

wsbagnsv1

QuantStack org 5 days ago

I'm not sure about Ollama, and LM Studio seems to have issues. For now, the only thing confirmed to work is llama.cpp.

wsbagnsv1

QuantStack org 5 days ago

Also, I will upload Q3, no issues with that. Just as a tip: if you have spare system RAM, you can offload parts of the model to the CPU, and it will still be really fast (;
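A rough sketch of that offloading tip, in case it helps: it estimates how many layers could stay on the GPU and builds a llama.cpp command with `-ngl` (the number of layers kept on the GPU; the rest run on the CPU from system RAM). The model size, layer count, reserve, and file name here are made-up placeholders, and the equal-size-per-layer assumption is only an approximation, so treat the result as a starting point to tune.

```python
def gpu_layers_that_fit(model_size_gb, n_layers, vram_gb, reserve_gb=1.5):
    """Estimate how many layers fit in VRAM.

    Assumption: every layer takes the same share of the file size, and
    reserve_gb is set aside for context/KV cache and other overhead.
    """
    if model_size_gb <= 0 or n_layers <= 0:
        raise ValueError("model_size_gb and n_layers must be positive")
    per_layer_gb = model_size_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb // per_layer_gb))


def llama_cli_args(model_path, n_gpu_layers):
    """Build a llama.cpp command line; -ngl sets the GPU layer count."""
    return ["llama-cli", "-m", model_path, "-ngl", str(n_gpu_layers)]


if __name__ == "__main__":
    # Hypothetical numbers: a 12 GB quant with 48 layers on a 10 GB card.
    layers = gpu_layers_that_fit(12.0, 48, 10.0, reserve_gb=2.0)
    print(" ".join(llama_cli_args("model-Q4_K_M.gguf", layers)))
```

If the model still runs out of VRAM at the estimated value, lowering `-ngl` a few layers at a time is the usual fix.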
