Transformers
GGUF
mistral

llama.cpp's server crashes when the input is long

#3
by Mihaiii - opened

I'm using the Q8 version. When I give it an input of ~1k tokens, the server crashes and the response is always an empty string.

Something is off. Could anyone please confirm the issue?
