The model has a tendency to hallucinate for a long time

#1
by Cleanable - opened

Hello,
I am using the fp16 version under koboldcpp.
Sometimes, after answering a question, it starts printing code that has nothing to do with the conversation, and it keeps doing so for a very long time.
What could be wrong, please?
Thank you

DevQuasar org
edited Oct 2

I have tested this model intensively. The quantized versions are made with llama.cpp; I'm not sure whether the inference code could cause something like that. I'd try it with llama.cpp, and maybe look at the Q8 and Q6 versions too.
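In case it helps, here is a minimal sketch of how you could reproduce the issue outside koboldcpp using the llama-cpp-python bindings. The GGUF filename, context size, and prompt below are just placeholders, not the actual repo files or settings:

```python
# Minimal repro sketch with llama-cpp-python (pip install llama-cpp-python).
# The model path below is a placeholder -- point it at the Q8_0 or Q6_K file you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="model-Q8_0.gguf",  # placeholder path to the quantized GGUF
    n_ctx=4096,                    # context window; adjust to the model's limit
)

out = llm(
    "Q: What is the capital of France?\nA:",
    max_tokens=256,                # cap generation so a runaway completion stops
    stop=["\nQ:"],                 # stop sequence so the turn ends cleanly
)
print(out["choices"][0]["text"])
```

If the same prompt rambles into unrelated code here as well, the issue is likely with the model or the prompt/stop settings rather than with koboldcpp.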
