Error with built-in Web UI
Using the recommended command line that includes `--jinja`, issuing a second user input in the built-in Web UI results in a long error message that starts with "You have passed a message containing <|channel|> tags in the content field. Instead of doing this, you should pass analysis..."
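For reference, the command line I'm using is roughly this shape (the model path is just a placeholder, not the exact recommended invocation):

```
# placeholder model path; the relevant part is --jinja
llama-server -m ./gpt-oss-20b.gguf --jinja
```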
I have the same problem. This began after this model was updated on this repo 7 hours ago. It worked fine before. It was driving me crazy: I reset all my settings, I updated llama.cpp to the latest version, I even reset the settings to the defaults and deleted the local storage and the IndexedDB. The issue persisted until I removed `--jinja` and `--reasoning-format none`.
This happened when I restarted my server; on the next startup it overwrote the previously cached model with this new version, which has issues. Now the question is: how can I download the previous version of this model, the one that worked fine?
If I don't use `--jinja` and `--reasoning-format none` it works, but without `--jinja` it ignores the `model_identity` and `reasoning_effort` kwargs from `--chat_template_kwargs`. With just `--jinja` but without `--reasoning-format none` it also seems to work and processes the kwargs properly, but you can no longer see the reasoning in the UI, and there are some other rendering issues: part of the tags between the reasoning and the reply to the user are shown before the reply.
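For clarity, this is roughly how I'm passing the kwargs (flag spelling as I typed it; the model path, identity string, and effort value are just examples, not the repo's recommended settings):

```
# sketch of my invocation; the path and the JSON values are example placeholders
llama-server -m ./gpt-oss-20b.gguf --jinja --reasoning-format none \
  --chat_template_kwargs '{"model_identity": "You are a helpful assistant.", "reasoning_effort": "high"}'
```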
Later edit: I think I figured it out; I'm downloading the files from the previous commit now.
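In case it helps anyone else, this is roughly how I'm grabbing the older revision (the repo id and commit hash are placeholders; substitute the actual ones for this repo):

```
# <repo_id> and <commit_sha> are placeholders for this repo and its previous commit
huggingface-cli download <repo_id> --revision <commit_sha> --include "*.gguf" --local-dir ./previous
```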
By the way, to reproduce the issue with the new version it's not enough to send a single user message to the model; the first message and the first model reply work fine. The error occurs when the model generates the reply to the second user message in a session, apparently as it begins to stream the reasoning for that second reply. [I just noticed this has already been mentioned, sorry, I didn't notice it initially.]
Even later edit: confirmed, with the version from the previous commit it works fine. I'm not sure this is the best place to report issues, as I don't see replies here to other questions. I'll try to report this on GitHub as well tomorrow, if somebody hasn't done that already.
They seem to be already aware: https://github.com/ggml-org/llama.cpp/pull/15181#issuecomment-3173392169