Banned tokens list
Loving the banned tokens list; I literally created an account just so I could comment about an issue I'm having, though. I'm not sure if I just wasn't perceptive enough, but on the 2 latest versions of it, the words "of" and "or" are sometimes missing or swapped, making some sentences goofy. I swear on the first version (what I assume was the first version) this didn't happen. I've been using AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS-v2.i1-Q6_K and haven't tried many others recently, so maybe it's an LLM issue. All of this on KoboldCPP, of course. I did try deleting all banned tokens and generating a few messages, which all seemed fine, with "of" and "or" used correctly and not missing.
Edit: yeah, looking through the list and a message I just generated, the "mix of" or "mix" series of banned tokens seems to be messed up or something, since the message I generated had that prose in it, but it was just "a mix" with no "of". Same with "a hint of", where the "of" was cut off as well, and that one isn't even on the list. I swear this wasn't happening before, but maybe I just wasn't perceptive enough.
Here's a chunk of the message:
"a mix emotions swirling in her chest - vulnerability, hope, and a strange, unexplainable connection to this intriguing stranger."
"to catch a hint her soft, sweet scent."
Looking at the 1st chunk of prose honestly irks me. 25 tokens of nothing. Blech.
Edit: "has drawn a lot attention"
"making me feel all kinds or things."
"the back on their legs hit the"
how?? Apparently "on" also has an issue? I cri.
I wonder if there are too many instances of "of" in the list. I have no clue how this stuff works though, so.
Edit: tried deleting a lot of instances of "of" and it still happened, so I dunno.
Edit: Forget what I said below, try the new version. I think I figured out the problem.
I copied the list from here to my setup and it started happening too.
Reviewing the commits, I pushed `"waves of arousal` without the closing double quote, instead of `"waves of arousal"`. So `"waves`, `of` and `arousal` were banned.
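Here is a rough sketch of how one missing quote can turn into three separate bans. It is not the actual KoboldCPP or SillyTavern parsing code. `classify_entry` and `find_unbalanced` are made-up names, and splitting on whitespace is only a stand-in for the real tokenizer. The assumption is that a fully quoted line is banned as one phrase, while anything else is banned piece by piece.

```python
def classify_entry(line: str):
    """Hypothetical stand-in for how a frontend might read one list entry."""
    entry = line.strip()
    if len(entry) >= 2 and entry.startswith('"') and entry.endswith('"'):
        # Fully quoted: the whole phrase is banned as a single string.
        return ("string", [entry[1:-1]])
    # Not fully quoted: every piece is banned on its own.
    # Whitespace splitting stands in for the real tokenizer (assumption).
    return ("tokens", entry.split())


def find_unbalanced(lines):
    """Return (line_number, text) for entries with an odd number of double quotes."""
    return [
        (number, text.strip())
        for number, text in enumerate(lines, start=1)
        if text.count('"') % 2 == 1
    ]


if __name__ == "__main__":
    print(classify_entry('"waves of arousal"'))
    print(classify_entry('"waves of arousal'))
    print(find_unbalanced(['"mix of"', '"waves of arousal']))
```

Running something like `find_unbalanced` over the list before pushing would flag this kind of entry, since a lone quote mark leaves the line with an odd quote count.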
Thanks for reporting this, it wasn't like that on my own settings, so I don't know if I would have even noticed it.
If you are on KoboldCPP, and updated, the instances of `of` can't be the problem, because there is no ban on `of` alone or at the start of a sentence.
You removed the warning at the beginning of the file? Everything over the `---` I mean. That would ban the `of` token.
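If it helps, this is a minimal sketch of dropping that header before pasting the list. It assumes the file is a block of warning text followed by a line containing only `---`, with the actual entries below it. `strip_header` is a hypothetical helper name, not part of any tool.

```python
def strip_header(text: str, separator: str = "---") -> str:
    """Return only the part of the list below the first separator line."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == separator:
            # Keep everything after the separator; the warning above it is dropped.
            return "\n".join(lines[index + 1:])
    # No separator found: assume there was no header and return the text unchanged.
    return text


if __name__ == "__main__":
    sample = 'Warning: remove this part before use\n---\n"mix of"\n"waves of arousal"'
    print(strip_header(sample))
```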
In this sentence, for example: `to catch a hint her soft, sweet scent.`, there is no ban of `a hint of` or of `of her`.
You could create an empty character and send something like `Generate for me 20 phrases with "of"` to see if something is blocking the model from using it.
Another test would be removing my list completely and keeping the session going, to see whether the culprit is actually a sampler, mainly an anti-repetition one, that is too aggressive.
Are you using mradermacher's quants? Can you share your samplers so I can try to replicate it?
In the meantime, you could try an older version to see if it really is an issue with some update. HuggingFace keeps a history of the previous versions of every file:
https://huggingface.co/Sukino/SillyTavern-Settings-and-Presets/commit/a4d6ed0192598b078d206c0ff061d2ca0b29cc3b
Yep, the missing quotation was it. I'm seeing many instances of "of" now, correctly used and whatnot. Thanks for taking a look and fixing it!