What does the tokenization for fill-in-the-middle requests look like?

#5
by XeIaso - opened

I'm looking at messing around with the fill-in-the-middle support for codestral but I can't figure out what I'd do to use it. I see that there's a FIMRequest class, but I want to know what tokens I should use for it with llama.cpp.

Thanks for making these models! They're a lot of fun to use personally and professionally.

Based on mistralai/mistral-common/tokens/tokenizers/sentencepiece.py#L335 and mistralai/mistral-common/tokens/tokenizers/base.py#L10, the prompt should look like

```
<s>[SUFFIX]suffix_code[PREFIX]prefix_code
```

with the EOS token `</s>` as the stopping condition. However, I also see a `[MIDDLE]` token which isn't used, so maybe I'm forgetting something?

My expectation is that `[MIDDLE]` is emitted as the last prompt token, right before the generated response.
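A minimal sketch of how such a prompt could be assembled as a plain string (assuming the literal token strings `[SUFFIX]` and `[PREFIX]` map to the tokenizer's control tokens, and that `<s>` is added automatically by llama.cpp's tokenizer; the function name and sample code are illustrative, not from any library):

```python
def build_fim_prompt(prefix_code: str, suffix_code: str) -> str:
    # Assemble a fill-in-the-middle prompt in suffix-first order,
    # mirroring the layout seen in mistral-common's SentencePiece tokenizer.
    # The BOS token <s> is omitted here on the assumption that the
    # tokenizer prepends it.
    return f"[SUFFIX]{suffix_code}[PREFIX]{prefix_code}"

# Example: ask the model to fill in the body of a function.
prompt = build_fim_prompt(
    prefix_code="def add(a, b):\n    return ",
    suffix_code="\n\nprint(add(1, 2))\n",
)
```

The model would then generate the missing middle until it emits the stop token.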
