Is the tokenizer_config.json file ok?
#4 · opened by jamesthesnake
Is anyone else having trouble loading this model?
The key issue seems to be that the Kevin-32B model declares a vocabulary size of 152,064, while the tokenizer only defines 151,646 tokens. The tokenizer's special tokens go up to ID 151,668, but the model expects a vocabulary size of 152,064, so there is a gap of 395 unused IDs (152064 - 151668 - 1).
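For reference, here is a minimal check that compares what the tokenizer actually provides with what the model config declares. The repo id below is an assumption on my part; adjust it to wherever you are loading Kevin-32B from:

```python
from transformers import AutoConfig, AutoTokenizer

# Assumed repo id for illustration; replace with the actual Kevin-32B checkpoint path.
model_id = "cognition-ai/Kevin-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
config = AutoConfig.from_pretrained(model_id)

print("len(tokenizer):       ", len(tokenizer))                  # tokens the tokenizer defines
print("max special token id: ", max(tokenizer.all_special_ids))  # highest ID actually in use
print("config.vocab_size:    ", config.vocab_size)               # embedding rows the model allocates
print("unused trailing ids:  ", config.vocab_size - max(tokenizer.all_special_ids) - 1)
```

For what it's worth, a config `vocab_size` larger than the tokenizer's last ID seems to be common in Qwen-derived checkpoints, where the embedding matrix is padded for efficiency, so the gap by itself may not be what breaks loading.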