Is the tokenizer_config.json file ok?
#4 · opened by jamesthesnake
Is anyone else having trouble loading this model?
The key issue seems to be that the Kevin-32B model declares a vocabulary size of 152,064, while the tokenizer only defines 151,646 tokens. The tokenizer's special tokens go up to ID 151,668, but the model expects a vocabulary size of 152,064, so there is a gap of 395 unused IDs (152064 - 151668 - 1).
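For reference, here is a minimal check that compares what the tokenizer actually provides with what the model config declares. The repo id below is an assumption on my part; adjust it to wherever you are loading Kevin-32B from:

```python
from transformers import AutoConfig, AutoTokenizer

# Assumed repo id for illustration; replace with the actual Kevin-32B checkpoint path.
model_id = "cognition-ai/Kevin-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
config = AutoConfig.from_pretrained(model_id)

print("len(tokenizer):       ", len(tokenizer))                  # tokens the tokenizer defines
print("max special token id: ", max(tokenizer.all_special_ids))  # highest ID actually in use
print("config.vocab_size:    ", config.vocab_size)               # embedding rows the model allocates
print("unused trailing ids:  ", config.vocab_size - max(tokenizer.all_special_ids) - 1)
```

For what it's worth, a config `vocab_size` larger than the tokenizer's last ID seems to be common in Qwen-derived checkpoints, where the embedding matrix is padded for efficiency, so the gap by itself may not be what breaks loading.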