Query about `model_max_length` configuration
#4 opened by vm7608
Hello, and thank you for this fantastic model.
I have a quick question about the configuration. I noticed that `config.json` sets `max_position_embeddings` to 131072, but `tokenizer_config.json` has `model_max_length` set to 16384.
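For what it's worth, here is a minimal sketch of how I'm comparing the two values (the model ID is just a placeholder, not the actual repository name):

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "org/model-name"  # placeholder for this repository

config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

print(config.max_position_embeddings)  # 131072, from config.json
print(tokenizer.model_max_length)      # 16384, from tokenizer_config.json
```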
This causes a confusing situation when serving the model with vLLM. The startup log correctly shows that the engine is using the full context:

```
INFO ... Using max model len 131072
```

However, the API server still throws a validation warning based on the tokenizer's setting for any input over 16k tokens:

```
Token indices sequence length is longer than the specified maximum sequence length for this model (38682 > 16384).
```
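As a possible workaround on my side, I assume I could just override the tokenizer limit at load time (sketch below, model ID again a placeholder), but I'd rather know the intended value before doing that:

```python
from transformers import AutoTokenizer

model_id = "org/model-name"  # placeholder for this repository

# Override model_max_length so the tokenizer's length check matches
# max_position_embeddings; this should silence the >16384 warning.
tokenizer = AutoTokenizer.from_pretrained(model_id, model_max_length=131072)
print(tokenizer.model_max_length)  # 131072
```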
Is this discrepancy intentional?
Anyway, thanks again for all your hard work on this release!