Update default tokenization behavior to "longest" in README
#2
by MichaelR207 - opened
When you use the default code in the README, the tokenizer padding is set to "max_length". This causes an OOM error even on an H100, because every sequence is padded to 131072 tokens, the maximum context length for Llama 3.2 3B. A much more reasonable behavior is to pad only to the length of the longest sequence in the batch, which is accomplished by setting "padding": "longest" in the tokenizer kwargs.
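For reference, a minimal sketch of the change, assuming the README tokenizes with standard `transformers` kwargs (the model ID below is a placeholder for illustration, not necessarily this repo's checkpoint):

```python
from transformers import AutoTokenizer

# Placeholder model ID; substitute the checkpoint used in this repo's README.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B")

texts = ["How do I sort a list in Python?", "Explain attention in one sentence."]

# Before: "padding": "max_length" pads every sequence to the model's full
# 131072-token context window, which exhausts memory even on an H100.
# After: "longest" pads only to the longest sequence in the batch.
tokenizer_kwargs = {"padding": "longest", "truncation": True, "return_tensors": "pt"}

inputs = tokenizer(texts, **tokenizer_kwargs)
print(inputs["input_ids"].shape)  # (2, longest_batch_length), not (2, 131072)
```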
Thank you!
Ray2333 changed pull request status to merged