Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
Paper • 2407.13623 • Published • 56
Increase your vocabulary size when you scale up your language model
Predict optimal vocabulary size based on model parameters