Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity • Paper • 2502.13063 • Published 24 days ago • 67
You Do Not Fully Utilize Transformer's Representation Capacity • Paper • 2502.09245 • Published 29 days ago • 34
FoNE: Precise Single-Token Number Embeddings via Fourier Features • Paper • 2502.09741 • Published 28 days ago • 11
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding • Paper • 2502.08946 • Published 29 days ago • 184