- Uncovering mesa-optimization algorithms in Transformers
  Paper • 2309.05858 • Published • 12
- ProPainter: Improving Propagation and Transformer for Video Inpainting
  Paper • 2309.03897 • Published • 27
- Approximating Two-Layer Feedforward Networks for Efficient Transformers
  Paper • 2310.10837 • Published • 11
- CLEX: Continuous Length Extrapolation for Large Language Models
  Paper • 2310.16450 • Published • 10
Collections
Collections including paper arxiv:2406.09308