-
TRAMS: Training-free Memory Selection for Long-range Language Modeling
Paper • 2310.15494 • Published • 2 -
A Long Way to Go: Investigating Length Correlations in RLHF
Paper • 2310.03716 • Published • 10 -
YaRN: Efficient Context Window Extension of Large Language Models
Paper • 2309.00071 • Published • 68 -
Giraffe: Adventures in Expanding Context Lengths in LLMs
Paper • 2308.10882 • Published • 1
Collections
Discover the best community collections!
Collections including paper arxiv:2309.12307
-
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • 2309.12307 • Published • 88 -
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 85 -
Language Modeling Is Compression
Paper • 2309.10668 • Published • 83
-
Efficient Streaming Language Models with Attention Sinks
Paper • 2309.17453 • Published • 13 -
Effective Long-Context Scaling of Foundation Models
Paper • 2309.16039 • Published • 30 -
allenai/longformer-base-4096
Updated • 5.88M • 189 -
google/bigbird-roberta-base
Updated • 37.5k • 52
-
Contrastive Decoding Improves Reasoning in Large Language Models
Paper • 2309.09117 • Published • 39 -
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • 2309.12307 • Published • 88 -
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • 2307.09288 • Published • 244 -
Efficient Memory Management for Large Language Model Serving with PagedAttention
Paper • 2309.06180 • Published • 25