Collections
Collections including paper arxiv:2307.08621
- Efficient Streaming Language Models with Attention Sinks
  Paper • 2309.17453 • Published • 13
- Simple and Controllable Music Generation
  Paper • 2306.05284 • Published • 149
- FinGPT: Large Generative Models for a Small Language
  Paper • 2311.05640 • Published • 32
- MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
  Paper • 2305.07185 • Published • 9

- The Impact of Depth and Width on Transformer Language Model Generalization
  Paper • 2310.19956 • Published • 10
- Retentive Network: A Successor to Transformer for Large Language Models
  Paper • 2307.08621 • Published • 170
- RWKV: Reinventing RNNs for the Transformer Era
  Paper • 2305.13048 • Published • 17
- Attention Is All You Need
  Paper • 1706.03762 • Published • 55

- Detecting Pretraining Data from Large Language Models
  Paper • 2310.16789 • Published • 11
- Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
  Paper • 2310.13671 • Published • 19
- AutoMix: Automatically Mixing Language Models
  Paper • 2310.12963 • Published • 14
- An Emulator for Fine-Tuning Large Language Models using Small Language Models
  Paper • 2310.12962 • Published • 13

- Retentive Network: A Successor to Transformer for Large Language Models
  Paper • 2307.08621 • Published • 170
- LLM4SR: A Survey on Large Language Models for Scientific Research
  Paper • 2501.04306 • Published • 35
- Agent Laboratory: Using LLM Agents as Research Assistants
  Paper • 2501.04227 • Published • 86
- On the Measure of Intelligence
  Paper • 1911.01547 • Published • 3