- Partially Rewriting a Transformer in Natural Language
  Paper • 2501.18838 • Published • 1
- AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders
  Paper • 2501.17148 • Published • 1
- Sparse Autoencoders Trained on the Same Data Learn Different Features
  Paper • 2501.16615 • Published • 1
- Open Problems in Mechanistic Interpretability
  Paper • 2501.16496 • Published • 16
Collections including paper arxiv:2405.12250
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 43
- Qwen Technical Report
  Paper • 2309.16609 • Published • 35
- GPT-4 Technical Report
  Paper • 2303.08774 • Published • 5
- Gemini: A Family of Highly Capable Multimodal Models
  Paper • 2312.11805 • Published • 44
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 83
- Small-scale proxies for large-scale Transformer training instabilities
  Paper • 2309.14322 • Published • 20
- Evaluating Cognitive Maps and Planning in Large Language Models with CogEval
  Paper • 2309.15129 • Published • 6
- Vision Transformers Need Registers
  Paper • 2309.16588 • Published • 78