- Sparse Autoencoders Find Highly Interpretable Features in Language Models
  Paper • 2309.08600 • Published • 15
- Scaling and evaluating sparse autoencoders
  Paper • 2406.04093 • Published • 3
- Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
  Paper • 2403.19647 • Published • 3
- Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
  Paper • 2408.05147 • Published • 39
Collections including paper arxiv:2309.08600
- JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
  Paper • 2310.00535 • Published • 2
- Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
  Paper • 2211.00593 • Published • 2
- Rethinking Interpretability in the Era of Large Language Models
  Paper • 2402.01761 • Published • 23
- Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla
  Paper • 2307.09458 • Published • 11

- Sparse Autoencoders Find Highly Interpretable Features in Language Models
  Paper • 2309.08600 • Published • 15
- In-context Autoencoder for Context Compression in a Large Language Model
  Paper • 2307.06945 • Published • 28
- Self-slimmed Vision Transformer
  Paper • 2111.12624 • Published • 1
- MEMORY-VQ: Compression for Tractable Internet-Scale Memory
  Paper • 2308.14903 • Published • 1

- AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
  Paper • 2309.16414 • Published • 19
- Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
  Paper • 2309.13018 • Published • 9
- Robust Speech Recognition via Large-Scale Weak Supervision
  Paper • 2212.04356 • Published • 27
- Language models in molecular discovery
  Paper • 2309.16235 • Published • 10

- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 83
- Baichuan 2: Open Large-scale Language Models
  Paper • 2309.10305 • Published • 20
- Chain-of-Verification Reduces Hallucination in Large Language Models
  Paper • 2309.11495 • Published • 38
- LMDX: Language Model-based Document Information Extraction and Localization
  Paper • 2309.10952 • Published • 65