- Enhancing Automated Interpretability with Output-Centric Feature Descriptions
  Paper • 2501.08319 • Published • 10
- Open Problems in Machine Unlearning for AI Safety
  Paper • 2501.04952 • Published • 1
- Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models
  Paper • 2412.16247 • Published • 1
- Inferring Functionality of Attention Heads from their Parameters
  Paper • 2412.11965 • Published • 2
Collections
Collections including paper arxiv:2405.12250

- Selective Attention Improves Transformer
  Paper • 2410.02703 • Published • 24
- Differential Transformer
  Paper • 2410.05258 • Published • 169
- TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
  Paper • 2410.05076 • Published • 8
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
  Paper • 2410.13276 • Published • 26

- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
  Paper • 2312.00752 • Published • 140
- Elucidating the Design Space of Diffusion-Based Generative Models
  Paper • 2206.00364 • Published • 15
- GLU Variants Improve Transformer
  Paper • 2002.05202 • Published • 2
- StarCoder 2 and The Stack v2: The Next Generation
  Paper • 2402.19173 • Published • 137

- Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
  Paper • 2405.21060 • Published • 64
- Your Transformer is Secretly Linear
  Paper • 2405.12250 • Published • 151
- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 109
- Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography
  Paper • 2501.08970 • Published • 6

- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 67
- Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
  Paper • 2406.06469 • Published • 25
- Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
  Paper • 2406.04271 • Published • 29
- Block Transformer: Global-to-Local Language Modeling for Fast Inference
  Paper • 2406.02657 • Published • 38

- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 62
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 72
- Your Transformer is Secretly Linear
  Paper • 2405.12250 • Published • 151
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 62