LLM Architecture - a JM-Brun Collection

LLM Architecture

updated 2 days ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 24 days ago • 273

Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published 7 days ago • 19

FastKV: KV Cache Compression for Fast Long-Context Processing with Token-Selective Propagation

Paper • 2502.01068 • Published 4 days ago • 14

Scaling Embedding Layers in Language Models

Paper • 2502.01637 • Published 4 days ago • 16