Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published 9 days ago • 134
Running on Zero • 402 • Chat with DeepSeek-VL2-small • Generate responses using images and text input
mistralai/Mistral-Small-24B-Instruct-2501 Text Generation • Updated 23 days ago • 755k • 817