Update README.md
README.md
CHANGED
@@ -7,12 +7,11 @@ datasets:
 
 # SparseModernBERT α=1.5 Model Card
 
-Models from AdaSplash. Check the original codebase [here](https://github.com/deep-spin/SparseModernBERT).
-
-
 ## Model Overview
 
-SparseModernBERT
+SparseModernBERT-alpha1.5 is a masked language model based on [ModernBERT](https://github.com/AnswerDotAI/ModernBERT) that replaces the standard softmax attention with an adaptive sparse attention mechanism (AdaSplash) using Triton.
+
+The sparsity parameter α = 1.5 yields moderately sparse attention patterns, improving efficiency while maintaining performance.
 
 **Key features:**
 
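To illustrate what "moderately sparse" means for the added overview text, here is a minimal NumPy sketch. It assumes that α is the parameter of the α-entmax transformation, which this diff does not state explicitly. The function names `entmax_bisect` and `softmax` are illustrative only; this is not the AdaSplash Triton kernel, just a plain bisection reference on a single score vector.

```python
import numpy as np


def softmax(z):
    """Standard softmax: every output entry is strictly positive."""
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max())
    return e / e.sum()


def entmax_bisect(z, alpha=1.5, n_iter=50):
    """alpha-entmax of a 1-D score vector (alpha > 1), by bisection.

    Solves for the threshold tau such that
        p_i = max((alpha - 1) * z_i - tau, 0) ** (1 / (alpha - 1))
    sums to 1. Scores below the threshold get exactly zero probability.
    """
    z = np.asarray(z, dtype=np.float64) * (alpha - 1.0)
    d = z.shape[0]
    lo = z.max() - 1.0                          # here sum(p) >= 1
    hi = z.max() - (1.0 / d) ** (alpha - 1.0)   # here sum(p) <= 1
    for _ in range(n_iter):
        tau = 0.5 * (lo + hi)
        p = np.clip(z - tau, 0.0, None) ** (1.0 / (alpha - 1.0))
        if p.sum() >= 1.0:
            lo = tau
        else:
            hi = tau
    p = np.clip(z - lo, 0.0, None) ** (1.0 / (alpha - 1.0))
    return p / p.sum()  # remove the small residual bisection error


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.1, -1.0, -3.0]  # made-up attention scores
    print("softmax    :", softmax(scores))
    print("1.5-entmax :", entmax_bisect(scores, alpha=1.5))
```

On these made-up scores, softmax assigns nonzero weight to all five positions, while the α = 1.5 output keeps only the two highest-scoring positions and sets the rest to exactly zero. Under the stated assumption, those exact zeros are what a sparse attention kernel can skip.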