orionweller committed · Commit aaac719 · verified · 1 Parent(s): 4e493ae

Update README.md

Files changed (1)
  1. README.md +41 -3
README.md CHANGED
@@ -1,3 +1,41 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - en
+ ---
+
+ # Ettin Checkpoints
+
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![Paper](https://img.shields.io/badge/Paper-Arxiv-red)](https://arxiv.org/abs/2507.11412)
+ [![Models](https://img.shields.io/badge/🤗%20Hugging%20Face-12%20Models-blue)](https://huggingface.co/jhu-clsp)
+ [![GitHub](https://img.shields.io/badge/GitHub-Code-black)](https://github.com/jhu-clsp/ettin-encoder-vs-decoder)
+
+ This repository contains the raw training checkpoints for the Ettin models. Each model has its own subdirectory (e.g. `enc-150m` for Ettin-Encoder-150m) with three subfolders: `decay`, `ext`, and `pretrain`.
+
+ These files work with Composer and contain all the state needed to resume pre-training. Please see the [ModernBERT repository](https://github.com/AnswerDotAI/ModernBERT) for usage details.
+
+
+ ## 🔗 Related Resources
+
+ - **Models**: [Ettin Model Suite](https://huggingface.co/collections/jhu-clsp/encoders-vs-decoders-the-ettin-suite-686303e16142257eed8e6aeb) (17M-1B parameters)
+ - **Phase 1**: [Pre-training Data](https://huggingface.co/datasets/jhu-clsp/ettin-pretraining-data) (1.7T tokens)
+ - **Phase 2**: [Mid-training Data](https://huggingface.co/datasets/jhu-clsp/ettin-extension-data) (250B tokens)
+ - **Phase 3**: [Decay Phase Data](https://huggingface.co/datasets/jhu-clsp/ettin-decay-data) (50B tokens)
+ - **Training Order**: [Batch-level Data Order](https://huggingface.co/datasets/jhu-clsp/ettin-data-order)
+ - **Paper**: [Arxiv link](https://arxiv.org/abs/2507.11412)
+ - **Code**: [GitHub Repository](https://github.com/jhu-clsp/ettin-encoder-vs-decoder)
+
+ ## Citation
+
+ ```bibtex
+ @misc{weller2025seqvsseqopen,
+ title={Seq vs Seq: An Open Suite of Paired Encoders and Decoders},
+ author={Orion Weller and Kathryn Ricci and Marc Marone and Antoine Chaffin and Dawn Lawrie and Benjamin Van Durme},
+ year={2025},
+ eprint={2507.11412},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2507.11412},
+ }
+ ```
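
The added README describes one subdirectory per model (e.g. `enc-150m`), each with `decay`, `ext`, and `pretrain` subfolders. As a rough sketch of how that layout could be used to fetch a single phase rather than the whole repository, the snippet below builds a file pattern for one model and phase and passes it to `snapshot_download` from `huggingface_hub`. The helper names `phase_pattern` and `download_phase` are hypothetical, the repository ID is not stated in the README and must be supplied by the caller, and the default `repo_type="model"` is an assumption.

```python
from pathlib import PurePosixPath

# Subfolder names listed in the README; the ordering here is only illustrative.
PHASES = ("pretrain", "ext", "decay")


def phase_pattern(model_dir: str, phase: str) -> str:
    """Build a glob matching every file under <model_dir>/<phase>/."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase {phase!r}; expected one of {PHASES}")
    return str(PurePosixPath(model_dir) / phase / "*")


def download_phase(repo_id: str, model_dir: str, phase: str,
                   local_dir: str, repo_type: str = "model") -> str:
    """Download one phase's checkpoint files and return the local path.

    Assumptions: `huggingface_hub` is installed, `repo_id` points at the
    checkpoint repository (not named in the README), and `repo_type`
    matches how that repository is hosted.
    """
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(
        repo_id=repo_id,
        repo_type=repo_type,
        allow_patterns=[phase_pattern(model_dir, phase)],
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Only builds the pattern; no network access is needed for this line.
    print(phase_pattern("enc-150m", "decay"))
```

Restricting the download with `allow_patterns` avoids pulling every model and phase at once. Resuming training from the downloaded files is covered by the ModernBERT repository linked in the README, not by this sketch.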