---
license: mit
language:
- en
---
|
|
|
|
|
# Ettin Checkpoints |
|
|
|
|
|
[License: MIT](https://opensource.org/licenses/MIT) | [Paper](https://arxiv.org/abs/2507.11412) | [Models](https://huggingface.co/jhu-clsp) | [Code](https://github.com/jhu-clsp/ettin-encoder-vs-decoder)
|
|
|
|
|
This repository contains the raw training checkpoints for the Ettin models. Each model has its own subdirectory (e.g., `enc-150m` for Ettin-Encoder-150M) with three subfolders, one per training phase: `pretrain`, `ext` (context extension / mid-training), and `decay`.
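For example, the layout for the 150M encoder looks like the following (an illustrative sketch; the individual checkpoint files inside each phase folder are not shown):

```
enc-150m/
├── pretrain/   # Phase 1: pre-training
├── ext/        # Phase 2: context extension (mid-training)
└── decay/      # Phase 3: learning-rate decay
```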
|
|
|
|
|
These files are Composer checkpoints and contain all state needed to resume pre-training. See the [ModernBERT repository](https://github.com/AnswerDotAI/ModernBERT) for usage details.
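To fetch the checkpoints for a single model and phase without downloading the whole repository, you can filter the download by subdirectory. Below is a minimal sketch using `huggingface_hub`; the `REPO_ID` and `repo_type` values are assumptions, so substitute this repository's actual id and type:

```python
from huggingface_hub import snapshot_download

# Assumption: placeholder repo id. Replace with this repository's actual id.
REPO_ID = "jhu-clsp/ettin-checkpoints"

# Download only the decay-phase checkpoints for Ettin-Encoder-150M.
local_dir = snapshot_download(
    repo_id=REPO_ID,
    repo_type="model",  # assumption: adjust if hosted as a dataset repo
    allow_patterns=["enc-150m/decay/*"],
)
print(f"Checkpoints downloaded to: {local_dir}")
```

A downloaded checkpoint can then be supplied to Composer's `Trainer` via its `load_path` argument to resume training; the ModernBERT repository linked above documents the full training setup.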
|
|
|
|
|
|
|
|
## Related Resources
|
|
|
|
|
- **Models**: [Ettin Model Suite](https://huggingface.co/collections/jhu-clsp/encoders-vs-decoders-the-ettin-suite-686303e16142257eed8e6aeb) (17M-1B parameters) |
|
|
- **Phase 1**: [Pre-training Data](https://huggingface.co/datasets/jhu-clsp/ettin-pretraining-data) (1.7T tokens) |
|
|
- **Phase 2**: [Mid-training Data](https://huggingface.co/datasets/jhu-clsp/ettin-extension-data) (250B tokens) |
|
|
- **Phase 3**: [Decay Phase Data](https://huggingface.co/datasets/jhu-clsp/ettin-decay-data) (50B tokens) |
|
|
- **Training Order**: [Batch-level Data Order](https://huggingface.co/datasets/jhu-clsp/ettin-data-order) |
|
|
- **Paper**: [arXiv](https://arxiv.org/abs/2507.11412)
|
|
- **Code**: [GitHub Repository](https://github.com/jhu-clsp/ettin-encoder-vs-decoder) |
|
|
|
|
|
## Citation |
|
|
|
|
|
```bibtex
@misc{weller2025seqvsseqopen,
      title={Seq vs Seq: An Open Suite of Paired Encoders and Decoders},
      author={Orion Weller and Kathryn Ricci and Marc Marone and Antoine Chaffin and Dawn Lawrie and Benjamin Van Durme},
      year={2025},
      eprint={2507.11412},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2507.11412},
}
```