---
library_name: transformers
language:
- en
license: apache-2.0
metrics:
- rouge
datasets:
- pszemraj/t2t-re_pretrain-small
base_model:
- answerdotai/ModernBERT-large
---

# ModernBERT2gpt2-700m

A baseline EncoderDecoder model created from ModernBERT-large (encoder) and a randomly initialized `gpt2` decoder, trained on the pszemraj/t2t-re_pretrain-small dataset for one epoch.

- input context length: 2048
- output context length: 512
- single tokenizer, slightly modified from ModernBERT

Logs and the training script can be found [on wandb](https://wandb.ai/pszemraj/enc-dec-modernbert-olmo/runs/xpg9wjco).

---

It achieves the following results on the evaluation set:

- Loss: 2.2113
- Rouge1: 48.6654
- Rouge2: 31.8667
- Rougel: 44.9897
- Rougelsum: 45.4126
- Gen Len: 30.24
- Num Input Tokens Seen: 524625736
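
---

A minimal inference sketch, assuming the checkpoint is published under a repo id like `pszemraj/ModernBERT2gpt2-700m` (hypothetical; substitute the actual path) and loads via `EncoderDecoderModel`. The `max_length`/`max_new_tokens` values mirror the 2048-token input and 512-token output context lengths above.

```python
# Sketch only: the repo id below is an assumption, not confirmed by this card.
from transformers import AutoTokenizer, EncoderDecoderModel

model_id = "pszemraj/ModernBERT2gpt2-700m"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = EncoderDecoderModel.from_pretrained(model_id)

text = "Some input text to transform."
# Truncate to the model's 2048-token input context
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)

# Generate up to the 512-token output context
# (assumes decoder_start_token_id is set in the checkpoint's config)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```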