---
library_name: transformers
language:
- en
license: apache-2.0
metrics:
- rouge
datasets:
- pszemraj/t2t-re_pretrain-small
base_model:
- answerdotai/ModernBERT-large
---

# ModernBERT2gpt2-700m

A baseline EncoderDecoder model created from ModernBERT-large (encoder) and a randomly initialized `gpt2` decoder, trained on the pszemraj/t2t-re_pretrain-small dataset for one epoch.

- input context length: 2048
- output context length: 512
- single tokenizer, slightly modified from ModernBERT

Logs and the training script can be found [on wandb](https://wandb.ai/pszemraj/enc-dec-modernbert-olmo/runs/xpg9wjco).

---

It achieves the following results on the evaluation set:

- Loss: 2.2113
- Rouge1: 48.6654
- Rouge2: 31.8667
- Rougel: 44.9897
- Rougelsum: 45.4126
- Gen Len: 30.24
- Num Input Tokens Seen: 524625736
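
---

A minimal inference sketch, assuming the checkpoint is published under a repo id like `pszemraj/ModernBERT2gpt2-700m` (hypothetical; substitute the actual path) and loads via `EncoderDecoderModel`. The `max_length`/`max_new_tokens` values mirror the 2048-token input and 512-token output context lengths above.

```python
# Sketch only: the repo id below is an assumption, not confirmed by this card.
from transformers import AutoTokenizer, EncoderDecoderModel

model_id = "pszemraj/ModernBERT2gpt2-700m"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = EncoderDecoderModel.from_pretrained(model_id)

text = "Some input text to transform."
# Truncate to the model's 2048-token input context
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)

# Generate up to the 512-token output context
# (assumes decoder_start_token_id is set in the checkpoint's config)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```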