# ModernBERT2gpt2-700m baseline

EncoderDecoder created from modernBERT-large and a random-init `gpt2` decoder, trained on the pszemraj/t2t-re_pretrain-small dataset for one epoch as a "baseline".

- input context length: 2048
- output context length: 512
- single tokenizer, slightly modified from modernBERT

Logs and the training script can be found [on wandb](https://wandb.ai/pszemraj/enc-dec-modernbert-olmo/runs/xpg9wjco).

---

It achieves the following results on the evaluation set:

- Loss: 2.2113
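As a rough sketch of how such an encoder-decoder can be wired up with `transformers` (tiny random configs stand in here for modernBERT-large and the ~700M decoder; this is illustrative, not the actual training script):

```python
import torch
from transformers import (
    BertConfig, BertModel,
    GPT2Config, GPT2LMHeadModel,
    EncoderDecoderModel,
)

# Tiny stand-in configs -- the real model uses modernBERT-large as the
# encoder and a randomly initialized gpt2-style decoder.
enc_cfg = BertConfig(
    vocab_size=128, hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64,
)
dec_cfg = GPT2Config(
    vocab_size=128, n_embd=32, n_layer=2, n_head=2,
    is_decoder=True,
    add_cross_attention=True,  # decoder attends to encoder hidden states
)

encoder = BertModel(enc_cfg)        # random init here; the card loads pretrained weights
decoder = GPT2LMHeadModel(dec_cfg)  # random init, matching the card's setup
model = EncoderDecoderModel(encoder=encoder, decoder=decoder)

# One forward pass; in the real model the encoder side goes up to 2048
# tokens and the decoder side up to 512.
src = torch.randint(0, 128, (1, 16))
tgt = torch.randint(0, 128, (1, 8))
out = model(input_ids=src, decoder_input_ids=tgt, labels=tgt)
print(out.logits.shape)  # (batch, tgt_len, vocab) = (1, 8, 128)
```

With real checkpoints, `EncoderDecoderModel.from_encoder_decoder_pretrained` is the usual shortcut for pairing a pretrained encoder with a decoder.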