Encoder-Decoder model with DeBERTa encoder

Pre-trained model

  • deliciouscat/deberta-v3-base-encoder-decoder-v0.2

-> 297,511,524 (≈298M) parameters (see the parameter-count check below)
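A quick way to verify the parameter count (a minimal sketch; it simply loads the public checkpoint and sums the element counts of all parameter tensors):

from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.2")
# Sum over all parameter tensors; should print roughly 297.5M.
print(sum(p.numel() for p in model.parameters()))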

Data used

  • HuggingFaceFW/fineweb

  • AiHub ko-en translation corpus (English part)

  • Assorted research papers from my personal collection

Training hparams

  • optimizer: AdamW, lr=3e-5, betas=(0.875, 0.997)

  • batch size: 12

-> trained with a BART-style denoising objective for 29,523 steps (see the sketch below)
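For reference, a minimal optimizer setup matching the hyperparameters above (a sketch, not the original training script; the batch construction is only indicated in comments):

import torch
from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.2")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5, betas=(0.875, 0.997))

# One BART-style denoising step: `input_ids` hold the corrupted (noised) text,
# `labels` hold the original text; the model returns the cross-entropy loss.
# batch = {"input_ids": ..., "attention_mask": ..., "labels": ...}  # from your dataloader
# loss = model(**batch).loss
# loss.backward(); optimizer.step(); optimizer.zero_grad()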

How to use

from transformers import AutoTokenizer, EncoderDecoderModel

model = EncoderDecoderModel.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.3")
tokenizer = AutoTokenizer.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.3")
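A minimal generation example continuing from the code above (assuming the checkpoint's config already provides decoder_start_token_id; adjust the generation arguments to taste):

text = "The model reconstructs corrupted input text."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))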

Future work!

  • train on more scientific data

  • fine-tune on a keyword extraction task (a hedged sketch follows below)
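A sketch of what seq2seq fine-tuning for keyword extraction could look like, reusing the `model` and `tokenizer` loaded above; the `document` / `keywords` columns are hypothetical placeholders, not a released dataset:

# Hypothetical preprocessing: the document is the encoder input,
# a comma-separated keyword string is the decoder target.
def preprocess(example):
    model_inputs = tokenizer(example["document"], truncation=True, max_length=512)
    labels = tokenizer(text_target=example["keywords"], truncation=True, max_length=64)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

The mapped dataset could then be passed to a standard Seq2SeqTrainer with a DataCollatorForSeq2Seq; that part is omitted here.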
