This is a checkpoint of a BART fine-tune that acts as an autoencoder with a fixed-size 32x64 latent space, intended for training latent diffusion models. See https://arxiv.org/abs/2212.09462 (Latent Diffusion for Language Generation).
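
For context, the fixed-size latent works roughly like a Perceiver resampler: learned query vectors cross-attend into BART's variable-length encoder output and pool it into exactly 32 slots, which are then projected down to 64 dimensions. The sketch below is illustrative only, not the repo's actual classes (`--num_layers 3` stacks several such layers in the real model, and the names here are made up):

```python
import torch
from transformers import AutoTokenizer, BartModel

tok = AutoTokenizer.from_pretrained("facebook/bart-base")
bart = BartModel.from_pretrained("facebook/bart-base")

num_latents, dim_ae = 32, 64
d_model = bart.config.d_model  # 768 for bart-base

# Learned query vectors that cross-attend into the encoder output
# and pool it into exactly 32 slots, regardless of input length.
queries = torch.nn.Parameter(torch.randn(1, num_latents, d_model))
attn = torch.nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
down = torch.nn.Linear(d_model, dim_ae)  # project each slot to 64 dims

enc_out = bart.get_encoder()(**tok("A sentence to compress.", return_tensors="pt"))
h = enc_out.last_hidden_state  # (1, seq_len, 768)
z, _ = attn(queries, h, h)     # (1, 32, 768)
z = down(z)                    # (1, 32, 64) fixed-size latent
print(z.shape)                 # torch.Size([1, 32, 64])
```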

Trained on sentences from the C4 dataset.

Even though this was trained for less time than in the paper and on a more diverse dataset, it performs well: validation loss is 0.14, and the reconstruction is correct more than 90% of the time.
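
A round-trip exact-match check along those lines might look like the following; `encode` and `decode` are hypothetical placeholders for the checkpoint's sentence-to-latent and latent-to-sentence functions, not names from the repo:

```python
def reconstruction_accuracy(encode, decode, sentences):
    """Fraction of sentences that survive a round trip through the
    32x64 latent unchanged. `encode` maps a sentence to its latent and
    `decode` maps a latent back to text; both are stand-ins here."""
    hits = sum(decode(encode(s)).strip() == s.strip() for s in sentences)
    return hits / len(sentences)
```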

Trained with the code from https://github.com/bary12/latent-diffusion-for-language using the following command:

```bash
python train_latent_model.py \
  --dataset_name c4_sentences \
  --enc_dec_model facebook/bart-base \
  --learning_rate 1e-4 \
  --lr_warmup_steps 1000 \
  --train_batch_size 64 \
  --num_encoder_latents 32 \
  --dim_ae 64 \
  --num_decoder_latents 32 \
  --eval_every 10000 \
  --num_layers 3 \
  --wandb_name bart-roc-l2norm-test-32-64 \
  --l2_normalize_latent
```
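
The `--l2_normalize_latent` flag constrains the latent vectors to the unit hypersphere, keeping their scale fixed for the downstream diffusion model. A minimal sketch of the idea (my reading of the flag, not the repo's exact implementation):

```python
import torch
import torch.nn.functional as F

z = torch.randn(4, 32, 64)            # a batch of fixed-size latents
z_unit = F.normalize(z, p=2, dim=-1)  # scale each 64-dim vector to unit L2 norm
print(z_unit.norm(dim=-1))            # all (approximately) 1.0
```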