This is a checkpoint of a BART model fine-tuned to act as an autoencoder with a fixed-size 32x64 latent space, intended for training latent diffusion models for language. See https://arxiv.org/abs/2212.09462.
Trained on sentences from the C4 dataset.
Even though this model was trained for less time than in the paper and on a more diverse dataset, it performs well: validation loss is 0.14, and the reconstruction is correct more than 90% of the time.
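Here "correct" means the decoded sentence exactly matches the input. A minimal sketch of how such a rate can be measured, where `reconstruct` is a hypothetical stand-in for the full encode/decode round trip implemented in the repo linked below:

```python
# Hedged sketch: exact-match reconstruction rate. `reconstruct` is a
# hypothetical callable mapping a sentence through encode -> latent -> decode;
# the real round trip is implemented in the training repo linked below.
def reconstruction_rate(sentences, reconstruct):
    hits = sum(reconstruct(s) == s for s in sentences)
    return hits / len(sentences)

# Trivial check with an identity "model": the rate is 1.0.
print(reconstruction_rate(["The cat sat.", "It rained."], lambda s: s))
```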
Trained from https://github.com/bary12/latent-diffusion-for-language using the following command:

```bash
python train_latent_model.py \
  --dataset_name c4_sentences \
  --enc_dec_model facebook/bart-base \
  --learning_rate 1e-4 \
  --lr_warmup_steps 1000 \
  --train_batch_size 64 \
  --num_encoder_latents 32 \
  --dim_ae 64 \
  --num_decoder_latents 32 \
  --eval_every 10000 \
  --num_layers 3 \
  --wandb_name bart-roc-l2norm-test-32-64 \
  --l2_normalize_latent
```
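The flags map directly onto the latent shape: `--num_encoder_latents 32` and `--dim_ae 64` give the 32x64 latent, and `--l2_normalize_latent` constrains each latent vector to unit L2 norm so the diffusion model sees targets on a fixed scale. Below is a minimal sketch of that bottleneck, not the repo's actual module: `LatentCompressor` is a hypothetical name for a Perceiver-style layer whose learned queries cross-attend to the BART encoder states, producing a fixed-size latent regardless of input length.

```python
# Sketch only (not the repo's actual class): learned queries cross-attend to
# BART encoder states, yielding exactly 32 latent vectors of dimension 64.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import BartModel, BartTokenizer

class LatentCompressor(nn.Module):
    def __init__(self, d_model=768, num_latents=32, dim_ae=64, l2_normalize=True):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_latents, d_model))
        self.attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        self.proj = nn.Linear(d_model, dim_ae)
        self.l2_normalize = l2_normalize

    def forward(self, enc_states):                     # (batch, seq_len, d_model)
        q = self.queries.expand(enc_states.size(0), -1, -1)
        latents, _ = self.attn(q, enc_states, enc_states)
        latents = self.proj(latents)                   # (batch, 32, 64)
        if self.l2_normalize:                          # what --l2_normalize_latent toggles
            latents = F.normalize(latents, dim=-1)
        return latents

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
bart = BartModel.from_pretrained("facebook/bart-base")
compressor = LatentCompressor()

inputs = tokenizer("A sentence of any length.", return_tensors="pt")
enc = bart.encoder(**inputs).last_hidden_state        # (1, seq_len, 768)
z = compressor(enc)
print(z.shape)                                        # torch.Size([1, 32, 64])
```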