This is a checkpoint of a BART fine-tune that acts as an autoencoder with a fixed-size 32x64 latent space, intended for training latent diffusion models. See https://arxiv.org/abs/2212.09462 (Latent Diffusion for Language Generation).
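
For context, the fixed-size latent works roughly like a Perceiver resampler: learned query vectors cross-attend into BART's variable-length encoder output and pool it into exactly 32 slots, which are then projected down to 64 dimensions. The sketch below is illustrative only, not the repo's actual classes (`--num_layers 3` stacks several such layers in the real model, and the names here are made up):

```python
import torch
from transformers import AutoTokenizer, BartModel

tok = AutoTokenizer.from_pretrained("facebook/bart-base")
bart = BartModel.from_pretrained("facebook/bart-base")

num_latents, dim_ae = 32, 64
d_model = bart.config.d_model  # 768 for bart-base

# Learned query vectors that cross-attend into the encoder output
# and pool it into exactly 32 slots, regardless of input length.
queries = torch.nn.Parameter(torch.randn(1, num_latents, d_model))
attn = torch.nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
down = torch.nn.Linear(d_model, dim_ae)  # project each slot to 64 dims

enc_out = bart.get_encoder()(**tok("A sentence to compress.", return_tensors="pt"))
h = enc_out.last_hidden_state  # (1, seq_len, 768)
z, _ = attn(queries, h, h)     # (1, 32, 768)
z = down(z)                    # (1, 32, 64) fixed-size latent
print(z.shape)                 # torch.Size([1, 32, 64])
```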

Trained on sentences from the C4 dataset.

Even though this was trained for less time than in the paper and on a more diverse dataset, it performs well: validation loss is 0.14, and the reconstruction is correct more than 90% of the time.
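
A round-trip exact-match check along those lines might look like the following; `encode` and `decode` are hypothetical placeholders for the checkpoint's sentence-to-latent and latent-to-sentence functions, not names from the repo:

```python
def reconstruction_accuracy(encode, decode, sentences):
    """Fraction of sentences that survive a round trip through the
    32x64 latent unchanged. `encode` maps a sentence to its latent and
    `decode` maps a latent back to text; both are stand-ins here."""
    hits = sum(decode(encode(s)).strip() == s.strip() for s in sentences)
    return hits / len(sentences)
```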

Trained with the code from https://github.com/bary12/latent-diffusion-for-language using the following command:

```bash
python train_latent_model.py \
  --dataset_name c4_sentences \
  --enc_dec_model facebook/bart-base \
  --learning_rate 1e-4 \
  --lr_warmup_steps 1000 \
  --train_batch_size 64 \
  --num_encoder_latents 32 \
  --dim_ae 64 \
  --num_decoder_latents 32 \
  --eval_every 10000 \
  --num_layers 3 \
  --wandb_name bart-roc-l2norm-test-32-64 \
  --l2_normalize_latent
```
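
The `--l2_normalize_latent` flag constrains the latent vectors to the unit hypersphere, keeping their scale fixed for the downstream diffusion model. A minimal sketch of the idea (my reading of the flag, not the repo's exact implementation):

```python
import torch
import torch.nn.functional as F

z = torch.randn(4, 32, 64)            # a batch of fixed-size latents
z_unit = F.normalize(z, p=2, dim=-1)  # scale each 64-dim vector to unit L2 norm
print(z_unit.norm(dim=-1))            # all (approximately) 1.0
```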