---
title: CPEN45524W2CourseProject
emoji: 🔥
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: mit
short_description: CPEN455-24W2CourseProject
sdk_version: 5.19.0
---
# CPEN455 Project: Conditional PixelCNN++
This is the CPEN 455 course project. The goal is to implement the conditional PixelCNN++ model and train it on the provided dataset. Once trained, the model can both generate new images and classify given images. We will evaluate the model on both its generation performance and its classification performance.
## Project Guidelines
PixelCNN++ is a powerful generative model with a tractable likelihood. It models the joint distribution of the pixels of an image $x$ as the following product of conditional distributions:

$$p(x) = \prod_{i} p(x_i \mid x_{<i})$$

where $x_i$ is a single pixel and $x_{<i}$ denotes all pixels that precede it in raster-scan order.
Given a class embedding $c$, PixelCNN++ can be extended to conditional generation as follows:

$$p(x \mid c) = \prod_{i} p(x_i \mid x_{<i}, c)$$
In this case, with a trained conditional PixelCNN++, we can directly apply it to the zero-shot image classification task by predicting the label under which the image is most likely:

$$\hat{c} = \arg\max_{c} \; \log p(x \mid c)$$
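As a concrete illustration, here is a minimal PyTorch sketch of that classification rule. The interface is an assumption, not the actual course code: `model(x, labels)` is assumed to return the parameters of $p(x \mid c)$, and `nll_per_image` is a hypothetical helper that returns each image's negative log-likelihood.

```python
import torch

# A minimal sketch of the zero-shot rule above, NOT the exact course
# interface. It assumes:
#   - model(x, labels) returns the parameters of p(x | c) for the given
#     conditioning labels, and
#   - nll_per_image(x, params) is a hypothetical helper returning the
#     negative log-likelihood of each image, shape (B,).
@torch.no_grad()
def classify(model, x, num_classes):
    batch = x.shape[0]
    log_px_given_c = torch.empty(batch, num_classes, device=x.device)
    for c in range(num_classes):
        labels = torch.full((batch,), c, dtype=torch.long, device=x.device)
        params = model(x, labels)
        log_px_given_c[:, c] = -nll_per_image(x, params)
    # predict arg max_c log p(x | c)
    return log_px_given_c.argmax(dim=1)
```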
**Task**: For the final project, you are required to complete the following tasks:

1. We provide code for an unconditional PixelCNN++. Adapt it to the conditional image generation task and train it on the provided dataset (one common conditioning scheme is sketched after this list).
2. Complete a classification function that converts the output of the conditional PixelCNN++ into predicted labels for a given new image (the zero-shot rule sketched above).
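For task 1, one common conditioning scheme (a sketch under assumptions, not the required design) is to learn one embedding vector per class and add it as a channel-wise bias to the feature maps inside the network:

```python
import torch
import torch.nn as nn

# Illustrative conditioning sketch (one option, not the required design):
# learn an embedding per class and add it as a channel-wise bias to a
# feature map h inside the network. num_filters is a placeholder for the
# channel width of the layer being conditioned.
class ClassConditioning(nn.Module):
    def __init__(self, num_classes, num_filters):
        super().__init__()
        self.embedding = nn.Embedding(num_classes, num_filters)

    def forward(self, h, labels):
        c = self.embedding(labels)        # (B, num_filters)
        # broadcast over the spatial dimensions of h: (B, num_filters, H, W)
        return h + c[:, :, None, None]
```

In the conditional PixelCNN paper, a class-dependent bias of this kind is added inside every gated activation unit; where and how you inject it is a design choice.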
## Basic tools
The TAs recommend several tools that will help you debug and monitor the training process:
1. **wandb**: wandb helps you monitor the training process. You can watch the loss, accuracy, and other metrics in real time, and also inspect the generated images and the model structure (a minimal logging sketch follows this list). Quickstart: https://docs.wandb.ai/quickstart
2. **tensorboard**: TensorBoard is another tool for monitoring the training process. Getting started: https://www.tensorflow.org/tensorboard/get_started
3. **pdb**: pdb is the built-in Python debugger, which you can use to step through your code. Documentation: https://docs.python.org/3/library/pdb.html
4. **conda**: conda is a package manager you can use to create a virtual environment and install the required packages. Managing environments: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
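For example, a minimal wandb logging loop looks roughly like this (the project name and logged values are placeholders):

```python
import wandb

# Minimal wandb logging loop; the project name and logged values are
# placeholders. Requires a one-time `wandb login` first.
run = wandb.init(project="cpen455-pixelcnn", config={"lr": 2e-4})
for step in range(100):
    loss = 1.0 / (step + 1)              # stand-in for your training loss
    wandb.log({"train_loss": loss, "step": step})
run.finish()
```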
## Original PixelCNN++ code
We provide the code for the PixelCNN++ model. Before you run it, install the required packages with the following command:
```bash
pip install -r requirements.txt
```
Please note that the requirements.txt file is guaranteed to include all the Python packages necessary to complete the final project, so please DO NOT install any additional third-party packages; if doing so later prevents the submitted code from running, you will be held responsible. If you have any questions regarding Python packages, please contact the teaching assistants.
You can then start training with the following command:
```bash
python pcnn_train.py \
    --batch_size 32 \
    --sample_batch_size 32 \
    --sampling_interval 50 \
    --save_interval 1000 \
    --dataset cpen455 \
    --nr_resnet 2 \
    --lr_decay 0.999995 \
    --max_epochs 5000 \
    --en_wandb True
```
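Once the model is conditional, sampling proceeds exactly as in the unconditional case except that the class label is passed at every step. A rough sketch, where `model`, `sample_from_params`, and the image shape are hypothetical stand-ins for the course code:

```python
import torch

# Rough autoregressive sampling sketch. `model` and `sample_from_params`
# are hypothetical stand-ins for the course code's forward pass and its
# discretized-logistic sampling routine; the shape is a placeholder.
@torch.no_grad()
def sample(model, labels, shape=(3, 32, 32)):
    x = torch.zeros(labels.shape[0], *shape)
    _, height, width = shape
    for i in range(height):
        for j in range(width):
            params = model(x, labels)           # p(x | c) parameters
            pixel = sample_from_params(params)  # sample a full image ...
            x[:, :, i, j] = pixel[:, :, i, j]   # ... but keep only (i, j)
    return x
```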
If you want to go into more detail about PixelCNN++, you can find the original paper here: https://arxiv.org/abs/1701.05517
There are also several repositories that implement the PixelCNN++ model:

1. Original PixelCNN++ repository by OpenAI: https://github.com/openai/pixel-cnn
2. PyTorch implementation of PixelCNN++: https://github.com/pclucas14/pixel-cnn-pp
## Evaluation
For the evaluation of model performance, we assess the quality of images generated by the conditional PixelCNN++ and the accuracy of classification separately.
For classification, we evaluate both accuracy and F1 score. You can submit your classification results through the project's Hugging Face challenge page; our system will compute the accuracy and F1 score of your submission and update the leaderboard accordingly.
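Before submitting, you can sanity-check your predictions locally on a held-out labeled split, for example with scikit-learn (illustrative only; the official scores come from the challenge page, and the exact F1 averaging used there is not specified here):

```python
from sklearn.metrics import accuracy_score, f1_score

# Illustrative local check on a held-out labeled split; y_true and y_pred
# are placeholders for your validation labels and model predictions.
y_true = [0, 1, 2, 3, 1, 0]
y_pred = [0, 1, 2, 2, 1, 0]
print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
```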
For assessing the quality of generated images, we provide an evaluation interface function that uses the FID score. After the final project deadline, we will run all submitted code on our system and execute the FID evaluation function. It is essential that your code runs correctly and reproduces the evaluation results reported in your project; failure to do so may result in corresponding deductions.
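If you want to track FID yourself during development, one option is the torchmetrics implementation shown below (a rough sketch; the official evaluation uses the provided interface function, which may differ):

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance  # needs torchmetrics[image]

# Rough FID sketch; the random uint8 tensors below are placeholders for
# dataset images and conditional PixelCNN++ samples (NCHW, values 0-255).
# A meaningful score needs far more samples than this.
fid = FrechetInceptionDistance(feature=2048)
real = torch.randint(0, 256, (64, 3, 32, 32), dtype=torch.uint8)
fake = torch.randint(0, 256, (64, 3, 32, 32), dtype=torch.uint8)
fid.update(real, real=True)
fid.update(fake, real=False)
print("FID:", fid.compute().item())
```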
Please DO NOT attempt to hack our test dataset in any way. We will attempt to reproduce the results for all submitted code, and any cheating discovered will result in deductions and appropriate actions taken.