|
--- |
|
library_name: diffusers |
|
license: mit |
|
datasets: |
|
- uoft-cs/cifar10 |
|
- nyanko7/danbooru2023 |
|
language: |
|
- en |
|
pipeline_tag: text-to-image |
|
--- |
|
# DDPM Project |
|
|
|
This repository contains the implementation of Denoising Diffusion Probabilistic Models (DDPM). |
|
|
|
## Table of Contents |
|
- [Introduction](#introduction) |
|
- [Installation](#installation) |
|
- [Usage](#usage) |
|
- [Contributing](#contributing) |
|
|
|
## Introduction |
|
Denoising Diffusion Probabilistic Models (DDPM) are a class of generative models that learn to generate data by reversing a diffusion process. This repository provides a comprehensive implementation of DDPM. |
|
|
|
## Installation |
|
To install the necessary dependencies, run: |
|
```bash |
|
pip install -r requirements.txt |
|
``` |
|
|
|
## Usage |
|
To train the model, use the following command: |
|
```bash |
|
python train.py |
|
``` |
|
To generate samples, use: |
|
```bash |
|
python generate.py |
|
``` |
|
|
|
## Game |
|
To understand the model and it's workings, we're working on a cool cute little game where the user is the UNET reverser/diffusion model and is tasked to denoise the images with noise made of grids of lines. |
|
|
|
Use [learndiffusion.vercel.app](learndiffusion.vercel.app) to access the primitive version of the game. You can also contribute to the game by checking out at the diffusion_game branch. A new model showcase will also be added such that the model's weights are loaded from the internet, model's files are installed and loaded into a gradio interface for direct use/inference on the vercel. Feel free to make changes for the same, issue is opened. |
|
|
|
## Explanations and Mathematics |
|
- slides from presentation : |
|
- notes/explanations : [HERE](slides\notes) |
|
- a cute lab talk ppt: |
|
- plato's allegory : \<link to REPUBLIC> |
|
|
|
## Resources |
|
- Original Paper : https://arxiv.org/pdf/2006.11239 |
|
- Improvement Paper : https://arxiv.org/abs/2102.09672 |
|
- Improvement by OpenAI : https://arxiv.org/pdf/2105.05233 |
|
- Stable Diffusion Paper : https://arxiv.org/abs/2112.10752 |
|
- |
|
|
|
### Papers for background |
|
- UNET Paper for Biomedical Segmentation |
|
- Autoencooder |
|
- Variational Autoencoder |
|
- Markov Hierarchical VAE |
|
- Introductory Lectures on Diffusion Process |
|
|
|
### Youtube videos and courses |
|
#### Mathematics |
|
- Outliers |
|
- Omar Jahil |
|
|
|
#### Pytorch Implementation |
|
- [Deep Findr](https://www.youtube.com/watch?v=a4Yfz2FxXiY) |
|
- [Notebook from Deep Findr](https://colab.research.google.com/drive/1sjy9odlSSy0RBVgMTgP7s99NXsqglsUL?usp=sharing) |
|
|
|
## Pretrained Weights |
|
weights from the model can be found in [pretrained_weights](https://drive.google.com/drive/folders/1NiQDI3e67I9FITVnrzNPP2Az0LABRpic?usp=sharing) |
|
|
|
For loading the pretrained weights: |
|
``` |
|
model2 = SimpleUnet() |
|
model2.load_state_dict(torch.load("/content/drive/MyDrive/Research Work/mlsa/DDPM/model_weights.pth")) |
|
model2.eval() |
|
``` |
|
|
|
For making inferences |
|
TODO: Errors in the sampling function, boolean errors and etc. Will open issues for solving by others as exercise if needed. |
|
``` |
|
num_samples = 8 # Number of images to generate |
|
image_size = (3, 32, 32) # Example for CIFAR10 |
|
noise = torch.randn(num_samples, *image_size).to("cuda") |
|
|
|
model2.to("cuda") |
|
# Generate images by denoising |
|
with torch.no_grad(): |
|
generated_images = model2.sample(noise) |
|
|
|
# Save the generated images |
|
save_image(generated_images, "generated_images.png", nrow=4, normalize=True) |
|
``` |
|
|
|
|
|
## Contributing |
|
Contributions are welcome! Please open an issue or submit a pull request. |
|
|
|
|
|
## Future Ideas |
|
- Make the model onnx compatible for training and inferencing on Intel GPUs |
|
- Build a Stable Diffusion model Text2Img using CLIP implementationnnnn !!! |
|
- Train the current model for a much larger dataset with more generalizations and nuances |