---
library_name: diffusers
license: mit
datasets:
- uoft-cs/cifar10
- nyanko7/danbooru2023
language:
- en
pipeline_tag: text-to-image
---

# DDPM Project

This repository contains the implementation of Denoising Diffusion Probabilistic Models (DDPM).

## Table of Contents
- [Introduction](#introduction)
- [Installation](#installation)
- [Usage](#usage)
- [Contributing](#contributing)

## Introduction
Denoising Diffusion Probabilistic Models (DDPM) are a class of generative models that learn to generate data by reversing a diffusion process that gradually adds noise to it. This repository provides a comprehensive implementation of DDPM.

## Installation
To install the necessary dependencies, run:
```bash
pip install -r requirements.txt
```

## Usage
To train the model, use the following command:
```bash
python train.py
```

To generate samples, use:
```bash
python generate.py
```

## Game
To help explain the model and its workings, we're building a cute little game in which the player takes the role of the UNET denoiser (the reverse diffusion model) and has to denoise images corrupted with noise made of grids of lines. A primitive version of the game is available at [learndiffusion.vercel.app](https://learndiffusion.vercel.app). You can also contribute to the game by checking out the `diffusion_game` branch.

A model showcase will also be added: the model's weights will be loaded from the internet, and the model's files will be installed and loaded into a Gradio interface for direct inference on the Vercel site. An issue is open for this, so feel free to contribute changes.

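As a rough illustration of the diffusion process that the model (and the player in the game) is asked to reverse, the sketch below adds Gaussian noise to a toy input using the closed-form forward step from the original DDPM paper, `x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps`. It is not taken from this repository's `train.py`: the function names `linear_beta_schedule`, `cumulative_alphas`, and `q_sample` are hypothetical, the linear schedule (beta from 1e-4 to 0.02 over 1000 steps) is the one reported in the paper, and the code uses only the Python standard library instead of PyTorch so that it runs without extra dependencies.

```python
import math
import random


def linear_beta_schedule(timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (defaults from the DDPM paper)."""
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for all s up to and including t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng=random):
    """Draw x_t from q(x_t | x_0) in one step; returns (x_t, noise)."""
    signal = math.sqrt(alpha_bars[t])        # how much of x_0 survives
    sigma = math.sqrt(1.0 - alpha_bars[t])   # how much noise is mixed in
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + sigma * e for x, e in zip(x0, noise)]
    return x_t, noise


betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)

x0 = [0.5, -0.25, 1.0, 0.0]  # a toy "image": four pixel values in [-1, 1]
x_early, _ = q_sample(x0, 10, alpha_bars)   # still close to x0
x_late, _ = q_sample(x0, 999, alpha_bars)   # almost pure noise
```

At small `t` the sample stays close to `x0`; by the last step `alpha_bars[-1]` is close to zero, so `x_late` is almost entirely Gaussian noise. The denoising network is trained to predict the `noise` term returned by `q_sample`, which is what lets sampling run this process in reverse.
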
## Explanations and Mathematics
- Slides from presentation:
- Notes/explanations: [HERE](slides/notes)
- A cute lab talk ppt:
- Plato's allegory:

## Resources
- Original Paper: https://arxiv.org/pdf/2006.11239
- Improvement Paper: https://arxiv.org/abs/2102.09672
- Improvement by OpenAI: https://arxiv.org/pdf/2105.05233
- Stable Diffusion Paper: https://arxiv.org/abs/2112.10752

### Papers for background
- UNET Paper for Biomedical Segmentation
- Autoencoder
- Variational Autoencoder
- Markov Hierarchical VAE
- Introductory Lectures on Diffusion Process

### YouTube videos and courses
#### Mathematics
- Outliers
- Omar Jahil

#### PyTorch Implementation
- [Deep Findr](https://www.youtube.com/watch?v=a4Yfz2FxXiY)
- [Notebook from Deep Findr](https://colab.research.google.com/drive/1sjy9odlSSy0RBVgMTgP7s99NXsqglsUL?usp=sharing)

## Pretrained Weights
Weights for the model can be found in [pretrained_weights](https://drive.google.com/drive/folders/1NiQDI3e67I9FITVnrzNPP2Az0LABRpic?usp=sharing).

To load the pretrained weights:
```python
import torch

# SimpleUnet is the model class defined in this repository
model2 = SimpleUnet()
model2.load_state_dict(torch.load("/content/drive/MyDrive/Research Work/mlsa/DDPM/model_weights.pth"))
model2.eval()
```

To run inference:

TODO: The sampling function still has errors (boolean errors, etc.). Issues will be opened for others to solve as an exercise if needed.
```python
import torch
from torchvision.utils import save_image

num_samples = 8  # Number of images to generate
image_size = (3, 32, 32)  # Example for CIFAR10

# Start from pure Gaussian noise on the GPU
noise = torch.randn(num_samples, *image_size).to("cuda")
model2.to("cuda")

# Generate images by denoising
with torch.no_grad():
    generated_images = model2.sample(noise)

# Save the generated images as a grid with 4 images per row
save_image(generated_images, "generated_images.png", nrow=4, normalize=True)
```

## Contributing
Contributions are welcome! Please open an issue or submit a pull request.

## Future Ideas
- Make the model ONNX compatible for training and inference on Intel GPUs
- Build a Stable Diffusion text-to-image model using a CLIP implementation
- Train the current model on a much larger dataset so that it generalizes better and captures more nuances