---
datasets:
- cifar10
- https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/
---

A GAN trained on [CIFAR10 (Airplane)](https://www.tensorflow.org/datasets/catalog/cifar10) and [FGVC Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/) images. The model combines [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) with [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf).

| Generated Images | Real Images (for comparison) |
| -------- | --------- |
| ![generated_1691259071.png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/DNio2mes1414p6cgm7K62.png) | ![image.png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/4Sp33Hl9JK2cfHzBXHXfh.png) |

# Training Progression
<video width="50%" controls>
  <source src="https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/qFlnTITZwS3DSTxLp0Oa8.mp4" type="video/mp4">
</video>

# Details
[Colab Notebook](https://colab.research.google.com/drive/1b4KFZOnLERwQW_3jQ8FMABepKEAcDIK7?usp=sharing)

The model generates 32x32 images of airplanes. It was trained on an NVIDIA T4 Colab runtime.

The critic consists of convolutional layers (3x3 kernel) that use strides for downsampling, followed by Leaky ReLU activations. It uses [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf), with more details [here](#spectral-normalization).
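
As a rough illustration of what Spectral Normalization does to a critic layer, the sketch below estimates the largest singular value of a weight matrix by power iteration and divides the weights by it. This is not the notebook's code: the function names (`spectral_norm`, `spectrally_normalize`) and the `n_iters` and `seed` parameters are hypothetical, and the example uses plain Python lists rather than a deep learning framework. In the paper, a convolution kernel is first reshaped into a 2D matrix and a single power-iteration step is run per training update; the sketch simply iterates to convergence.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / (norm + eps) for x in vector]


def spectral_norm(weight, n_iters=50, seed=0):
    """Estimate the largest singular value of `weight` by power iteration."""
    rng = random.Random(seed)
    # One entry of u per row of the weight matrix.
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in weight])
    weight_t = transpose(weight)
    for _ in range(n_iters):
        v = l2_normalize(matvec(weight_t, u))
        u = l2_normalize(matvec(weight, v))
    # sigma is approximated by u^T W v.
    return sum(ui * wi for ui, wi in zip(u, matvec(weight, v)))


def spectrally_normalize(weight, n_iters=50):
    """Divide the weights by their estimated spectral norm."""
    sigma = spectral_norm(weight, n_iters)
    return [[w / sigma for w in row] for row in weight]
```

For the matrix `[[3.0, 0.0], [0.0, 1.0]]`, whose largest singular value is 3, the estimate converges to 3 and the normalized matrix has a spectral norm of 1.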

The generator uses transposed convolutions (2x2 kernel) with strides for upsampling, followed by ReLU activations. It also uses pixelwise feature vector normalization, the variant of Local Response Normalization proposed in the [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) paper.
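
The normalization from the Progressive Growing paper rescales the feature vector at each pixel so that its mean square across channels is approximately 1, using `b = a / sqrt(mean_c(a^2) + eps)` with `eps = 1e-8`. The sketch below shows this on a single feature map stored as nested lists. The function name `pixel_norm` and the `[channel][row][col]` layout are assumptions made for illustration, not taken from the notebook.

```python
import math


def pixel_norm(feature_map, eps=1e-8):
    """Normalize each pixel's feature vector to unit RMS across channels.

    `feature_map` is indexed as [channel][row][col] (an assumed layout).
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for y in range(height):
        for x in range(width):
            # Mean of squared activations over the channel axis at this pixel.
            mean_sq = sum(feature_map[c][y][x] ** 2 for c in range(channels)) / channels
            scale = 1.0 / math.sqrt(mean_sq + eps)
            for c in range(channels):
                out[c][y][x] = feature_map[c][y][x] * scale
    return out
```

Unlike batch normalization, this has no learned parameters and does not depend on other samples in the batch.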

# Spectral Normalization

# Progressive Growing

Progressively growing the resolution of a GAN during training is reported to improve the quality and stability of training, especially for high-resolution models (1024x1024).
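
In the paper, a new higher-resolution layer is not switched on abruptly. Its output is blended with the upsampled output of the previous resolution, with a weight `alpha` that ramps from 0 to 1. The sketch below shows that blend on single-channel images stored as nested lists. The helper names (`upsample_nearest_2x`, `fade_in`) are hypothetical, and whether the notebook uses exactly this schedule is an assumption.

```python
def upsample_nearest_2x(image):
    """Double height and width by repeating each pixel (nearest neighbour)."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fade_in(low_res, high_res, alpha):
    """Blend the upsampled low-res output with the new high-res output.

    alpha = 0 returns only the upsampled low-res image;
    alpha = 1 returns only the high-res image.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    upsampled = upsample_nearest_2x(low_res)
    return [
        [(1.0 - alpha) * lo + alpha * hi for lo, hi in zip(lo_row, hi_row)]
        for lo_row, hi_row in zip(upsampled, high_res)
    ]
```

During a growth phase, `alpha` would typically be increased linearly with the training step, so the network moves smoothly from, say, 16x16 to 32x32 outputs.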

For 32x32 images of airplanes, even a short initial round of Progressive Growing has a visible effect on sample quality:

| Flat Growing (50K steps) | Progressive Growing (50K steps) |
| ----------- | ------------ |
| ![image.png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/QnTET-5ae_0x11CcXeWgR.png) | ![image.png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/lT_IwqkL60M1RSxEtJLaJ.png) |

Moreover, the additional parameter cost is very small: 876.6 KB vs 855.1 KB for the generator, a difference of 21.5 KB (about 2.5%).

# Does the model simply memorize the images?