---
datasets:
- cifar10
- https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/
---

GAN model trained on [CIFAR10 (Airplane)](https://www.tensorflow.org/datasets/catalog/cifar10) and [FGVC Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/) images. The model leverages [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) with [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf).

Try out this model [here](https://huggingface.co/spaces/PrakhAI/AIPlane).

| Generated Images | Real Images (for comparison) |
| -------- | --------- |
|  |  |

# Training Progression

<video width="50%" controls>
<source src="https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/qFlnTITZwS3DSTxLp0Oa8.mp4" type="video/mp4">
</video>

# Details

[Colab Notebook](https://colab.research.google.com/drive/1b4KFZOnLERwQW_3jQ8FMABepKEAcDIK7?usp=sharing)

The model generates 32x32 images of airplanes. It was trained on an NVIDIA T4 Colab runtime.

The Critic consists of convolutional layers (3x3 kernels) that use strides for downsampling, with Leaky ReLU activations. The Critic uses [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf), described in more detail [here](#spectral-normalization).
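
To illustrate how strided 3x3 convolutions can take a 32x32 input down to 4x4, here is a small sketch of the output-size arithmetic. The stride of 2 and padding of 1 are assumptions made for illustration; the exact values used by this model are in the Colab notebook, and the helper names are hypothetical.

```python
def conv_out_size(size, kernel=3, stride=2, padding=1):
    """Output size along one spatial axis of a strided convolution.

    stride=2 and padding=1 are assumed values, not taken from the model.
    """
    return (size + 2 * padding - kernel) // stride + 1


def downsampling_chain(size, min_size=4):
    """Apply the assumed strided convolution until min_size is reached."""
    sizes = [size]
    while sizes[-1] > min_size:
        sizes.append(conv_out_size(sizes[-1]))
    return sizes


print(downsampling_chain(32))  # [32, 16, 8, 4]
```

Under these assumptions, each strided layer halves the spatial resolution.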

The Generator uses transposed convolutions (2x2 kernels) with strides for upsampling, and ReLU activations. It also uses pixelwise feature vector normalization, the variant of Local Response Normalization proposed in the [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) paper.
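
As a rough illustration of the pixel-level normalization from the Progressive Growing paper, the sketch below rescales the feature vector at each pixel so that its mean squared value is approximately 1. It uses plain Python lists so that it runs without dependencies. The function names and the `eps` value are assumptions for illustration, and this is not the notebook's actual implementation.

```python
import math


def pixel_norm(features, eps=1e-8):
    """Rescale one pixel's feature vector to (approximately) unit mean square."""
    mean_sq = sum(v * v for v in features) / len(features)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in features]


def pixel_norm_map(feature_map, eps=1e-8):
    """Apply pixel_norm at every location of an H x W x C nested list."""
    return [[pixel_norm(pixel, eps) for pixel in row] for row in feature_map]


# A 1 x 2 feature map with 2 channels per pixel.
fmap = [[[3.0, 4.0], [0.5, 0.5]]]
normalized = pixel_norm_map(fmap)
print(normalized)
```

The normalization has no learned parameters. It only constrains the magnitude of the features at each pixel.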

# Spectral Normalization

Spectral Normalization is a technique proposed for stabilizing GAN training in [this paper](https://arxiv.org/pdf/1802.05957.pdf).

It constrains the Critic (Discriminator) to be Lipschitz continuous with respect to its input images by dividing each layer's weights by their spectral norm (largest singular value), which helps avoid exploding gradients.
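
The core computation behind Spectral Normalization can be sketched as follows: estimate the largest singular value of a weight matrix with power iteration, then divide the weights by it. This is a minimal, dependency-free illustration on a 2D matrix, with hypothetical function names. It is not the code used to train this model; practical implementations usually reshape convolution kernels into a matrix and reuse the vector `u` across training steps rather than iterating to convergence.

```python
import math
import random


def matvec(W, v):
    """Compute W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matvec_t(W, u):
    """Compute W^T @ u."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]


def l2_normalize(v, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iters):
        v = l2_normalize(matvec_t(W, u))
        u = l2_normalize(matvec(W, v))
    # sigma = u^T W v
    return sum(ui * wi for ui, wi in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Return W divided by its estimated spectral norm."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]


W = [[3.0, 0.0], [0.0, 1.0]]  # singular values are 3 and 1
sigma = spectral_norm(W)
W_sn = spectrally_normalize(W)
print(sigma, spectral_norm(W_sn))
```

After normalization, the matrix has a spectral norm of about 1, so the corresponding linear layer cannot stretch any input direction by more than that factor.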

In practice, Spectral Normalization noticeably stabilized the training of this GAN, as shown by the example below (comparison at equivalent points during training):

| Batch Normalization | Spectral Normalization |
| ----------- | ------------ |
|  |  |

# Progressive Growing

Progressive Growing of GAN resolutions was proposed in [this paper](https://arxiv.org/pdf/1710.10196.pdf) to improve the quality and stability of GAN training, especially for high-resolution models (1024x1024).

For 32x32 images of airplanes, even a short initial round of Progressive Growing provides a significant improvement (comparison at equivalent points during training):

| Flat Growing | Progressive Growing |
| ----------- | ------------ |
|  |  |

The generator for this model produces 4x4, 8x8, 16x16 and 32x32 images, which form the inputs to the critic. Each resolution is associated with a weight (α<sub>4</sub>, α<sub>8</sub>, α<sub>16</sub>, α<sub>32</sub>) that indicates how much the training focuses on that resolution at any given time.

At the beginning of training, α<sub>4</sub>=1, α<sub>8</sub>=0, α<sub>16</sub>=0, α<sub>32</sub>=0. Towards the end, α<sub>4</sub>=0, α<sub>8</sub>=0, α<sub>16</sub>=0, α<sub>32</sub>=1.
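
Only the start and end values of the weights are stated above. As one possible way to move between them, the sketch below fades linearly from each resolution to the next, keeping the weights summing to 1. The linear schedule, the `progress` parameter and the function name are assumptions for illustration; the schedule actually used is in the Colab notebook.

```python
RESOLUTIONS = (4, 8, 16, 32)


def resolution_weights(progress):
    """Return {resolution: alpha} for a training progress value in [0, 1].

    Assumed schedule: linear cross-fade between consecutive resolutions.
    """
    progress = min(max(progress, 0.0), 1.0)
    position = progress * (len(RESOLUTIONS) - 1)  # ranges from 0.0 to 3.0
    lower = min(int(position), len(RESOLUTIONS) - 2)
    blend = position - lower
    weights = {res: 0.0 for res in RESOLUTIONS}
    weights[RESOLUTIONS[lower]] = 1.0 - blend
    weights[RESOLUTIONS[lower + 1]] = blend
    return weights


for progress in (0.0, 0.5, 1.0):
    print(progress, resolution_weights(progress))
```

At `progress=0.0` all of the weight is on 4x4, and at `progress=1.0` all of it is on 32x32, matching the start and end values described above.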