---
datasets:
- https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/
---
| Generated | Real (for comparison) |
| ----- | --------- |
|  |  |
This GAN model is trained on the [FGVC Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/) dataset. The model uses [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) with [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf).
The work builds on https://huggingface.co/PrakhAI/AIPlane and https://huggingface.co/PrakhAI/AIPlane2.
This model was trained to generate 256x256 images of aircraft. The implementation in JAX on Colab can be found [here](https://colab.research.google.com/github/prakharbanga/AIPlane3/blob/main/AIPlane3_ProGAN_%2B_Spectral_Norm_(256x256).ipynb).
# Convolutional Architecture
A significant improvement over https://huggingface.co/PrakhAI/AIPlane2 is the elimination of "checkerboard" artifacts. This is achieved by using an image resize followed by a convolution layer in the Generator, instead of a transposed convolution whose kernel size is not divisible by its stride.
| Transposed Convolution (kernel size not divisible by stride) | Resize followed by convolution |
| - | - |
|  |  |
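The difference between the two upsampling approaches can be sketched in JAX. This is a minimal illustration and not the code from the linked notebook: the function names, the 2x nearest-neighbour resize, the 3x3 kernel and the all-ones inputs are assumptions chosen to make the overlap pattern easy to see.

```python
import jax
import jax.numpy as jnp

DIMS = ("NHWC", "HWIO", "NHWC")  # image, kernel and output layouts

def resize_conv_upsample(x, kernel):
    """Upsample 2x by nearest-neighbour resize, then a stride-1 convolution."""
    n, h, w, c = x.shape
    x = jax.image.resize(x, (n, 2 * h, 2 * w, c), method="nearest")
    return jax.lax.conv_general_dilated(
        x, kernel, window_strides=(1, 1), padding="SAME",
        dimension_numbers=DIMS)

def transposed_conv_upsample(x, kernel):
    """Upsample 2x with a stride-2 transposed convolution."""
    return jax.lax.conv_transpose(
        x, kernel, strides=(2, 2), padding="SAME",
        dimension_numbers=DIMS)

# Constant input and constant kernel (illustrative assumption): any
# variation in the output comes from the upsampling operator itself.
x = jnp.ones((1, 8, 8, 1))
kernel = jnp.ones((3, 3, 1, 1))

resized = resize_conv_upsample(x, kernel)
transposed = transposed_conv_upsample(x, kernel)

# Ignore the border, where padding changes the result for both methods.
resized_interior = resized[0, 2:-2, 2:-2, 0]
transposed_interior = transposed[0, 2:-2, 2:-2, 0]

print(jnp.unique(resized_interior))     # a single value: every pixel sees 9 inputs
print(jnp.unique(transposed_interior))  # several values: uneven kernel overlap
```

With a 3x3 kernel and stride 2, neighbouring output pixels of the transposed convolution receive contributions from different numbers of input pixels, which is the source of the checkerboard pattern. The resize-then-convolve path gives every interior pixel the same number of contributions.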
# 'Good' Generated Samples

# ProGAN
Progressive Growing of GANs was proposed in [Progressive Growing of GANs for Improved Quality, Stability, and Variation](https://arxiv.org/pdf/1710.10196.pdf).
The idea is to start training at a low resolution and grow the resolution of the GAN over time. This improves both:
- Training Speed: At lower resolutions, the Generator and Discriminator have fewer layers and fewer parameters.
- Convergence Speed: It is much easier to learn coarse, high-level structure first and finer-grained details afterwards than to learn both at the same time.
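The growth schedule and the fade-in step described in the ProGAN paper can be sketched as follows. This is a hedged illustration, not the notebook's implementation: the helper names are hypothetical, the 4x4 starting resolution follows the paper rather than anything stated in this card, and the images are dummy arrays standing in for Generator outputs.

```python
import jax
import jax.numpy as jnp

def resolution_schedule(start=4, final=256):
    """Resolutions visited during training, doubling at every stage."""
    resolutions = [start]
    while resolutions[-1] < final:
        resolutions.append(resolutions[-1] * 2)
    return resolutions

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of NHWC images."""
    n, h, w, c = x.shape
    return jax.image.resize(x, (n, 2 * h, 2 * w, c), method="nearest")

def fade_in(alpha, low_res_image, high_res_image):
    """Blend the previous stage's (upsampled) output with the new stage's output.

    alpha ramps from 0 to 1 while a new resolution is being introduced.
    """
    return (1.0 - alpha) * upsample2x(low_res_image) + alpha * high_res_image

stages = resolution_schedule()

# Dummy stand-ins for the outputs of two consecutive Generator stages.
low = jnp.zeros((1, 4, 4, 3))
high = jnp.ones((1, 8, 8, 3))
halfway = fade_in(0.5, low, high)
```

Here `stages` lists the resolutions from 4x4 up to the 256x256 target, and `halfway` is the blended 8x8 image at the midpoint of a transition.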

# Spectral Normalization
Spectral Normalization for GANs was first suggested in [Spectral Normalization for Generative Adversarial Networks](https://arxiv.org/pdf/1802.05957.pdf).
Spectral Normalization divides each weight matrix of the Discriminator by its largest singular value. This constrains the Gradient Norm of the Discriminator with respect to the input, yielding a much smoother loss landscape for the Generator to navigate through.
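A minimal sketch of the weight normalization step, using the power-iteration estimate from the paper, is shown here. It is an illustration under assumptions rather than the notebook's code: the function name, the argument layout (output dimension on the last axis) and the diagonal test matrix are all hypothetical choices.

```python
import jax.numpy as jnp

def spectral_normalize(w, u, n_iters=1, eps=1e-12):
    """Divide `w` by a power-iteration estimate of its largest singular value.

    w: weight array whose last axis is the output dimension.
    u: running estimate of the leading right singular vector, shape (out,).
    n_iters must be at least 1.
    """
    w_mat = w.reshape(-1, w.shape[-1])  # flatten to (fan_in, out)
    for _ in range(n_iters):
        v = w_mat @ u
        v = v / (jnp.linalg.norm(v) + eps)
        u = w_mat.T @ v
        u = u / (jnp.linalg.norm(u) + eps)
    sigma = v @ w_mat @ u  # estimated largest singular value
    return w / sigma, u, sigma

# Diagonal matrix with known singular values 3, 1 and 0.5 (illustrative).
w = jnp.diag(jnp.array([3.0, 1.0, 0.5]))
u0 = jnp.ones(3) / jnp.sqrt(3.0)
w_sn, u, sigma = spectral_normalize(w, u0, n_iters=20)
```

In the paper, a single iteration per training step is used and `u` is carried over between steps, so the estimate tracks the slowly changing weights at little extra cost. The 20 iterations above are only there so that the standalone example converges from a cold start.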

# Latent Space Interpolation
Latent Space Interpolation can be an educational exercise to get deeper insight into the model.
It is observed below that several aspects of the generated image, such as the color of the sky, the grounded-ness of the plane, and the plane shape and color, are frequently continuous through the latent space.
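One simple way to produce such a sequence is to interpolate linearly between two latent vectors and pass each intermediate vector through the Generator. The sketch below only builds the latent path. The latent dimension of 128, the number of steps and the function name are assumptions, and the interpolation scheme used for the images in this card is not stated.

```python
import jax
import jax.numpy as jnp

def lerp_latents(z0, z1, num_steps):
    """Return `num_steps` latent vectors evenly spaced on the line from z0 to z1."""
    t = jnp.linspace(0.0, 1.0, num_steps)[:, None]  # shape (num_steps, 1)
    return (1.0 - t) * z0 + t * z1

latent_dim = 128  # hypothetical; use the latent size of the trained Generator
key0, key1 = jax.random.split(jax.random.PRNGKey(0))
z0 = jax.random.normal(key0, (latent_dim,))
z1 = jax.random.normal(key1, (latent_dim,))

path = lerp_latents(z0, z1, num_steps=8)  # shape (8, latent_dim)
```

Each row of `path` can then be fed to the Generator as one batch, giving a sequence of images that moves from the sample for `z0` to the sample for `z1`.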

# Training Progression
Unfortunately, the first few seconds of the uploaded video are frozen. The full high-resolution video is available in the model files.
<video controls src="https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/xwHwDXm6nOF1yzYJdbIkE.mp4"></video>
# Demo
The demo app for this model is at https://huggingface.co/spaces/PrakhAI/AIPlane3 (please "Restart this Space" if prompted).