---
datasets:
- https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/
---

| Generated | Real (for comparison) |
| ----- | --------- |
| ![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/7qPFE7nKs3OfDpmQlwggT.png) | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/SsXdnDusi2FhRUO_nyCgH.png) |

This GAN model is trained on the [FGVC Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/) dataset. The model uses [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) with [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf).

This work builds on https://huggingface.co/PrakhAI/AIPlane and https://huggingface.co/PrakhAI/AIPlane2.

This model was trained to generate 256x256 images of aircraft. The JAX implementation is available as [this Colab notebook](https://colab.research.google.com/github/prakharbanga/AIPlane3/blob/main/AIPlane3_ProGAN_%2B_Spectral_Norm_(256x256).ipynb).

# Convolutional Architecture

A significant improvement over https://huggingface.co/PrakhAI/AIPlane2 is the elimination of "checkerboard" artifacts. The Generator now upsamples with an image resize followed by a convolution layer, instead of a transposed convolution whose kernel size is not divisible by its stride.

| Transposed Convolution (kernel size not divisible by stride) | Resize followed by convolution |
| - | - |
| ![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/3Zd9KBXtBMP9F8fw7Rb97.png) | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/paT2dMXjDIfpzfeAtJVSM.png) |
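
# 'Good' Generated Samples

![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/rvS5_sAEN4Dvo3-5dNL2z.png)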

# ProGAN

Progressive Growing of GANs was proposed in [Progressive Growing of GANs for Improved Quality, Stability, and Variation](https://arxiv.org/pdf/1710.10196.pdf).

The idea is to start training at a low resolution and grow the resolution of the GAN over time. This improves both:

- Training Speed: At lower resolutions, the Generator and Discriminator have fewer layers and fewer parameters.
- Convergence Speed: It is much easier to learn coarse, high-level structure first and finer-grained features afterwards than to learn both at the same time.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/n0WjxHp1U59Ib3ABVXJAx.png)

# Spectral Normalization

Spectral Normalization for GANs was first suggested in [Spectral Normalization for Generative Adversarial Networks](https://arxiv.org/pdf/1802.05957.pdf).

Spectral Normalization rescales each weight matrix of the Discriminator by its largest singular value. This constrains the norm of the Discriminator's gradient with respect to its input, yielding a much smoother loss landscape for the Generator to navigate.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/ZEKuzhUPn6mYx5cvYKKGI.png)

# Latent Space Interpolation

Latent Space Interpolation can be an educational exercise that gives deeper insight into the model.

As seen below, several aspects of the generated image, such as the color of the sky, the grounded-ness of the plane, and the plane's shape and color, often vary continuously across the latent space.

![image/gif](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/kz0nzWHQ8Gcy6MFiBCDqh.gif)

# Training Progression

Unfortunately, the first few seconds of the uploaded video are frozen. The full high-resolution video is available in the model files.

<video controls src="https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/xwHwDXm6nOF1yzYJdbIkE.mp4"></video>

# Demo

The demo app for this model is at https://huggingface.co/spaces/PrakhAI/AIPlane3 (please "Restart this Space" if prompted).