---
datasets:
- https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/
---

| Generated | Real (for comparison) |
|  ----- | --------- |
|   ![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/Zfi0ew3AwXWw5e9PbnFzm.png)   |   ![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/Tz5LTQUWHnJ2RQZfzaeqi.png)   |

This GAN model is trained on the [FGVC Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/) dataset. The model uses [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) with [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf).

This work builds on https://huggingface.co/PrakhAI/AIPlane and https://huggingface.co/PrakhAI/AIPlane2.

This model was trained to generate 256x256 images of aircraft. The JAX implementation is available as a [Colab notebook](https://colab.research.google.com/github/prakharbanga/AIPlane3/blob/main/AIPlane3_ProGAN_%2B_Spectral_Norm_(256x256).ipynb).

# Convolutional Architecture

A significant improvement over https://huggingface.co/PrakhAI/AIPlane2 is the elimination of "checkerboard" artifacts. When the kernel size of a Transposed Convolution is not divisible by its stride, the kernel windows overlap unevenly, so some output pixels receive more contributions than others. The Generator avoids this by upsampling with an Image Resize followed by a Convolution layer instead.

| Transposed Convolution (kernel size not divisible by stride) | Resize followed by convolution |
| - | - |
| ![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/Vs1Dks67tteJGA2EaVMjW.png) | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/fz_Gv0UIYh_Z1GZ2TrCW1.png) |
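The resize-then-convolve upsampling described above can be sketched in JAX roughly as follows. This is a minimal illustration rather than the code from the notebook: the function name `upsample_conv`, the nearest-neighbour resize method, the 2x scale factor, and the 3x3 kernel are all assumptions made for the example.

```python
import jax
import jax.numpy as jnp


def upsample_conv(x, kernel):
    """Upsample by 2x with a resize, then apply a stride-1 convolution.

    x:      (N, H, W, C_in) feature map
    kernel: (kh, kw, C_in, C_out) convolution weights
    (Hypothetical helper; names and shapes are illustrative.)
    """
    n, h, w, c = x.shape
    # Nearest-neighbour resize to twice the spatial size.
    x = jax.image.resize(x, (n, 2 * h, 2 * w, c), method="nearest")
    # Stride-1 convolution: kernel windows overlap evenly at every
    # interior output pixel, unlike a strided transposed convolution.
    return jax.lax.conv_general_dilated(
        x,
        kernel,
        window_strides=(1, 1),
        padding="SAME",
        dimension_numbers=("NHWC", "HWIO", "NHWC"),
    )


key_x, key_k = jax.random.split(jax.random.PRNGKey(0))
x = jax.random.normal(key_x, (1, 8, 8, 4))
kernel = jax.random.normal(key_k, (3, 3, 4, 2)) * 0.1
y = upsample_conv(x, kernel)
print(y.shape)  # (1, 16, 16, 2)
```

Because the learned convolution runs at stride 1 on the already-enlarged feature map, the upsampling step itself cannot introduce a periodic pattern tied to the stride.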

# 'Good' Generated Samples

![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/rHdCxl4Y15wHsRjb3qOUl.png)

# ProGAN

Progressive Growing of GANs was proposed in [Progressive Growing of GANs for Improved Quality, Stability, and Variation](https://arxiv.org/pdf/1710.10196.pdf).

The idea is to start training at a low resolution and grow the resolution of the GAN over time. This improves both:

- Training Speed: At lower resolutions, the Generator and Discriminator have fewer layers and fewer parameters.
- Convergence Speed: It is much easier to learn coarse, high-level structure first and finer details afterwards than to learn both at the same time.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/r8HWxaSkHArS_89VCiKZi.png)
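The ProGAN paper introduces each new resolution gradually by blending the upsampled output of the previous stage with the output of the newly added layers. A minimal JAX sketch of that blend is shown below. Whether this model's notebook uses exactly this scheme is an assumption, and `fade_in`, `alpha`, and the array shapes are hypothetical names chosen for illustration.

```python
import jax
import jax.numpy as jnp


def fade_in(low_res_rgb, high_res_rgb, alpha):
    """Blend the previous stage's image with the new stage's image.

    low_res_rgb:  (N, H/2, W/2, 3) output of the already-trained stage
    high_res_rgb: (N, H, W, 3) output of the newly added layers
    alpha:        blend weight, ramped from 0 to 1 during the transition
    (Hypothetical helper; not taken from the notebook.)
    """
    n, h, w, c = high_res_rgb.shape
    # Bring the low-resolution image up to the new resolution.
    upsampled = jax.image.resize(low_res_rgb, (n, h, w, c), method="nearest")
    # alpha = 0 reproduces the old stage, alpha = 1 uses only the new layers.
    return (1.0 - alpha) * upsampled + alpha * high_res_rgb


low = jnp.zeros((1, 4, 4, 3))
high = jnp.ones((1, 8, 8, 3))
blended = fade_in(low, high, alpha=0.25)
print(blended.shape)
```

Ramping `alpha` slowly means the newly added, randomly initialised layers do not abruptly disturb what the lower-resolution layers have already learned.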

# Spectral Normalization

Spectral Normalization for GANs was first suggested in [Spectral Normalization for Generative Adversarial Networks](https://arxiv.org/pdf/1802.05957.pdf).

Spectral Normalization divides each weight matrix of the Discriminator by its largest singular value. This constrains the norm of the Discriminator's gradient with respect to its input, yielding a much smoother loss landscape for the Generator to navigate.

![image/gif](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/-tNPiZLIn1EtxeWjEiPKf.gif)
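The paper estimates the largest singular value of a weight matrix with power iteration and divides the weights by it. The sketch below shows that idea in JAX for a plain 2-D matrix. It is an illustration under assumptions, not the notebook's implementation: `spectral_normalize` and its arguments are hypothetical names, and a convolution kernel would first need to be reshaped to 2-D.

```python
import jax
import jax.numpy as jnp


def spectral_normalize(w, u, n_iters=1, eps=1e-12):
    """Divide w by a power-iteration estimate of its largest singular value.

    w: (out_dim, in_dim) weight matrix
    u: (out_dim,) running estimate of the leading left singular vector
    (Hypothetical helper; not taken from the notebook.)
    """
    for _ in range(n_iters):
        v = w.T @ u
        v = v / (jnp.linalg.norm(v) + eps)
        u = w @ v
        u = u / (jnp.linalg.norm(u) + eps)
    # Rayleigh-quotient style estimate of the largest singular value.
    sigma = u @ w @ v
    return w / sigma, u


key_w, key_u = jax.random.split(jax.random.PRNGKey(0))
w = jax.random.normal(key_w, (16, 8))
u0 = jax.random.normal(key_u, (16,))

# Many iterations here so the estimate converges in a single call.
w_sn, u = spectral_normalize(w, u0, n_iters=100)
top_singular_value = jnp.linalg.svd(w_sn, compute_uv=False)[0]
print(top_singular_value)
```

In the paper, a single power iteration per training step is used and `u` is carried over between steps, so the estimate tracks the slowly changing weights at low cost.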

# Latent Space Interpolation

Latent Space Interpolation can be an educational exercise to get deeper insight into the model.

The interpolation below shows that several aspects of the generated image, such as the color of the sky, the grounded-ness of the plane, and the plane's shape and color, often vary continuously through the latent space.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/cB_kZUI0JOWFGT4HcVGpV.png)
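An interpolation like the one above can be produced by feeding the Generator a sequence of latent vectors on the line between two random samples. The following JAX sketch builds such a sequence. The latent dimension of 64, the linear (rather than spherical) interpolation, and the `generator` call mentioned in the final comment are assumptions for illustration and are not taken from the notebook.

```python
import jax
import jax.numpy as jnp


def interpolate_latents(z_start, z_end, num_steps):
    """Return num_steps latent vectors linearly spaced from z_start to z_end."""
    t = jnp.linspace(0.0, 1.0, num_steps)[:, None]  # (num_steps, 1)
    return (1.0 - t) * z_start[None, :] + t * z_end[None, :]


key_a, key_b = jax.random.split(jax.random.PRNGKey(0))
latent_dim = 64  # hypothetical; use the Generator's actual latent size
z_a = jax.random.normal(key_a, (latent_dim,))
z_b = jax.random.normal(key_b, (latent_dim,))

latents = interpolate_latents(z_a, z_b, num_steps=8)
print(latents.shape)  # (8, 64)

# Each row would then be passed through the trained Generator, e.g.
# images = generator(latents)  # `generator` is a placeholder name
```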

# Training Progression

Unfortunately, the first few seconds of the uploaded video are frozen. The full high-resolution video is available in the model files.

<video controls src="https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/xwHwDXm6nOF1yzYJdbIkE.mp4"></video>

# Demo

The demo app for this model is at https://huggingface.co/spaces/PrakhAI/AIPlane3 (please "Restart this Space" if prompted).