| --- |
| datasets: |
| - cifar10 |
| - https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/ |
| --- |
| |
| GAN model trained on [CIFAR10 (Airplane)](https://www.tensorflow.org/datasets/catalog/cifar10) and [FGVC Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/) images. The model leverages [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) with [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf). |
|
|
| Try out this model [here](https://huggingface.co/spaces/PrakhAI/AIPlane). |
|
|
| | Generated Images | Real Images (for comparison) | |
| | -------- | --------- | |
| |  |  | |
|
|
| # Training Progression |
| <video width="50%" controls> |
| <source src="https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/qFlnTITZwS3DSTxLp0Oa8.mp4" type="video/mp4"> |
| </video> |
|
|
| # Details |
| [Colab Notebook](https://colab.research.google.com/drive/1b4KFZOnLERwQW_3jQ8FMABepKEAcDIK7?usp=sharing) |
|
|
| The model generates 32x32 images of airplanes. It was trained on an NVIDIA T4 Colab runtime. |
|
|
| The Critic consists of Convolutional Layers (3x3 kernel) with strides for downsampling, and Leaky ReLU activation. The critic uses [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf), with more details [here](#spectral-normalization). |
|
|
| The Generator uses Transposed Convolutions (2x2 kernel) with strides for upsampling, and ReLU activation. The generator uses the variant of pixel-level Local Response Normalization proposed in the [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) paper. |
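
The pixel-level normalization mentioned above can be sketched as follows. This is a minimal illustration, not the model's actual code: it rescales the channel vector at each pixel to unit root-mean-square, following the formula in the Progressive Growing paper. The function name `pixel_norm`, the nested-list layout `[row][col][channel]`, and the default `eps=1e-8` are assumptions made for this example.

```python
import math


def pixel_norm(features, eps=1e-8):
    """Rescale the channel vector at every pixel to unit root-mean-square.

    `features` is a nested list indexed as [row][col][channel]
    (a layout assumed for this sketch only).
    """
    normalized = []
    for row in features:
        out_row = []
        for channels in row:
            # Mean of squared activations across channels at this pixel.
            mean_sq = sum(c * c for c in channels) / len(channels)
            scale = 1.0 / math.sqrt(mean_sq + eps)
            out_row.append([c * scale for c in channels])
        normalized.append(out_row)
    return normalized


# A 1x2 feature map with 2 channels per pixel.
features = [[[3.0, 4.0], [0.5, 0.5]]]
out = pixel_norm(features)
```

After normalization, the mean squared activation at each pixel is approximately 1, regardless of the original magnitude, which keeps generator activations from growing without bound.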
|
|
| # Spectral Normalization |
|
|
| Spectral Normalization is a technique for stabilizing GAN training, proposed in [this paper](https://arxiv.org/pdf/1802.05957.pdf). |
|
|
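| It constrains the spectral norm (largest singular value) of each weight matrix in the Critic (Discriminator). This bounds the Critic's Lipschitz constant, so its output cannot change arbitrarily fast as the input image changes, which helps avoid exploding gradients. |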
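
As a rough sketch of the idea (not the code used to train this model), the largest singular value of a weight matrix can be estimated with power iteration, and the matrix is then divided by that estimate. The helper names below (`spectral_norm`, `spectrally_normalize`) and the plain list-of-lists matrix layout are assumptions for illustration. Convolution kernels would first be reshaped into a 2D matrix, and the paper runs a single iteration per training step while reusing the vector `u` between steps, whereas this example runs many iterations from scratch for clarity.

```python
import math
import random


def matvec(W, v):
    """Compute W @ v for a list-of-lists matrix W."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matvec_t(W, u):
    """Compute W^T @ u."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]


def l2_normalize(v, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iters):
        v = l2_normalize(matvec_t(W, u))
        u = l2_normalize(matvec(W, v))
    # sigma is approximated by u^T W v.
    return sum(ui * wi for ui, wi in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Return W divided by its estimated spectral norm."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]


W = [[3.0, 0.0], [0.0, 1.0]]
sigma = spectral_norm(W)        # close to 3 for this diagonal matrix
W_sn = spectrally_normalize(W)  # largest singular value is now close to 1
```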
|
|
| In practice, Spectral Normalization noticeably stabilizes the training of this GAN, as the example below shows (comparison at equivalent points during training): |
|
|
| | Batch Normalization | Spectral Normalization | |
| | ----------- | ------------ | |
| |  |  | |
|
|
| # Progressive Growing |
|
|
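| [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) trains a GAN at low resolution first and increases the resolution as training proceeds. It was proposed to improve the quality and stability of GAN training, especially for high-resolution models (1024x1024). |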
|
|
| For 32x32 images of Airplanes, even a short initial round of Progressive Growing provides significant improvement (comparison at equivalent points during training): |
|
|
| | Flat Growing | Progressive Growing | |
| | ----------- | ------------ | |
| |  |  | |
|
|
| The generator for this model produces 4x4, 8x8, 16x16 and 32x32 images, which form the inputs to the critic. Each resolution is associated with a 'weight' (α<sub>4</sub>, α<sub>8</sub>, α<sub>16</sub>, α<sub>32</sub>), which indicates how much the training focuses on that resolution at any given time. |
|
|
| At the beginning of training, α<sub>4</sub>=1 and α<sub>8</sub>=α<sub>16</sub>=α<sub>32</sub>=0. Towards the end, α<sub>32</sub>=1 and α<sub>4</sub>=α<sub>8</sub>=α<sub>16</sub>=0. |
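
Only the start and end values of the weights are stated above. How they change in between is not specified here, so the schedule below is purely hypothetical: it moves the focus linearly from 4x4 to 8x8 to 16x16 to 32x32 over an assumed number of `growth_steps`, then keeps all weight on 32x32. The function name `resolution_weights` and the step counts are illustrative assumptions, not values taken from the notebook.

```python
RESOLUTIONS = (4, 8, 16, 32)


def resolution_weights(step, growth_steps):
    """Return {resolution: alpha} for a given training step.

    Hypothetical schedule: the focus slides linearly 4 -> 8 -> 16 -> 32
    over `growth_steps` steps and then stays on 32x32.
    """
    progress = min(max(step / growth_steps, 0.0), 1.0)
    # Position along the resolution list, from 0 (4x4) to 3 (32x32).
    position = progress * (len(RESOLUTIONS) - 1)
    return {
        res: max(0.0, 1.0 - abs(position - index))
        for index, res in enumerate(RESOLUTIONS)
    }


start = resolution_weights(0, growth_steps=3000)      # all weight on 4x4
middle = resolution_weights(500, growth_steps=3000)   # split between 4x4 and 8x8
end = resolution_weights(3000, growth_steps=3000)     # all weight on 32x32
```

With this choice the weights always sum to 1 and at most two adjacent resolutions are active at once. Other schedules, such as step-wise or smoothed transitions, would match the stated endpoints equally well.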