https://github.com/albertnieto/homogan

Building a conditioned GAN that generates faces in TensorFlow 2.0 and Keras.

Keywords

artificial-intelligence computer-vision deep-learning generative-adversarial-network machine-learning neural-network

Repository

Basic Info

Host: GitHub
Owner: albertnieto
License: mit
Language: Python
Default Branch: master
Homepage:
Size: 922 MB

Statistics

Stars: 4
Watchers: 3
Forks: 1
Open Issues: 4
Releases: 0

Archived

Created over 6 years ago · Last pushed over 2 years ago

HomoGAN

This is the final project of the postgraduate course in Artificial Intelligence with Deep Learning at Universitat Politècnica de Catalunya (UPC). The goal is to build a conditioned GAN that generates faces given some features. To that end, different architectures and known methods were tested.

The project started on 10 February 2020; implementation took place from 27 February to 8 April 2020.

The network is implemented with TensorFlow 2.0 and Keras. Experiments were run in Google Colab.

Installation

$ git clone https://github.com/anieto95/homogan
$ cd homogan/
$ sudo pip3 install -r requirements.txt

Running experiments

Throughout the project, we saved a snapshot of the source code for each experiment. We then simplified the code so that parameters are easy to change, and built a main framework for running experiments.

To train the model, set the parameters in config.json and run main.py. Older experiments can be run as well: their source is in src/old/ExperimentXX and their documentation in docs/ExperimentXX. Their parameters cannot be changed, but each one can be reproduced by running src/old/ExperimentXX/main.py.

If the dataset is not found in the dataset folder given in the parameters, the script downloads it automatically. This requires the Kaggle user and password to be set.
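
As a rough illustration of the config-then-run workflow, the sketch below reads a JSON file and fills in missing keys from defaults. It is not the project's loader: the flat key layout, the `load_config` helper, and the choice to reject unknown keys are assumptions. Only the key names and default values are taken from the parameter tables in this README.

```python
import json
import tempfile
from pathlib import Path

# Defaults copied from the parameter tables in this README.
# The flat, single-level layout is an assumption about config.json.
DEFAULTS = {
    "model": "src.models.model15",
    "multilabelling": True,
    "features": 3,
    "epochs": 100,
}


def load_config(path):
    """Read a JSON config file and fill in any missing keys from DEFAULTS."""
    with open(path, encoding="utf-8") as fh:
        user_values = json.load(fh)
    unknown = set(user_values) - set(DEFAULTS)
    if unknown:
        # Failing early on typos is a design choice of this sketch.
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    return {**DEFAULTS, **user_values}


# Usage: write a small config to a temporary folder, then load it.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"
    config_path.write_text(json.dumps({"epochs": 25, "features": 2}))
    config = load_config(config_path)

print(config["epochs"], config["model"])
```

Merging user values over defaults keeps short config files valid, so an experiment only has to list the parameters it changes.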

Network parameters

|Parameters|Default value|Notes|
|:---|:---|:---|
|model|src.models.model15|Model to use; the available options are in src/models. The default is the model from Experiment 15, which gave the best results.|
|multilabelling|True|Set to True to enable multilabelling, False otherwise. When enabled, the number of features and the labels must be set in the CelebA parameters.|
|features|3|Number of features selected for multilabelling.|
|IMG_HEIGHT|128|Height of resized images.|
|IMG_WIDTH|128|Width of resized images.|

Dataset parameters

|Parameters|Default value|Notes|
|:---|:---|:---|
|BUFFERSIZE|3000|Buffer size of the dataset.|
|BATCHSIZE|100|Batch size of the dataset.|
|kaggleUser|None|Kaggle user, needed to download the CelebA dataset.|
|kagglePass|None|Kaggle password, needed to download the CelebA dataset.|
|datasetfolder|/content/celeba-dataset|Directory where the dataset will be saved.|
|celebafeatures|[["Male", 1], ["Eyeglasses"], ["No_Beard"], ["Bald"]]|To filter the dataset by an attribute, include a list of the form `[FILTERNAME, VALUE]`. To select a feature for multilabelling, include the name without a value: `[FEATURENAME]`.|
|numimg_training|5000|Number of images to include in the training dataset.|
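
The two kinds of entry in celebafeatures can be told apart by their length. The sketch below shows one way to do that and to apply the result to attribute rows. Both helper names and the toy rows are hypothetical, and the code is not taken from the repository. It assumes the CelebA attribute file encodes each attribute as 1 or -1, and it uses the CelebA spelling `No_Beard`.

```python
def split_celeba_features(entries):
    """Split entries into dataset filters and multilabel feature names.

    A two-item entry such as ["Male", 1] is a filter (attribute, required value).
    A one-item entry such as ["Eyeglasses"] is a feature used as a label.
    """
    filters = {}
    labels = []
    for entry in entries:
        if len(entry) == 2:
            name, value = entry
            filters[name] = value
        elif len(entry) == 1:
            labels.append(entry[0])
        else:
            raise ValueError(f"Expected [NAME] or [NAME, VALUE], got {entry!r}")
    return filters, labels


def select_images(rows, filters, labels, limit):
    """Keep rows matching every filter; return (image_id, label vector) pairs."""
    selected = []
    for row in rows:
        if len(selected) >= limit:
            break
        if all(row[name] == value for name, value in filters.items()):
            # Map the -1/1 attribute encoding to 0/1 labels.
            selected.append((row["image_id"], [int(row[name] == 1) for name in labels]))
    return selected


# Usage with three made-up attribute rows.
features = [["Male", 1], ["Eyeglasses"], ["No_Beard"], ["Bald"]]
filters, labels = split_celeba_features(features)
rows = [
    {"image_id": "a.jpg", "Male": 1, "Eyeglasses": -1, "No_Beard": 1, "Bald": -1},
    {"image_id": "b.jpg", "Male": -1, "Eyeglasses": 1, "No_Beard": 1, "Bald": -1},
    {"image_id": "c.jpg", "Male": 1, "Eyeglasses": 1, "No_Beard": -1, "Bald": 1},
]
print(select_images(rows, filters, labels, limit=5000))
```

With the default value, the single filter keeps only images where Male is 1, and the three remaining names give a label vector of length 3, which matches the default of the features parameter.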

Training parameters

|Parameters|Default value|Notes|
|:---|:---|:---|
|latentdim|256|Dimension of the latent input vector.|
|startepoch|0|If a checkpoint is loaded, the epoch at which training resumes.|
|epochs|100|Total number of epochs.|
|traing|1|Number of times the generator will be trained.|
|traind|1|Number of times the discriminator will be trained.|
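
One possible reading of traing and traind is a count of updates per batch, which would also explain the G:D ratios such as 1:3 reported in the experiments. The sketch below only prints the order of updates under that reading. The function names, the choice to update the discriminator first, and the interpretation of startepoch as a resume index are all assumptions.

```python
def update_schedule(n_batches, train_g=1, train_d=1):
    """Yield ("D" or "G", batch_index) in the order the updates would run."""
    for batch in range(n_batches):
        # Assumption: the discriminator is updated before the generator.
        for _ in range(train_d):
            yield ("D", batch)
        for _ in range(train_g):
            yield ("G", batch)


def epochs_to_run(start_epoch, epochs):
    """Epoch indices left to train when resuming from a checkpoint."""
    return range(start_epoch, epochs)


# Usage: a G:D ratio of 1:3 over two batches.
steps = list(update_schedule(n_batches=2, train_g=1, train_d=3))
print(steps)
print(len(epochs_to_run(start_epoch=40, epochs=100)))
```

In a real training loop, each yielded step would be replaced by the corresponding optimizer update.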

Dataset

CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter.

Throughout the project, images were cropped and resized to 128x128 px. For Experiment 16, images were preprocessed to remove the background.
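
The README does not say how the crop was chosen. As a minimal sketch, assuming a centered square crop before resizing, the box can be computed as follows. The helper name is hypothetical, and the 178x218 px input size is that of the aligned CelebA images.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# Usage: an aligned CelebA image is 178 px wide and 218 px high.
box = center_square_box(178, 218)
print(box)
```

The tuple follows the (left, upper, right, lower) order used by common image libraries for crop boxes, so the square can then be resized to 128x128 px.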

Glossary

* Generator (G). A generative model is a model of the conditional probability of the observable X, given a target y.
* Discriminator (D). A discriminative model is a model of the conditional probability of the target Y, given an observation x.
* Fully Connected (FC). Fully connected layers connect every neuron in one layer to every neuron in another layer.
* Fully Convolutional (FConv). The goal is to transform image pixels to pixel categories. Unlike a standard convolutional neural network, a fully convolutional network (FCN) transforms the height and width of the intermediate feature map back to the size of the input image through transposed convolution layers, so that the predictions have a one-to-one spatial correspondence with the input image.
* Dropout. At each training stage, individual nodes are either dropped out of the net with probability 1-p or kept with probability p, so that a reduced network is left; incoming and outgoing edges to a dropped-out node are also removed.
* Label Smoothing. Label smoothing is a regularization technique for classification problems that prevents the model from predicting the labels too confidently during training and generalizing poorly.
* Label Flipping. Label flipping is a training technique where one selectively manipulates the labels in order to make the model more robust against label noise.
* Batch Normalization. Batch normalization is a technique for training very deep neural networks that standardizes the inputs to a layer for each mini-batch. This stabilizes the learning process and can dramatically reduce the number of training epochs required to train deep networks.
* Spectral Normalization. Spectral normalization normalizes the spectral norm of the weight matrix W, where the spectral norm σ(W) used to regularize each layer is the largest singular value of W. In short, it replaces every weight W with W/σ(W).
* Gaussian Noise. Gaussian noise is statistical noise whose probability density function is that of the normal (Gaussian) distribution. In other words, the values that the noise can take are Gaussian-distributed.
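
Label smoothing and label flipping from the glossary above can be sketched on the binary real/fake targets a discriminator is trained on. This is an illustrative NumPy version, not the project's code: the function names, the smoothing amount of 0.1, and the flipped fraction of 0.25 are assumptions chosen for the example.

```python
import numpy as np


def smooth_labels(labels, amount=0.1):
    """Pull hard 0/1 targets toward 0.5, e.g. 1 -> 0.95 and 0 -> 0.05."""
    labels = np.asarray(labels, dtype=np.float32)
    return labels * (1.0 - amount) + 0.5 * amount


def flip_labels(labels, fraction, rng):
    """Flip a random fraction of binary labels (0 <-> 1); returns a copy."""
    flipped = np.asarray(labels, dtype=np.float32).copy()
    flat = flipped.reshape(-1)  # view on the copy, so edits land in `flipped`
    n_flip = int(round(fraction * flat.size))
    idx = rng.choice(flat.size, size=n_flip, replace=False)
    flat[idx] = 1.0 - flat[idx]
    return flipped


# Usage: targets for a batch of 8 real images.
rng = np.random.default_rng(0)
real_targets = np.ones(8, dtype=np.float32)
smoothed = smooth_labels(real_targets)
noisy = flip_labels(real_targets, fraction=0.25, rng=rng)
print(smoothed)
print(noisy)
```

Both transformations act only on the targets, so they can be applied just before the loss is computed without touching the model.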

Experiments

Experiment 1

First approach: an architecture based on DCGAN.

Results

|Hyperparameters|Observations|
|:---|:---|
|Training size = 10,000<br>Training epochs = 35<br>Batch size = 16|* Huge model: the generator has over 9M parameters vs. 400k in the discriminator.<br>* Slow training per epoch and high memory consumption.|

Results GIF: (image not available)

Experiment 2

Changes from the previous model:
* Wrap the G and D definitions in classes.
* Add TensorBoard loss tracking.

Results:

|Hyperparameters|Observations|
|:---|:---|
|Training size = 10,000<br>Training epochs = 25<br>Batch size = 16||

|Hyperparameters|Observations|
|:---|:---|
|Training size = 10,000<br>Training epochs = 100 / 220 / 100<br>Batch size = 100<br>Ratio of training G:D = 1:1 / 1:3 / 1:5|* The final images are not as good as those in the previous experiments.<br>* Spectral Normalization gives stability and prevents the white-background images.<br>* To improve results, a further experiment should combine Attention with Spectral Normalization, which is expected to give better results.|
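
The observations above credit Spectral Normalization with stabilizing training. As a reminder of what the operation does, the sketch below estimates the largest singular value of a 2-D weight matrix by power iteration and divides the weights by it. It is a standalone NumPy illustration with hypothetical function names, not the layer used in the experiments, and it runs many iterations on a fixed matrix instead of updating the estimate during training.

```python
import numpy as np


def spectral_norm(weight, n_iters=50, seed=0):
    """Estimate the largest singular value of a 2-D matrix by power iteration."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=weight.shape[0])
    for _ in range(n_iters):
        v = weight.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = weight @ v
        u /= np.linalg.norm(u) + 1e-12
    # With unit vectors u and v, u^T W v approximates sigma(W).
    return float(u @ weight @ v)


def spectrally_normalize(weight, **kwargs):
    """Return W / sigma(W), whose largest singular value is close to 1."""
    return weight / spectral_norm(weight, **kwargs)


# Usage: a small matrix whose largest singular value is 3.
W = np.array([[3.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
sigma = spectral_norm(W)
W_sn = spectrally_normalize(W)
print(sigma)
print(np.linalg.norm(W_sn, 2))
```

Convolution kernels have more than two dimensions, so they would first have to be reshaped into a matrix before this estimate applies.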

Loss Charts:

|Ratio 1:1|Ratio 1:3|Ratio 1:5|
|:---:|:---:|:---:|
|Discriminator Loss Fake Exp 12|Discriminator Loss Fake Exp 13|Discriminator Loss Fake Exp 14|
|Discriminator Loss Real Exp 12|Discriminator Loss Real Exp 13|Discriminator Loss Real Exp 14|
|Generator Loss Exp 12|Generator Loss Exp 13|Generator Loss Exp 14|