https://github.com/alixunxing/t2f

T2F: text to face generation using Deep Learning

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: arxiv.org)
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 13.0%, to scientific vocabulary)

Last synced: 10 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: alixunxing
License: mit
Language: Python
Default Branch: master
Homepage:
Size: 58.9 MB

Statistics

Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Releases: 0

Fork of akanimax/T2F

Created almost 8 years ago · Last pushed about 8 years ago

https://github.com/alixunxing/T2F/blob/master/

# T2F
Text-to-Face generation using Deep Learning. This project combines two recent architectures, StackGAN and ProGAN, to synthesize faces from textual descriptions.

The project uses the Face2Text dataset, which contains 400 images with a textual caption for each of them. The data is included in the repository under the `data/LFW/Face2Text/face2text_v0.1` directory.

Architecture: 

The textual description is encoded into a summary vector using an LSTM network. The summary vector, i.e. the Embedding (psy_t) shown in the diagram, is passed through the Conditioning Augmentation block (a single linear layer) to obtain the textual part of the GAN's input latent vector; this step uses a VAE-like reparameterization technique. The second part of the latent vector is random Gaussian noise. The resulting latent vector is fed to the generator, while the embedding is fed to the final layer of the discriminator for conditional distribution matching. Training of the GAN progresses exactly as described in the ProGAN paper, i.e. layer by layer at increasing spatial resolutions. Each new layer is introduced using the fade-in technique to avoid destroying previous learning.
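
The latent-vector construction described above can be sketched in plain Python. This is only an illustration, not the repository's code: the function names are hypothetical, the mean and log standard deviation are assumed to be the outputs of the Conditioning Augmentation linear layer (as in StackGAN), and the noise part is assumed to fill whatever remains of `latent_size` after `ca_out_size` entries.

```python
import math
import random


def conditioning_augmentation(mu, log_sigma, rng):
    """Reparameterised sample c = mu + sigma * eps, with eps ~ N(0, 1).

    `mu` and `log_sigma` stand in for the outputs of the linear layer
    (an assumption; the real layer is a trained network module).
    """
    return [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu, log_sigma)]


def build_latent(mu, log_sigma, latent_size, rng):
    """Concatenate the textual part with Gaussian noise up to latent_size."""
    text_part = conditioning_augmentation(mu, log_sigma, rng)
    noise_size = latent_size - len(text_part)
    if noise_size < 0:
        raise ValueError("textual part is larger than the latent vector")
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_size)]
    return text_part + noise


if __name__ == "__main__":
    rng = random.Random(0)
    ca_out_size, latent_size = 178, 256  # values from the sample configuration
    mu = [0.0] * ca_out_size             # placeholder encoder statistics
    log_sigma = [0.0] * ca_out_size
    z = build_latent(mu, log_sigma, latent_size, rng)
    print(len(z), latent_size - ca_out_size)
```

With the sample configuration's sizes, 178 entries would come from the text and the remaining 78 from noise, under the assumption stated above.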

## Running the code:
The code is in the `implementation/` subdirectory and is written using the PyTorch framework. To run it, please install `PyTorch version 0.4.0` before continuing.

__Code organization:__ 

`configs`: contains the configuration files for training the network. (You can use any one, or create your own) 

`data_processing`: package containing data processing and loading modules 

`networks`: package containing the network implementation 

`processed_annotations`: directory that stores the output of the `process_text_annotations.py` script 

`process_text_annotations.py`: processes the captions and stores the output in the `processed_annotations/` directory. (There is no need to run this script; the pickle file is included in the repo.) 

`train_network.py`: script for training the network 


__Sample configuration:__

    # All paths to different required data objects
    images_dir: "../data/LFW/lfw"
    processed_text_file: "processed_annotations/processed_text.pkl"
    log_dir: "training_runs/11/losses/"
    sample_dir: "training_runs/11/generated_samples/"
    save_dir: "training_runs/11/saved_models/"

    # Hyperparameters for the Model
    captions_length: 100
    img_dims:
      - 64
      - 64

    # LSTM hyperparameters
    embedding_size: 128
    hidden_size: 256
    num_layers: 3  # number of LSTM cells in the encoder network

    # Conditioning Augmentation hyperparameters
    ca_out_size: 178

    # Pro GAN hyperparameters
    depth: 5
    latent_size: 256
    learning_rate: 0.001
    beta_1: 0
    beta_2: 0
    eps: 0.00000001
    drift: 0.001
    n_critic: 1

    # Training hyperparameters:
    epochs:
      - 160
      - 80
      - 40
      - 20
      - 10
    
    # % of epochs for fading in the new layer
    fade_in_percentage:
      - 85
      - 85
      - 85
      - 85
      - 85

    batch_sizes:
      - 16
      - 16
      - 16
      - 16
      - 16

    num_workers: 3
    feedback_factor: 7  # number of logs generated per epoch
    checkpoint_factor: 2  # save the models after these many epochs
    use_matching_aware_discriminator: True  # use the matching aware discriminator
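
To make the per-stage lists in the configuration easier to read, the following sketch pairs each training stage with a resolution, its epoch count, and the number of epochs spent fading in. It is an illustration under assumptions rather than the project's code: the helper names are hypothetical, the first stage is assumed to start at 4x4 as in the ProGAN paper, each later stage is assumed to double the resolution, and the linear ramp for `alpha` is one common way to implement fade-in.

```python
def progressive_schedule(depth, epochs, fade_in_percentage, base_resolution=4):
    """Return one dict per stage: resolution, epochs, and fade-in epochs."""
    if not (len(epochs) == len(fade_in_percentage) == depth):
        raise ValueError("epochs and fade_in_percentage need one entry per stage")
    stages = []
    for stage in range(depth):
        stages.append({
            "resolution": base_resolution * 2 ** stage,  # assumed doubling per stage
            "epochs": epochs[stage],
            "fade_in_epochs": epochs[stage] * fade_in_percentage[stage] / 100,
        })
    return stages


def fade_in_alpha(step, total_steps, fade_in_percent):
    """Linear ramp from 0 to 1 over the first fade_in_percent of the stage."""
    fade_steps = max(1, int(total_steps * fade_in_percent / 100))
    return min(1.0, step / fade_steps)


def blend(old_upsampled, new, alpha):
    """Fade-in: weighted mix of the upsampled old output and the new layer's output."""
    return [(1 - alpha) * o + alpha * n for o, n in zip(old_upsampled, new)]


if __name__ == "__main__":
    # Values copied from the sample configuration.
    schedule = progressive_schedule(
        depth=5,
        epochs=[160, 80, 40, 20, 10],
        fade_in_percentage=[85, 85, 85, 85, 85],
    )
    for stage in schedule:
        print(stage)
```

Under these assumptions, a `depth` of 5 ends at 64x64, which agrees with `img_dims` in the sample configuration.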

Use the `requirements.txt` to install all the dependencies for the project. 
    
    $ workon [your virtual environment]
    $ pip install -r requirements.txt

__Sample run:__

    $ mkdir -p training_runs/11/generated_samples training_runs/11/losses training_runs/11/saved_models
    $ python train_network.py --config=configs/11.conf


## Other links:
blog: https://medium.com/@animeshsk3/t2f-text-to-face-generation-using-deep-learning-b3b6ba5a5a93 

training_time_lapse video: https://www.youtube.com/watch?v=NO_l87rPDb8 


## #TODO:
1.) Create a simple `demo.py` for running inference on the trained models 

2.) Separate `implementation/networks/C_Pro_GAN.py` into a standalone library
