https://github.com/cptanalatriste/birds-of-british-empire
Generating images of "fantastic" birds, using Generative Adversarial Networks (GANs).
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.8%) to scientific vocabulary
Keywords
generative-adversarial-network
Last synced: 9 months ago
Repository
Basic Info
- Host: GitHub
- Owner: cptanalatriste
- License: mit
- Language: Python
- Default Branch: master
- Homepage: https://thoughtworksarts.io/projects/birds-of-the-british-empire/
- Size: 41.6 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of taoxugit/AttnGAN
Topics
generative-adversarial-network
Created over 5 years ago · Last pushed over 3 years ago
https://github.com/cptanalatriste/birds-of-british-empire/blob/master/
# Birds of the British Empire
Pytorch implementation for reproducing AttnGAN results in the paper [AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks](http://openaccess.thecvf.com/content_cvpr_2018/papers/Xu_AttnGAN_Fine-Grained_Text_CVPR_2018_paper.pdf) by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He. (This work was performed when Tao was an intern with Microsoft Research).
Owner
- Name: Carlos Gavidia-Calderon
- Login: cptanalatriste
- Kind: user
- Location: London, United Kingdom
- Company: @alan-turing-institute
- Website: https://carlos.gavidia.me/
- Twitter: cptan_alatriste
- Repositories: 74
- Profile: https://github.com/cptanalatriste
Systems engineer by training, software developer by trade. Research Software Engineer at @alan-turing-institute.
### Dependencies
Python 3.5 and PyTorch are required.
In addition, please add the project folder to PYTHONPATH and `pip install` the following packages:
- `python-dateutil`
- `easydict`
- `pandas`
- `torchfile`
- `nltk`
- `scikit-image`
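Before training, it can help to confirm that these packages are importable. The following is a minimal sketch, not part of this repository: the helper name `missing_packages` is hypothetical, and the mapping from pip package names to import names (for example `scikit-image` to `skimage`) is stated here as an assumption.

```python
import importlib.util

# pip package name -> top-level module name (mapping assumed for illustration).
REQUIRED = {
    "torch": "torch",
    "python-dateutil": "dateutil",
    "easydict": "easydict",
    "pandas": "pandas",
    "torchfile": "torchfile",
    "nltk": "nltk",
    "scikit-image": "skimage",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose module cannot be found by this interpreter."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]
```

Calling `missing_packages()` returns an empty list when everything is installed; any names it returns can be passed to `pip install`.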
**Data**
1. Download our preprocessed metadata for [birds](https://drive.google.com/open?id=1O_LtUP9sch09QH3s_EBAgLEctBQ5JBSJ) and [coco](https://drive.google.com/open?id=1rSnbIGNDGZeHlsUlLdahj0RJ9oo6lgH9), and save them to `data/`
2. Download the [birds](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html) image data. Extract it to `data/birds/`
3. Download the [coco](http://cocodataset.org/#download) dataset and extract the images to `data/coco/`
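A quick way to verify the layout described in these steps is sketched below. It only checks that the three directories named above exist. The helper name `missing_data_dirs` is hypothetical, and the files expected inside each directory are not checked because they are not listed here.

```python
from pathlib import Path

# Directories named in the data steps above.
EXPECTED_DIRS = ("data", "data/birds", "data/coco")


def missing_data_dirs(project_root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in expected if not (root / rel).is_dir()]
```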
**Training**
- Pre-train DAMSM models:
  - For bird dataset: `python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0`
  - For coco dataset: `python pretrain_DAMSM.py --cfg cfg/DAMSM/coco.yml --gpu 1`
- Train AttnGAN models:
  - For bird dataset: `python main.py --cfg cfg/bird_attn2.yml --gpu 2`
  - For coco dataset: `python main.py --cfg cfg/coco_attn2.yml --gpu 3`
- `*.yml` files are example configuration files for training and evaluating our models.
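The two stages can also be driven from a script. The sketch below builds the same command lines listed above and runs the bird pipeline in order. The names `COMMANDS`, `build_command` and `run_bird_pipeline` are hypothetical, and the sketch assumes it is launched from the directory that contains `pretrain_DAMSM.py` and `main.py`.

```python
import subprocess

# Script and config path for each (stage, dataset) pair, as listed above.
COMMANDS = {
    ("pretrain", "bird"): ("pretrain_DAMSM.py", "cfg/DAMSM/bird.yml"),
    ("pretrain", "coco"): ("pretrain_DAMSM.py", "cfg/DAMSM/coco.yml"),
    ("train", "bird"): ("main.py", "cfg/bird_attn2.yml"),
    ("train", "coco"): ("main.py", "cfg/coco_attn2.yml"),
}


def build_command(stage, dataset, gpu_id):
    """Build the argument list for one training step."""
    script, cfg = COMMANDS[(stage, dataset)]
    return ["python", script, "--cfg", cfg, "--gpu", str(gpu_id)]


def run_bird_pipeline(gpu_id=0):
    """Pre-train DAMSM, then train AttnGAN, stopping at the first failure."""
    for stage in ("pretrain", "train"):
        subprocess.run(build_command(stage, "bird", gpu_id), check=True)
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` raises an exception if a stage exits with a non-zero status, so the second stage does not start after a failed first one.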
**Pretrained Model**
- [DAMSM for bird](https://drive.google.com/open?id=1GNUKjVeyWYBJ8hEU-yrfYQpDOkxEyP3V). Download and save it to `DAMSMencoders/`
- [DAMSM for coco](https://drive.google.com/open?id=1zIrXCE9F6yfbEJIbNP5-YrEe2pZcPSGJ). Download and save it to `DAMSMencoders/`
- [AttnGAN for bird](https://drive.google.com/open?id=1lqNG75suOuR_8gjoEPYNp8VyT_ufPPig). Download and save it to `models/`
- [AttnGAN for coco](https://drive.google.com/open?id=1i9Xkg9nU74RAvkcqKE-rJYhjvzKAMnCi). Download and save it to `models/`
- [AttnDCGAN for bird](https://drive.google.com/open?id=19TG0JUoXurxsmZLaJ82Yo6O0UJ6aDBpg). Download and save it to `models/`
  - This is a variant of AttnGAN that applies the proposed attention mechanisms to the DCGAN framework.
**Sampling**
- Run `python main.py --cfg cfg/eval_bird.yml --gpu 1` to generate examples from captions in files listed in
`./data/birds/example_filenames.txt`. Results are saved to `DAMSMencoders/`.
- For sampling, be sure to set `TRAIN.FLAG` and `B_VALIDATION` to `False`. To run the model on a CPU,
set the `--gpu` parameter to a negative value. The file `example_filenames.txt` should contain a list of files, where
each file has one caption per line. After execution, `AttnGAN` will generate 3 image files (with different
qualities) and 2 attention maps.
- Change the `eval_*.yml` files to generate images from other pre-trained models.
- Input your own sentences in `./data/birds/example_captions.txt` if you want to generate images from customized sentences.
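The caption file format described above (a list file that names caption files, each holding one caption per line) can be produced with a few lines of Python. This is a sketch under one stated assumption: entries in `example_filenames.txt` are written without the `.txt` extension and resolved relative to the data directory. Check `main.py` before relying on that. The helper name `write_caption_files` is hypothetical.

```python
from pathlib import Path


def write_caption_files(data_dir, captions_by_name):
    """Write one caption file per entry plus the list file that names them.

    ASSUMPTION: each entry in example_filenames.txt is a file name without
    the ".txt" extension, resolved relative to data_dir.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    for name, captions in captions_by_name.items():
        # One caption per line, as the sampling notes require.
        (data_dir / (name + ".txt")).write_text("\n".join(captions) + "\n")
    list_file = data_dir / "example_filenames.txt"
    list_file.write_text("\n".join(captions_by_name) + "\n")
    return list_file
```

For example, `write_caption_files("data/birds", {"example_captions": ["this bird is red with white and has a very short beak"]})` writes `data/birds/example_captions.txt` and lists it in `data/birds/example_filenames.txt`.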
**Validation**
- To generate images for all captions in the validation dataset, set `B_VALIDATION` to `True` in the `eval_*.yml` file, and then run `python main.py --cfg cfg/eval_bird.yml --gpu 1`
- We compute inception score for models trained on birds using [StackGAN-inception-model](https://github.com/hanzhanggit/StackGAN-inception-model).
- We compute inception score for models trained on coco using [improved-gan/inception_score](https://github.com/openai/improved-gan/tree/master/inception_score).
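For reference, the inception score is exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's class distribution for a generated image and p(y) is the average of those distributions. The sketch below computes that quantity from already-computed class probabilities in plain Python. It does not reproduce the linked repositories, which also run the Inception network and may split the samples into groups; the function name `inception_score` is illustrative.

```python
import math


def inception_score(class_probs):
    """Compute exp(mean_x KL(p(y|x) || p(y))) from per-image class probabilities.

    class_probs: list of rows, one per generated image, each summing to 1.
    """
    n = len(class_probs)
    num_classes = len(class_probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[k] for row in class_probs) / n for k in range(num_classes)]
    total_kl = 0.0
    for row in class_probs:
        # Terms with p = 0 contribute nothing to the KL divergence.
        total_kl += sum(p * math.log(p / q) for p, q in zip(row, marginal) if p > 0)
    return math.exp(total_kl / n)
```

The score is 1 when every image gets the same class distribution, and it reaches the number of classes when predictions are fully confident and evenly spread across classes.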
**Examples generated by AttnGAN [[Blog]](https://blogs.microsoft.com/ai/drawing-ai/)**
[Images not reproduced here: bird example, coco example]
### Creating an API
[Evaluation code](eval) embedded in a callable, containerized API is included in the `eval/` folder.
### Using InterfaceGAN to customize bird generation
For a given bird attribute in `attributes.txt`, we can use [InterfaceGAN](https://arxiv.org/abs/2005.09635) to obtain
a direction for latent code manipulation, which makes a generated bird more positive or negative for that attribute.
To obtain the direction as a NumPy array, `InterfaceGAN` needs a set of latent codes and their corresponding attribute
values. The following files support that process:
* `batch_generate_birds.py` generates bird images using random latent codes. The latent codes are stored in
  `noise_vectors_array.npy`, and image information, including file location, is saved in the `metadata_file.csv` file.
* `organize_image_folder.py` will organise images in the [Caltech-UCSD Birds](http://www.vision.caltech.edu/visipedia/CUB-200.html)
  dataset into train and validation folders for a specific attribute from `attributes.txt`. This is needed for training a feature
  predictor for that attribute.
* `train_feature_predictor.py` will train a transfer-learning based feature predictor, using the folder organised via
  `organize_image_folder.py` as data input. Model state will be stored in the `feature_predictor.pt` file.
* `batch_predict_feature.py` will predict the value of a feature using the model trained with
  `train_feature_predictor.py`, over images generated using the `noise_vectors_array.npy` latent codes.
  Feature values will be stored in the `predictions.npy` NumPy array.
We can later feed `noise_vectors_array.npy` and `predictions.npy` to the `train_boundary.py` module of `InterfaceGAN`
to obtain the direction for attribute manipulation.
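To give an intuition for how latent codes and attribute predictions yield a direction, here is a deliberately simplified stand-in. It is not InterfaceGAN's training procedure. It takes the normalized difference between the mean latent code of positive examples and that of negative examples. The function name, the use of plain lists instead of NumPy arrays, and the `0.5` threshold on predictions are all assumptions made for illustration.

```python
import math


def mean_difference_direction(latent_codes, predictions, threshold=0.5):
    """Unit vector pointing from the negative-class mean to the positive-class mean.

    ASSUMPTION: predictions are scores where values >= threshold mean the
    attribute is present.
    """
    positives = [z for z, p in zip(latent_codes, predictions) if p >= threshold]
    negatives = [z for z, p in zip(latent_codes, predictions) if p < threshold]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    dim = len(latent_codes[0])
    pos_mean = [sum(z[i] for z in positives) / len(positives) for i in range(dim)]
    neg_mean = [sum(z[i] for z in negatives) / len(negatives) for i in range(dim)]
    diff = [p - n for p, n in zip(pos_mean, neg_mean)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("class means coincide; no direction can be derived")
    return [d / norm for d in diff]
```

In practice the inputs would be the contents of `noise_vectors_array.npy` and `predictions.npy`, and the boundary produced by `train_boundary.py` should be preferred over this approximation.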
Once we have the boundary as a NumPy array, we can use the `AttnGAN/code/main.py` file for image generation and interpolation.
Use `attnganw/config.py` to configure the interpolation parameters.
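Conceptually, interpolation moves a latent code `z` along the boundary direction `n` to get `z + alpha * n` for a range of `alpha` values. The sketch below shows that arithmetic with plain Python lists. The parameter names `start`, `end` and `steps` are hypothetical and are not taken from `attnganw/config.py`.

```python
def interpolate_along_direction(latent_code, direction, start, end, steps):
    """Return `steps` latent codes z + alpha * direction, alpha from start to end."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    codes = []
    for i in range(steps):
        # Evenly spaced offsets, including both endpoints.
        alpha = start + (end - start) * i / (steps - 1)
        codes.append([z + alpha * d for z, d in zip(latent_code, direction)])
    return codes
```

Negative offsets push the generated bird toward the negative side of the attribute and positive offsets toward the positive side. Each returned code would then be passed to the generator to render one frame of the interpolation.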
### Citing AttnGAN
If you find AttnGAN useful in your research, please consider citing:
```
@inproceedings{Tao18attngan,
  author    = {Tao Xu and Pengchuan Zhang and Qiuyuan Huang and Han Zhang and Zhe Gan and Xiaolei Huang and Xiaodong He},
  title     = {AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks},
  year      = {2018},
  booktitle = {{CVPR}}
}
}
```
**Reference**
- [StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks](https://arxiv.org/abs/1710.10916) [[code]](https://github.com/hanzhanggit/StackGAN-v2)
- [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/abs/1511.06434) [[code]](https://github.com/carpedm20/DCGAN-tensorflow)