https://github.com/cptanalatriste/birds-of-british-empire

Generating images of "fantastic" birds, using Generative Adversarial Networks (GANs).

Keywords

generative-adversarial-network

Fork of taoxugit/AttnGAN
Created over 5 years ago · Last pushed over 3 years ago

# Birds of the British Empire

PyTorch implementation for reproducing the AttnGAN results in the paper [AttnGAN: Fine-Grained Text to Image Generation
with Attentional Generative Adversarial Networks](http://openaccess.thecvf.com/content_cvpr_2018/papers/Xu_AttnGAN_Fine-Grained_Text_CVPR_2018_paper.pdf) by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He. (This work was performed when Tao was an intern with Microsoft Research.)




### Dependencies
- Python 3.5
- PyTorch

In addition, please add the project folder to PYTHONPATH and `pip install` the following packages:
- `python-dateutil`
- `easydict`
- `pandas`
- `torchfile`
- `nltk`
- `scikit-image`



**Data**

1. Download our preprocessed metadata for [birds](https://drive.google.com/open?id=1O_LtUP9sch09QH3s_EBAgLEctBQ5JBSJ) and [coco](https://drive.google.com/open?id=1rSnbIGNDGZeHlsUlLdahj0RJ9oo6lgH9) and save them to `data/`
2. Download the [birds](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html) image data. Extract them to `data/birds/`
3. Download [coco](http://cocodataset.org/#download) dataset and extract the images to `data/coco/`



**Training**
- Pre-train DAMSM models:
  - For bird dataset: `python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0`
  - For coco dataset: `python pretrain_DAMSM.py --cfg cfg/DAMSM/coco.yml --gpu 1`
 
- Train AttnGAN models:
  - For bird dataset: `python main.py --cfg cfg/bird_attn2.yml --gpu 2`
  - For coco dataset: `python main.py --cfg cfg/coco_attn2.yml --gpu 3`

- `*.yml` files are example configuration files for training and evaluating our models.



**Pretrained Model**
- [DAMSM for bird](https://drive.google.com/open?id=1GNUKjVeyWYBJ8hEU-yrfYQpDOkxEyP3V). Download and save it to `DAMSMencoders/`
- [DAMSM for coco](https://drive.google.com/open?id=1zIrXCE9F6yfbEJIbNP5-YrEe2pZcPSGJ). Download and save it to `DAMSMencoders/`
- [AttnGAN for bird](https://drive.google.com/open?id=1lqNG75suOuR_8gjoEPYNp8VyT_ufPPig). Download and save it to `models/`
- [AttnGAN for coco](https://drive.google.com/open?id=1i9Xkg9nU74RAvkcqKE-rJYhjvzKAMnCi). Download and save it to `models/`

- [AttnDCGAN for bird](https://drive.google.com/open?id=19TG0JUoXurxsmZLaJ82Yo6O0UJ6aDBpg). Download and save it to `models/`
  - This is a variant of AttnGAN which applies the proposed attention mechanisms to the DCGAN framework.

**Sampling**
- Run `python main.py --cfg cfg/eval_bird.yml --gpu 1` to generate examples from captions in files listed in 
`./data/birds/example_filenames.txt`. Results are saved to `DAMSMencoders/`. 
- For sampling, be sure to set `TRAIN.FLAG` and `B_VALIDATION` to `False`. To run the model on a CPU,
set the `--gpu` parameter to a negative value. The file `example_filenames.txt` should contain a list of files, where
each file has one caption per line. After execution, `AttnGAN` will generate 3 image files (of different
quality) and 2 attention maps.
- Change the `eval_*.yml` files to generate images from other pre-trained models. 
- Input your own sentence in `./data/birds/example_captions.txt` if you want to generate images from customized sentences.
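
The caption files described above can also be written programmatically. This is a rough sketch under stated assumptions: the helper `write_caption_files` is hypothetical, and it assumes that each line of `example_filenames.txt` names a caption file relative to the data folder without the `.txt` extension. Check how `main.py` reads that file before relying on this layout.

```python
import tempfile
from pathlib import Path


def write_caption_files(data_dir, captions_by_name):
    """Write one caption file per entry, plus the example_filenames.txt index.

    Assumption: each index line is a caption file name without the .txt
    extension, relative to data_dir. Each caption file has one caption per line.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    for name, captions in captions_by_name.items():
        caption_path = data_dir / "{}.txt".format(name)
        caption_path.parent.mkdir(parents=True, exist_ok=True)
        caption_path.write_text("\n".join(captions) + "\n", encoding="utf-8")
    index_path = data_dir / "example_filenames.txt"
    index_path.write_text("\n".join(captions_by_name) + "\n", encoding="utf-8")
    return index_path


if __name__ == "__main__":
    # Demo in a temporary folder; point data_dir at data/birds for real use.
    with tempfile.TemporaryDirectory() as tmp_dir:
        index = write_caption_files(
            tmp_dir,
            {"example_captions": ["this bird has a red crown and a short black beak"]},
        )
        print(index.read_text(encoding="utf-8"))
```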

**Validation**
- To generate images for all captions in the validation dataset, set `B_VALIDATION` to `True` in the `eval_*.yml` file and then run `python main.py --cfg cfg/eval_bird.yml --gpu 1`.
- We compute inception score for models trained on birds using [StackGAN-inception-model](https://github.com/hanzhanggit/StackGAN-inception-model).
- We compute inception score for models trained on coco using [improved-gan/inception_score](https://github.com/openai/improved-gan/tree/master/inception_score).


**Examples generated by AttnGAN [[Blog]](https://blogs.microsoft.com/ai/drawing-ai/)**

 bird example              |  coco example
:-------------------------:|:-------------------------:
![](https://github.com/taoxugit/AttnGAN/blob/master/example_bird.png)  |  ![](https://github.com/taoxugit/AttnGAN/blob/master/example_coco.png)


### Creating an API
[Evaluation code](eval), embedded into a callable containerized API, is included in the `eval/` folder.

### Using InterfaceGAN to customize bird generation
For a given bird attribute in `attributes.txt`, we can use [InterfaceGAN](https://arxiv.org/abs/2005.09635) to obtain
a direction for latent code manipulation. Moving a latent code along that direction makes the generated bird more positive or more negative for that attribute.

To obtain the direction as a NumPy array, `InterfaceGAN` needs a set of latent codes and their corresponding attribute
values. The following files support that process:

* `batch_generate_birds.py` generates bird images using random latent codes. The latent codes are stored in
  `noise_vectors_array.npy`, and image information, including file location, is saved in the `metadata_file.csv` file.
* `organize_image_folder.py` organises the images in the [Caltech-UCSD Birds](http://www.vision.caltech.edu/visipedia/CUB-200.html) dataset
  into train and validation folders for a specific attribute from `attributes.txt`. This is needed for training a feature
  predictor for that attribute.
* `train_feature_predictor.py` trains a transfer-learning-based feature predictor, using the folder organised via
  `organize_image_folder.py` as data input. The model state is stored in the `feature_predictor.pt` file.
* `batch_predict_feature.py` predicts the value of a feature, using the model trained with
  `train_feature_predictor.py`, over the images generated from the `noise_vectors_array.npy` latent codes.
  Feature values are stored in the `predictions.npy` NumPy array.
  
We can later feed `noise_vectors_array.npy` and `predictions.npy` to the `train_boundary.py` module of `InterfaceGAN` 
to obtain the direction for attribute manipulation.
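
Before handing the two arrays to `train_boundary.py`, it may be worth checking that they line up. The sketch below is illustrative only: `prepare_boundary_inputs` is a hypothetical helper, and it assumes the boundary trainer expects latent codes of shape `(num_samples, latent_dim)` and one score per sample as a `(num_samples, 1)` column. Verify both against the `InterfaceGAN` code you are using.

```python
import numpy as np


def prepare_boundary_inputs(latent_codes, scores):
    """Validate latent codes and reshape scores to one column per sample.

    Assumption: the boundary trainer wants latent codes shaped
    (num_samples, latent_dim) and scores shaped (num_samples, 1).
    """
    latent_codes = np.asarray(latent_codes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    if latent_codes.ndim != 2:
        raise ValueError("latent codes must be 2-D, got shape {}".format(latent_codes.shape))
    scores = scores.reshape(-1, 1)
    if scores.shape[0] != latent_codes.shape[0]:
        raise ValueError(
            "got {} latent codes but {} scores".format(latent_codes.shape[0], scores.shape[0])
        )
    return latent_codes, scores


if __name__ == "__main__":
    # Random stand-ins; in practice load the arrays with
    # np.load("noise_vectors_array.npy") and np.load("predictions.npy").
    demo_codes = np.random.randn(8, 100)
    demo_scores = np.random.rand(8)
    codes, column_scores = prepare_boundary_inputs(demo_codes, demo_scores)
    print(codes.shape, column_scores.shape)
```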

Once we have the boundary as a NumPy array, we can use the `AttnGAN/code/main.py` file for image generation and interpolation.
Use the `attnganw/config.py` file to configure the interpolation parameters.
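
To make the idea of interpolation concrete, the sketch below moves a single latent code along a boundary direction in evenly spaced steps. It is a simplified illustration, not the code in `AttnGAN/code/main.py`: the function `interpolate_along_boundary` and its `start`, `end` and `steps` parameters are hypothetical, and the 100-dimensional latent code in the demo is an assumption.

```python
import numpy as np


def interpolate_along_boundary(latent_code, boundary, start=-3.0, end=3.0, steps=5):
    """Return `steps` latent codes shifted along the unit boundary direction.

    The result has shape (steps, latent_dim); row i equals
    latent_code + offset_i * unit_direction, with offsets evenly spaced
    between `start` and `end`.
    """
    latent_code = np.asarray(latent_code, dtype=np.float64).reshape(1, -1)
    boundary = np.asarray(boundary, dtype=np.float64).reshape(1, -1)
    if latent_code.shape != boundary.shape:
        raise ValueError("latent code and boundary must have the same dimension")
    norm = np.linalg.norm(boundary)
    if norm == 0:
        raise ValueError("boundary must be a non-zero vector")
    unit_direction = boundary / norm
    offsets = np.linspace(start, end, steps).reshape(-1, 1)
    return latent_code + offsets * unit_direction


if __name__ == "__main__":
    # Random stand-ins for a latent code and a learned boundary.
    demo_code = np.random.randn(100)
    demo_boundary = np.random.randn(1, 100)
    shifted = interpolate_along_boundary(demo_code, demo_boundary)
    print(shifted.shape)
```

Each row of the result can then be fed to the generator in place of the original noise vector. Negative offsets push the image away from the attribute and positive offsets push it towards the attribute.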

### Citing AttnGAN
If you find AttnGAN useful in your research, please consider citing:

```
@inproceedings{Tao18attngan,
  author    = {Tao Xu and Pengchuan Zhang and Qiuyuan Huang and Han Zhang and Zhe Gan and Xiaolei Huang and Xiaodong He},
  title     = {AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks},
  year      = {2018},
  booktitle = {{CVPR}}
}
```

**Reference**

- [StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks](https://arxiv.org/abs/1710.10916) [[code]](https://github.com/hanzhanggit/StackGAN-v2)
- [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/abs/1511.06434) [[code]](https://github.com/carpedm20/DCGAN-tensorflow)

Owner

  • Name: Carlos Gavidia-Calderon
  • Login: cptanalatriste
  • Kind: user
  • Location: London, United Kingdom
  • Company: @alan-turing-institute

Systems engineer by training, software developer by trade. Research Software Engineer at @alan-turing-institute .
