https://github.com/aim-uofa/styledrop-pytorch
This is an unofficial PyTorch implementation of StyleDrop: Text-to-Image Generation in Any Style.
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.2%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 221
- Watchers: 10
- Forks: 15
- Open Issues: 7
- Releases: 0
Metadata Files
README.md
StyleDrop
This is an unofficial PyTorch implementation of StyleDrop: Text-to-Image Generation in Any Style.
Unlike the parameters used in the paper (Round 1), we set $\lambda_A=2.0$, $\lambda_B=5.0$, `d_prj=32`, and `is_shared=False`, which we found to work better. These hyperparameters can be found in `configs/custom.py`.
We release them to facilitate community research.
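For orientation, the four overrides above can be collected in one place. This is a minimal sketch and not the contents of `configs/custom.py`: the `StyleDropOverrides` dataclass is hypothetical, the Python spellings `lambdaA` and `lambdaB` are assumptions, and only the values come from the text above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StyleDropOverrides:
    """Hypothetical container; only the values are taken from the README."""

    lambdaA: float = 2.0    # assumed spelling of lambda_A
    lambdaB: float = 5.0    # assumed spelling of lambda_B
    d_prj: int = 32
    is_shared: bool = False


overrides = StyleDropOverrides()
print(asdict(overrides))
```

A frozen dataclass is used here only so the sketch runs without extra dependencies; the repository itself may organize its config differently.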

News
- [07/06/2023] Online Gradio Demo is available here
Todo List
- [x] Release the code.
- [x] Add gradio inference demo (runs locally).
- [ ] Add iterative training (Round 2).
Data & Weights Preparation
First, download the VQGAN checkpoint from this link (from MAGE, thanks!) and save it as assets/vqgan_jax_strongaug.ckpt.
Then, download the pre-trained checkpoints from this link to assets/ckpts, either for evaluation or to continue training for more iterations.
Finally, prepare empty_feature by running the command `python extract_empty_feature.py`.
The final directory structure is as follows:
```
.
├── assets
│   ├── ckpts
│   │   ├── cc3m-285000.ckpt
│   │   │   ├── lr_scheduler.pth
│   │   │   ├── nnet_ema.pth
│   │   │   ├── nnet.pth
│   │   │   ├── optimizer.pth
│   │   │   └── step.pth
│   │   └── imagenet256-450000.ckpt
│   │       ├── lr_scheduler.pth
│   │       ├── nnet_ema.pth
│   │       ├── nnet.pth
│   │       ├── optimizer.pth
│   │       └── step.pth
│   ├── fid_stats
│   │   ├── fid_stats_cc3m_val.npz
│   │   └── fid_stats_imagenet256_guided_diffusion.npz
│   ├── pipeline.png
│   ├── contexts
│   │   └── empty_context.npy
│   └── vqgan_jax_strongaug.ckpt
```
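Before training or inference, it can help to confirm that the downloads landed where the listing above expects them. The following is a small sketch under stated assumptions: `missing_assets` is a hypothetical helper, not part of the repository, and it checks only a subset of the listed paths.

```python
from pathlib import Path

# A subset of the paths from the listing above, relative to the repository root.
EXPECTED_ASSETS = (
    "assets/vqgan_jax_strongaug.ckpt",
    "assets/ckpts/cc3m-285000.ckpt/nnet.pth",
    "assets/ckpts/imagenet256-450000.ckpt/nnet.pth",
)


def missing_assets(root, expected=EXPECTED_ASSETS):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root; prints one line per missing file.
    for rel in missing_assets("."):
        print(f"missing: {rel}")
```

Extending `EXPECTED_ASSETS` with the remaining files from the listing makes the check stricter.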
Dependencies
Same as MUSE-PyTorch.
conda install pytorch torchvision torchaudio cudatoolkit=11.3
pip install accelerate==0.12.0 absl-py ml_collections einops wandb ftfy==6.1.1 transformers==4.23.1 loguru webdataset==0.2.5 gradio
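Several of the packages above are pinned to exact versions, so a mismatched environment is a likely source of errors. The sketch below compares installed versions against those pins using only the standard library; `parse_pins` and `check_pins` are hypothetical helpers written for this illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Requirement strings copied from the pip command above.
REQUIREMENTS = (
    "accelerate==0.12.0 absl-py ml_collections einops wandb ftfy==6.1.1 "
    "transformers==4.23.1 loguru webdataset==0.2.5 gradio"
)


def parse_pins(requirements):
    """Map each package name to its pinned version, or None if unpinned."""
    pins = {}
    for spec in requirements.split():
        name, _, pinned = spec.partition("==")
        pins[name] = pinned or None
    return pins


def check_pins(pins):
    """Return (name, wanted, found) for every missing or mismatched package."""
    problems = []
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            problems.append((name, wanted, None))
            continue
        if wanted is not None and found != wanted:
            problems.append((name, wanted, found))
    return problems


if __name__ == "__main__":
    for name, wanted, found in check_pins(parse_pins(REQUIREMENTS)):
        print(f"{name}: wanted {wanted or 'any version'}, found {found}")
```

An empty result from `check_pins` means every listed package is installed and every pinned version matches.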
Train
All style data used in the paper are placed in the data directory.
1. Modify `data/one_style.json` (note that `one_style.json` and the style data must be in the same directory). The format is `file_name: [object, style]`:
```json
{"image_03_05.jpg":["A bear","in kid crayon drawing style"]}
```
2. Run the training script as follows:
```shell
#!/bin/bash
unset EVAL_CKPT
unset ADAPTER
export OUTPUT_DIR="output_dir/for/this/experiment"
accelerate launch --num_processes 8 --mixed_precision fp16 train_t2i_custom_v2.py --config=configs/custom.py
```
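The style file format from the first step can be checked before launching a multi-GPU run. This is a minimal sketch: `load_style_entries` and `build_prompt` are hypothetical helpers, and joining the object and style with a space is an assumption about how the prompt is formed, not something the README states.

```python
import json


def load_style_entries(text):
    """Parse one_style.json content into (file_name, object, style) tuples."""
    data = json.loads(text)
    entries = []
    for file_name, value in data.items():
        is_pair = (
            isinstance(value, list)
            and len(value) == 2
            and all(isinstance(item, str) for item in value)
        )
        if not is_pair:
            raise ValueError(f"{file_name}: expected [object, style], got {value!r}")
        obj, style = value
        entries.append((file_name, obj, style))
    return entries


def build_prompt(obj, style):
    """Assumption: the prompt is the object followed by the style phrase."""
    return f"{obj} {style}"


example = '{"image_03_05.jpg":["A bear","in kid crayon drawing style"]}'
for file_name, obj, style in load_style_entries(example):
    print(file_name, "->", build_prompt(obj, style))
```

To check a real file, pass the contents of `data/one_style.json` to `load_style_entries` instead of the inline example.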
Inference
The pretrained style_adapter weights can be downloaded from 🤗 Hugging Face.
```shell
#!/bin/bash
export EVAL_CKPT="assets/ckpts/cc3m-285000.ckpt"
export ADAPTER="path/to/your/style_adapter"
export OUTPUT_DIR="output/for/this/experiment"
accelerate launch --num_processes 8 --mixed_precision fp16 train_t2i_custom_v2.py --config=configs/custom.py
```
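The training and inference commands launch the same script and differ only in their environment. One plausible reading, which is an assumption and not confirmed by the README, is that the script runs inference when both checkpoint variables are set and trains when both are unset. The hypothetical helper below illustrates that reading, assuming the variables are spelled `EVAL_CKPT`, `ADAPTER`, and `OUTPUT_DIR`.

```python
import os


def resolve_mode(env=None):
    """Guess the run mode from the environment (hypothetical helper).

    Assumption: EVAL_CKPT and ADAPTER both set means inference,
    both unset means training, and OUTPUT_DIR is required either way.
    """
    env = os.environ if env is None else env
    if "OUTPUT_DIR" not in env:
        raise KeyError("OUTPUT_DIR must be set in both modes")
    has_ckpt = bool(env.get("EVAL_CKPT"))
    has_adapter = bool(env.get("ADAPTER"))
    if has_ckpt and has_adapter:
        return "inference"
    if not has_ckpt and not has_adapter:
        return "train"
    raise ValueError("set both EVAL_CKPT and ADAPTER, or neither")
```

Calling `resolve_mode()` with no argument inspects the current process environment, which makes it usable as a quick check before `accelerate launch`.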
Gradio Demo
Put the style_adapter weights in the ./style_adapter folder, then run the following command to launch the demo:
```shell
python gradio_demo.py
```
The demo is also hosted on HuggingFace.
Citation
```bibtex
@article{sohn2023styledrop,
  title={StyleDrop: Text-to-Image Generation in Any Style},
  author={Sohn, Kihyuk and Ruiz, Nataniel and Lee, Kimin and Chin, Daniel Castro and Blok, Irina and Chang, Huiwen and Barber, Jarred and Jiang, Lu and Entis, Glenn and Li, Yuanzhen and others},
  journal={arXiv preprint arXiv:2306.00983},
  year={2023}
}
```
Acknowledgment
- The implementation is based on MUSE-PyTorch
- Many thanks for the generous help from Zanlin Ni
Owner
- Name: Advanced Intelligent Machines (AIM)
- Login: aim-uofa
- Kind: organization
- Location: China
- Repositories: 23
- Profile: https://github.com/aim-uofa
A research team at Zhejiang University, focusing on Computer Vision and broad AI research ...
GitHub Events
Total
- Issues event: 1
- Watch event: 22
- Fork event: 3
Last Year
- Issues event: 1
- Watch event: 22
- Fork event: 3
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 10
- Total pull requests: 0
- Average time to close issues: 18 days
- Average time to close pull requests: N/A
- Total issue authors: 10
- Total pull request authors: 0
- Average comments per issue: 0.6
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- carldaws (1)
- maxin-cn (1)
- fypbiqi (1)
- vicc (1)
- yatoubusha (1)
- yxxshin (1)
- AmanKishore (1)
- Laidawang (1)
- EnigmaHong (1)