volumegan

CVPR 2022 VolumeGAN - 3D-aware Image Synthesis via Learning Structural and Textural Representations

https://github.com/genforce/volumegan

Repository

Basic Info

Host: GitHub
Owner: genforce
License: other
Language: Python
Default Branch: main
Homepage:
Size: 5.19 MB

Statistics

Stars: 128
Watchers: 14
Forks: 14
Open Issues: 5
Releases: 0

Created over 4 years ago · Last pushed over 3 years ago

Metadata Files

VolumeGAN - 3D-aware Image Synthesis via Learning Structural and Textural Representations

Figure: Framework of VolumeGAN.

3D-aware Image Synthesis via Learning Structural and Textural Representations
Yinghao Xu, Sida Peng, Ceyuan Yang, Yujun Shen, Bolei Zhou
Computer Vision and Pattern Recognition (CVPR), 2022

[Paper] [Project Page] [Demo]

This paper aims at high-fidelity 3D-aware image synthesis. We propose a novel framework, termed VolumeGAN, for synthesizing images under different camera views by explicitly learning a structural representation and a textural representation. We first learn a feature volume to represent the underlying structure, which is then converted to a feature field using a NeRF-like model. The feature field is further accumulated into a 2D feature map as the textural representation, followed by a neural renderer for appearance synthesis. This design enables independent control of shape and appearance. Extensive experiments on a wide range of datasets show that our approach achieves higher image quality and better 3D control than previous methods.
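The accumulation step in the abstract (feature field to 2D feature map) can be illustrated for a single camera ray. The sketch below is a minimal pure-Python version that assumes the standard NeRF compositing rule; the abstract only says "NeRF-like", so the exact formula, the function name `accumulate_along_ray`, and the toy numbers are assumptions and are not taken from the VolumeGAN code.

```python
import math


def accumulate_along_ray(densities, features, deltas):
    """Composite per-sample features into one feature vector for a ray.

    Assumed rule (standard NeRF quadrature, not the repository's code):
        alpha_i = 1 - exp(-sigma_i * delta_i)
        T_i     = prod_{j<i} (1 - alpha_j)
        w_i     = T_i * alpha_i
    """
    dim = len(features[0])
    accumulated = [0.0] * dim
    weights = []
    transmittance = 1.0  # fraction of the ray not yet absorbed
    for sigma, feat, delta in zip(densities, features, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        weights.append(weight)
        for k in range(dim):
            accumulated[k] += weight * feat[k]
        transmittance *= 1.0 - alpha
    return accumulated, weights


# Toy ray: three samples with 2-channel features (made-up values).
densities = [0.0, 50.0, 1.0]
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
deltas = [0.1, 0.1, 0.1]
pixel_feature, weights = accumulate_along_ray(densities, features, deltas)
print(pixel_feature, weights)
```

In this toy ray the second sample is nearly opaque, so it dominates the accumulated feature. Repeating this for every pixel's ray would give the 2D feature map that the neural renderer then turns into an image.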

Usage

Setup

This repository is based on Hammer, where you can find detailed instructions on environment setup.

Test Demo

```shell
python render.py \
    --work_dir ${WORK_DIR} \
    --checkpoint ${MODEL_PATH} \
    --num ${NUM} \
    --seed ${SEED} \
    --render_mode ${RENDER_MODE} \
    --generate_html ${SAVE_HTML} \
    volumegan-ffhq
```

where

- WORK_DIR refers to the path where the results are saved.
- MODEL_PATH refers to the path to the pretrained model. A pretrained model is provided for:
  - FFHQ-256
- NUM refers to the number of samples to synthesize.
- SEED refers to the random seed used for sampling.
- RENDER_MODE refers to the type of rendered result, including video and shape.
- SAVE_HTML controls whether to save the images as an HTML page for easier visualization when rendering videos.
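When rendering is driven from a script, the command above can be assembled as an argument list rather than a hand-written shell string. The sketch below uses only the flags listed above. The helper name `build_render_command`, the paths, and the value format for `--generate_html` are placeholders chosen for illustration, not values confirmed by the repository.

```python
import shlex


def build_render_command(work_dir, checkpoint, num, seed, render_mode, generate_html):
    """Return the render.py invocation as an argv list (hypothetical helper)."""
    return [
        "python", "render.py",
        "--work_dir", str(work_dir),
        "--checkpoint", str(checkpoint),
        "--num", str(num),
        "--seed", str(seed),
        "--render_mode", str(render_mode),
        "--generate_html", str(generate_html),
        "volumegan-ffhq",
    ]


cmd = build_render_command(
    work_dir="work_dirs/render_demo",      # placeholder output path
    checkpoint="checkpoints/ffhq256.pth",  # placeholder checkpoint path
    num=4,
    seed=0,
    render_mode="video",
    generate_html="true",                  # assumed value format
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting problems with paths that contain spaces.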

Training

For example, the following command trains VolumeGAN on FFHQ at a resolution of 256x256:

```shell
./scripts/training_demos/volumegan_ffhq256.sh \
    ${NUM_GPUS} \
    ${DATA_PATH} \
    [OPTIONS]
```

where

- NUM_GPUS refers to the number of GPUs used for training.
- DATA_PATH refers to the path to the dataset (zip format is strongly recommended).
- [OPTIONS] refers to any additional options to pass. Detailed instructions on the available options can be found via `python train.py volumegan-ffhq --help`.

NOTE: This demo script uses volumegan_ffhq256 as the default job_name, which identifies the experiment. Concretely, a directory named job_name is created under the root working directory, which defaults to work_dirs/. To prevent overwriting previous experiments, training is interrupted with an exception if the job_name directory already exists. Use the --job_name=${JOB_NAME} option to specify a new job name.
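Because training stops when the job_name directory already exists, it can help to pick an unused name before launching. The sketch below is one possible approach, not part of the repository: the helper `next_free_job_name` and its numeric suffix scheme are hypothetical, and a temporary directory stands in for work_dirs/.

```python
import os
import tempfile


def next_free_job_name(base_name, root="work_dirs"):
    """Return base_name if root/base_name is unused, else base_name_1, base_name_2, ..."""
    candidate = base_name
    suffix = 0
    while os.path.exists(os.path.join(root, candidate)):
        suffix += 1
        candidate = f"{base_name}_{suffix}"
    return candidate


# Demo: a temporary directory plays the role of work_dirs/.
with tempfile.TemporaryDirectory() as root:
    first = next_free_job_name("volumegan_ffhq256", root)
    os.makedirs(os.path.join(root, first))  # simulate an earlier experiment
    second = next_free_job_name("volumegan_ffhq256", root)

print(first, second)
```

The returned name can then be supplied through the --job_name=${JOB_NAME} option described above.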

Evaluation

The following command evaluates a trained model:

```shell
./scripts/test_metrics.sh \
    ${NUM_GPUS} \
    ${DATA_PATH} \
    ${MODEL_PATH} \
    fid \
    --G_kwargs '{"ps_kwargs":'{"perturb_mode":"none"}'}' \
    [OPTIONS]
```
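Nested JSON is easy to misquote on a command line. As a hedged aid, the sketch below generates a shell-safe value for the `--G_kwargs` option from a Python dictionary and checks that it survives shell tokenization. It assumes the option is parsed as JSON, which is not confirmed here. The dictionary contents mirror the command above.

```python
import json
import shlex

# Generator keyword arguments taken from the evaluation command.
g_kwargs = {"ps_kwargs": {"perturb_mode": "none"}}

# Compact JSON text, then quote it so a POSIX shell passes it through unchanged.
json_text = json.dumps(g_kwargs, separators=(",", ":"))
shell_arg = shlex.quote(json_text)
print("--G_kwargs", shell_arg)

# Sanity check: tokenize the way a POSIX shell would and parse the result.
round_trip = shlex.split("--G_kwargs " + shell_arg)
parsed = json.loads(round_trip[1])
```

If `parsed` equals the original dictionary, the quoted string reaches the program as intact JSON.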

BibTeX

```bibtex
@inproceedings{xu2021volumegan,
  title     = {3D-aware Image Synthesis via Learning Structural and Textural Representations},
  author    = {Xu, Yinghao and Peng, Sida and Yang, Ceyuan and Shen, Yujun and Zhou, Bolei},
  booktitle = {CVPR},
  year      = {2022}
}
```

Owner

Name: GenForce: May Generative Force Be with You
Login: genforce
Kind: organization
Location: The Chinese University of Hong Kong

Website: https://genforce.github.io/
Repositories: 18
Profile: https://github.com/genforce

Research on Generative Modeling in Zhou Group

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Shen"
  given-names: "Yujun"
- family-names: "Zhang"
  given-names: "Zhiyi"
- family-names: "Yang"
  given-names: "Dingdong"
- family-names: "Xu"
  given-names: "Yinghao"
- family-names: "Yang"
  given-names: "Ceyuan"
- family-names: "Zhu"
  given-names: "Jiapeng"
title: "Hammer: An Efficient Toolkit for Training Deep Models"
version: 1.0.0
date-released: 2022-02-08
url: "https://github.com/bytedance/Hammer"

GitHub Events

Total

Watch event: 3

Last Year

Watch event: 3

Dependencies

requirements/convert.txt pypi

bs4 *
easydict *
ninja ==1.10.2
opencv-python-headless ==4.5.5.62
pillow ==9.0.0
requests *
rich *
scikit-video ==1.1.11
tensorflow-gpu ==1.15
torch ==1.8.1
tqdm *

requirements/develop.txt pypi

bpytop * development
gpustat * development
pylint * development

requirements/minimal.txt pypi

bs4 *
click *
cloup *
easydict *
einops *
lmdb *
matplotlib *
mrcfile *
ninja ==1.10.2
numpy ==1.21.5
opencv-python-headless ==4.5.5.62
pillow ==9.0.0
psutil *
pymcubes *
requests *
rich *
scikit-learn ==1.0.2
scikit-video ==1.1.11
scipy ==1.7.3
tensorboard ==2.7.0
torch-tb-profiler ==0.3.1
tqdm *
trimesh *
