https://github.com/aaltoml/view-aware-inference

Gaussian Process Priors for View-Aware Inference

Statistics
  • Stars: 2
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 6 years ago · Last pushed about 5 years ago

https://github.com/AaltoML/view-aware-inference/blob/master/

# Gaussian Process Priors for View-Aware Inference

Code for the paper:
* Hou, Y., Heljakka, A., Kannala, J., and Solin, A. (2020). **Gaussian Process Priors for View-Aware Inference**. In *Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), to appear*. [[arXiv preprint]](https://arxiv.org/abs/1912.03249)

Implementation by **Ari Heljakka** and **Yuxin Hou**. The StyleGAN, perceptual similarity metric, and StyleGAN projection code is adapted from [1-3]; the GPPVAE code is adapted from [4].

## Data

For view synthesis with a GP prior VAE, please download the data from [Google Drive](https://drive.google.com/file/d/19WPAU8VcPJ5UgLz_qdi_n97odMx-L21s/view?usp=sharing). The dataset contains an HDF5 file (`data_chairs.h5`) and two NumPy arrays that store the camera poses: 60 viewpoints for training (`pose60.npy`) and 60 novel viewpoints for evaluation (`seq_pose.npy`).
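
Before training, it can help to check what the downloaded files contain. The following is a minimal sketch, assuming the files sit in the working directory; the helper names are hypothetical and not part of this repository, and no dataset names or array shapes are assumed.

```python
import numpy as np


def describe_pose_file(path):
    """Return (shape, dtype) of a pose array stored in .npy format."""
    poses = np.load(path)
    return poses.shape, poses.dtype


def list_h5_datasets(path):
    """Return {dataset name: shape} for every dataset in an HDF5 file."""
    import h5py  # imported lazily so the pose helper works without h5py

    found = {}
    with h5py.File(path, "r") as f:
        def collect(name, obj):
            # visititems calls this for every group and dataset in the file
            if isinstance(obj, h5py.Dataset):
                found[name] = obj.shape

        f.visititems(collect)
    return found
```

For example, `describe_pose_file("pose60.npy")` and `list_h5_datasets("data_chairs.h5")` report the array shapes without loading the image data into memory.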

For face reconstruction, please download the dataset from [Google Drive](https://drive.google.com/drive/folders/1lc2YVHq_ZYHsbHRorf9K_KBSTT-fG_Kg).
The dataset contains the four full camera runs, created by only keeping every 5th frame.
These PNGs have already been aligned and cropped to 512x512 so as to be suitable for StyleGAN projection.
The StyleGAN latent codes are already included, so you can skip the non-deterministic projection step. If you want to project your own images, run `python project_z.py --img_input_path [your data]`.

## System Requirements

A GPU with 11 GB of memory is required for the StyleGAN generation (and optional projection) steps.

For all dependencies of face reconstruction experiments, please run 
```
conda install --file requirements.txt
```

For the view synthesis experiments with a GP prior VAE, simply install `h5py`.

## Training the VAE models
First, train the basic VAE model:
```
python ./gppvae/train_vae.py --data data_chairs.h5 --outdir ./out/vae
```
Then, train the GPPVAE:
```
python ./gppvae/train_gppvae.py --data data_chairs.h5 --pose pose60.npy --outdir ./out/gppvae --vae_cfg ./out/vae/vae.cfg.p --vae_weights ./out/vae/weights/weights.00990.pt
```

## Reproducing the Paper Results for View Synthesis with a GP Prior VAE

First, please download our pre-trained model from [Google Drive](https://drive.google.com/file/d/1DVg0CT1WlhipxflPGjwj_8jy0r6MxPCG/view?usp=sharing).

Then, to test the performance of the original task, run
```
python ./gppvae/test_vae.py --data data_chairs.h5 --pose pose60.npy --vae_cfg out/vae/vae.cfg.p \
--vae_weights ./out/gppvae/weights/vae_weights.00110.pt \
--gp_weights ./out/gppvae/weights/gp_weights.00110.pt \
--vm_weights ./out/gppvae/weights/vm_weights.00110.pt 
```

To predict novel views that are not present in the training set, see the notebook `./gppvae/novel_view_prediction.ipynb`.

## Reproducing the Paper Results for Face Reconstruction

For the face reconstruction results, please run 
```
./evaluate.sh [data path]
```

This command runs the following steps:
1. Load `frames.csv`, convert it into distance matrices for GP operations, and execute all interpolation modes, including the baselines, to produce the corrected latent Z matrices usable for StyleGAN.
2. Decode each Z array with StyleGAN to X image frames.
3. For the quantitative image metrics, crop every image and run LPIPS on the X frames, both between methods and between the consecutive frames produced by each method.
4. Compute the statistics into lpips_... files.
5. Gather and print out the statistics.
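
The idea behind step 1 can be illustrated with a minimal NumPy sketch. It assumes camera orientations are given as 3x3 rotation matrices, measures view similarity with the Frobenius distance between them, and uses a squared-exponential kernel on that distance. The function names, hyperparameters, and toy data are hypothetical; this is not the code in `eval.py`, and the kernels used there may differ.

```python
import numpy as np


def rot_y(angle):
    """Rotation matrix for a rotation of `angle` radians about the y-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def rotation_distance_matrix(Ra, Rb):
    """Pairwise Frobenius distances ||Ra[i] - Rb[j]||_F, shape (n, m)."""
    diff = Ra[:, None, :, :] - Rb[None, :, :, :]
    return np.sqrt((diff ** 2).sum(axis=(-2, -1)))


def view_kernel(D, magnitude=1.0, lengthscale=1.0):
    """Squared-exponential kernel evaluated on a distance matrix."""
    return magnitude * np.exp(-0.5 * (D / lengthscale) ** 2)


def gp_posterior_mean(R_obs, Z_obs, R_new, noise=1e-2, **kernel_args):
    """GP regression from camera rotations to latent codes (posterior mean)."""
    K = view_kernel(rotation_distance_matrix(R_obs, R_obs), **kernel_args)
    K_star = view_kernel(rotation_distance_matrix(R_new, R_obs), **kernel_args)
    mean = Z_obs.mean(axis=0, keepdims=True)  # constant prior mean
    alpha = np.linalg.solve(K + noise * np.eye(len(R_obs)), Z_obs - mean)
    return mean + K_star @ alpha


# Toy example: latent codes are known for the first and last frame only,
# and are interpolated for every frame along the camera path.
angles = np.linspace(-0.5, 0.5, 5)
R_all = np.stack([rot_y(a) for a in angles])
rng = np.random.default_rng(0)
Z_ends = rng.normal(size=(2, 8))  # stand-in for StyleGAN latent codes
Z_interp = gp_posterior_mean(R_all[[0, -1]], Z_ends, R_all)
print(Z_interp.shape)  # (5, 8)
```

Because the kernel depends only on the camera pose, frames taken from similar viewpoints receive similar latent codes regardless of their position in the sequence.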

You can run step 1 manually, e.g. for face ID 2 with (first, last) interpolation, with:
```
python eval.py --data_path ./data --face_id=2
```

To use the baseline separable (Euler) kernel or the quaternion kernel, run with `--kernel_mode=euler` or `--kernel_mode=quat`, respectively.
To smooth using all frames, run with `--full_smoothing`.
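
To make the difference between the two baseline modes concrete, here is a rough sketch of what a separable Euler-angle kernel and a quaternion distance could look like. It assumes a product of standard periodic kernels over the three Euler angles and the usual rotation-angle distance between unit quaternions. Both functions are illustrative assumptions and are not taken from `eval.py`.

```python
import numpy as np


def euler_separable_kernel(angles_a, angles_b, lengthscale=1.0):
    """Product over the three Euler angles of standard periodic kernels.

    angles_a: (n, 3) and angles_b: (m, 3), in radians. Returns (n, m).
    """
    delta = angles_a[:, None, :] - angles_b[None, :, :]
    per_axis = np.exp(-2.0 * np.sin(delta / 2.0) ** 2 / lengthscale ** 2)
    return per_axis.prod(axis=-1)


def quaternion_distance(q_a, q_b):
    """Rotation angle between unit quaternions: 2 * arccos(|<q_a, q_b>|).

    q_a: (n, 4) and q_b: (m, 4). The absolute value makes q and -q,
    which encode the same rotation, have distance zero.
    """
    dots = np.abs(q_a @ q_b.T)
    return 2.0 * np.arccos(np.clip(dots, 0.0, 1.0))
```

The separable kernel treats each angle independently, whereas the quaternion distance depends only on the relative rotation between the two views.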

For other steps, please see `evaluate.sh` for example commands.

# References

[1] Karras, T., Laine, S., and Aila, T. (2019). **A Style-Based Generator Architecture for Generative Adversarial Networks**. In: *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)*. https://github.com/tkarras/progressive_growing_of_gans

[2] Zhang, R., Isola, P., Efros, A. A., Shechtman, E., and Wang, O. (2018). **The Unreasonable Effectiveness of Deep Features as a Perceptual Metric**. In: *CVPR*. https://github.com/richzhang/PerceptualSimilarity

[3] https://github.com/Puzer/stylegan-encoder

[4] Casale, F. P., Dalca, A. V., Saglietti, L., Listgarten, J., and Fusi, N. (2018). **Gaussian Process Prior Variational Autoencoders**. In: *NeurIPS*. https://github.com/fpcasale/GPPVAE

[5] https://shapenet.org/

Owner

  • Name: AaltoML
  • Login: AaltoML
  • Kind: organization
  • Location: Finland

Machine learning group at Aalto University led by Prof. Solin

Dependencies

requirements.txt pypi
  • absl-py =0.11.0=pyhd8ed1ab_0
  • astor =0.8.1=pyh9f0ad1d_0
  • backcall =0.2.0=pypi_0
  • blas =1.0=mkl
  • brotlipy =0.7.0=py37h27cfd23_1003
  • bzip2 =1.0.8=h7f98852_4
  • c-ares =1.17.1=h36c2ea0_0
  • ca-certificates =2021.1.19=h06a4308_0
  • cached-property =1.5.2=pypi_0
  • cairo =1.16.0=h7979940_1007
  • certifi =2020.12.5=py37h06a4308_0
  • cffi =1.14.4=py37h261ae71_0
  • chardet =4.0.0=py37h06a4308_1003
  • cloudpickle =1.6.0=py_0
  • cryptography =3.3.1=py37h3c74f83_0
  • cudatoolkit =10.0.130=hf841e97_6
  • cudnn =7.6.5.32=ha8d7eb6_1
  • cupti =10.0.130=0
  • cycler =0.10.0=py37_0
  • cytoolz =0.11.0=py37h7b6447c_0
  • dask-core =2021.1.0=pyhd3eb1b0_0
  • dbus =1.13.18=hb2f20db_0
  • decorator =4.4.2=py_0
  • expat =2.2.10=he6710b0_2
  • ffmpeg =4.3.1=hca11adc_2
  • fontconfig =2.13.1=hba837de_1004
  • freetype =2.10.4=h5ab3b9f_0
  • future =0.18.2=py37h89c1867_3
  • gast =0.2.2=py_0
  • gettext =0.19.8.1=h0b5b191_1005
  • glib =2.66.4=hcd2ae1e_1
  • gmp =6.2.1=h58526e2_0
  • gnutls =3.6.13=h85f3911_1
  • google-pasta =0.2.0=pyh8c360ce_0
  • graphite2 =1.3.13=h58526e2_1001
  • grpcio =1.34.0=pypi_0
  • gst-plugins-base =1.14.5=h0935bb2_2
  • gstreamer =1.18.3=h3560a44_0
  • h5py =3.1.0=nompi_py37h1e651dc_100
  • harfbuzz =2.7.4=h5cf4720_0
  • hdf5 =1.10.6=nompi_h3c11f04_101
  • icu =68.1=h58526e2_0
  • idna =2.10=py_0
  • imageio =2.9.0=py_0
  • importlib-metadata =3.3.0=pypi_0
  • intel-openmp =2020.2=254
  • ipdb =0.13.4=pypi_0
  • ipython =7.19.0=pypi_0
  • ipython-genutils =0.2.0=pypi_0
  • jasper =1.900.1=h07fcdf6_1006
  • jedi =0.18.0=pypi_0
  • jpeg =9d=h36c2ea0_0
  • keras-applications =1.0.8=py_1
  • keras-preprocessing =1.1.2=pyhd8ed1ab_0
  • kiwisolver =1.3.0=py37h2531618_0
  • krb5 =1.17.2=h926e7f8_0
  • lame =3.100=h7f98852_1001
  • lcms2 =2.11=h396b838_0
  • ld_impl_linux-64 =2.33.1=h53a641e_7
  • libblas =3.8.0=21_mkl
  • libcblas =3.8.0=21_mkl
  • libclang =11.0.1=default_ha53f305_1
  • libedit =3.1.20191231=h14c3975_1
  • libevent =2.1.10=hcdb4288_3
  • libffi =3.3=he6710b0_2
  • libgcc-ng =9.3.0=h2828fa1_18
  • libgfortran-ng =7.3.0=hdf63c60_0
  • libglib =2.66.4=h164308a_1
  • libgomp =9.3.0=h2828fa1_18
  • libiconv =1.16=h516909a_0
  • liblapack =3.8.0=21_mkl
  • liblapacke =3.8.0=21_mkl
  • libllvm11 =11.0.1=hf817b99_0
  • libopencv =4.5.0=py37h90094e2_7
  • libpng =1.6.37=hbc83047_0
  • libpq =12.3=h255efa7_3
  • libprotobuf =3.14.0=h780b84a_0
  • libstdcxx-ng =9.3.0=h6de172a_18
  • libtiff =4.1.0=h2733197_1
  • libuuid =2.32.1=h7f98852_1000
  • libuv =1.40.0=h7b6447c_0
  • libwebp-base =1.1.0=h36c2ea0_3
  • libxcb =1.14=h7b6447c_0
  • libxkbcommon =1.0.3=he3ba5ed_0
  • libxml2 =2.9.10=h72842e0_3
  • llvm-openmp =11.0.1=h4bd325d_0
  • lz4-c =1.9.2=heb0550a_3
  • markdown =3.3.3=pyh9f0ad1d_0
  • matplotlib =3.3.2=h06a4308_0
  • matplotlib-base =3.3.2=py37h817c723_0
  • mkl =2020.4=h726a3e6_304
  • mkl-service =2.3.0=py37he8ac12f_0
  • mkl_fft =1.2.0=py37h23d657b_0
  • mkl_random =1.1.1=py37h0573a6f_0
  • mysql-common =8.0.22=ha770c72_1
  • mysql-libs =8.0.22=h1fd7589_1
  • ncurses =6.2=he6710b0_1
  • nettle =3.6=he412f7d_0
  • networkx =2.5=py_0
  • ninja =1.10.2=py37hff7bd54_0
  • nspr =4.29=h9c3ff4c_1
  • nss =3.60=hb5efdd6_0
  • numpy =1.19.2=py37h54aff64_0
  • numpy-base =1.19.2=py37hfa32c7d_0
  • olefile =0.46=py_0
  • opencv =4.5.0=py37h89c1867_7
  • openh264 =2.1.1=h780b84a_0
  • openssl =1.1.1i=h27cfd23_0
  • opt-einsum =3.3.0=pypi_0
  • opt_einsum =3.3.0=py_0
  • pandas =1.2.1=py37ha9443f7_0
  • parso =0.8.1=pypi_0
  • pcre =8.44=he6710b0_0
  • pexpect =4.8.0=pypi_0
  • pickleshare =0.7.5=pypi_0
  • pillow =8.0.1=py37he98fc37_0
  • pip =20.3.3=py37h06a4308_0
  • pixman =0.40.0=h36c2ea0_0
  • prompt-toolkit =3.0.10=pypi_0
  • protobuf =3.14.0=pypi_0
  • ptyprocess =0.7.0=pypi_0
  • py-opencv =4.5.0=py37h888b3d9_7
  • pycparser =2.20=py_2
  • pygments =2.7.3=pypi_0
  • pyopenssl =20.0.1=pyhd3eb1b0_1
  • pyparsing =2.4.7=py_0
  • pyqt =5.12.3=py37h89c1867_7
  • pyqt-impl =5.12.3=py37he336c9b_7
  • pyqt5-sip =4.19.18=py37hcd2ae1e_7
  • pyqtchart =5.12=py37he336c9b_7
  • pyqtwebengine =5.12.1=py37he336c9b_7
  • pysocks =1.7.1=py37_1
  • python =3.7.9=h7579374_0
  • python-dateutil =2.8.1=py_0
  • python_abi =3.7=1_cp37m
  • pytorch =1.7.1=cpu_py37hf1c21f6_1
  • pytz =2020.5=pyhd3eb1b0_0
  • pywavelets =1.1.1=py37h7b6447c_2
  • pyyaml =5.4.1=py37h27cfd23_1
  • qt =5.12.9=h9d6b050_2
  • readline =8.0=h7b6447c_0
  • requests =2.25.1=pyhd3eb1b0_0
  • scikit-image =0.17.2=py37hdf5156a_0
  • scipy =1.5.2=py37h0b6359f_0
  • setuptools =51.0.0=py37h06a4308_2
  • sip =4.19.8=py37hf484d3e_0
  • six =1.15.0=py37h06a4308_0
  • sqlite =3.34.0=h74cdb3f_0
  • tensorboard =1.15.0=py37_0
  • tensorflow =1.15.0=gpu_py37h0f0df58_0
  • tensorflow-base =1.15.0=gpu_py37h9dcbed7_0
  • tensorflow-estimator =1.15.1=pyh2649769_0
  • tensorflow-gpu =1.15.0=h0d30ee6_0
  • termcolor =1.1.0=pypi_0
  • tifffile =2020.10.1=py37hdd07704_2
  • tk =8.6.10=hbc83047_0
  • toolz =0.11.1=pyhd3eb1b0_0
  • torchvision =0.2.1=py37_0
  • tornado =6.1=py37h27cfd23_0
  • traitlets =5.0.5=pypi_0
  • typing_extensions =3.7.4.3=py_0
  • urllib3 =1.26.2=pyhd3eb1b0_0
  • wcwidth =0.2.5=pypi_0
  • werkzeug =1.0.1=pypi_0
  • wheel =0.36.2=pyhd3eb1b0_0
  • wrapt =1.12.1=py37h5e8e339_3
  • x264 =1
  • xorg-kbproto =1.0.7=h7f98852_1002
  • xorg-libice =1.0.10=h516909a_0
  • xorg-libsm =1.2.3=h84519dc_1000
  • xorg-libx11 =1.6.12=h516909a_0
  • xorg-libxext =1.3.4=h516909a_0
  • xorg-libxrender =0.9.10=h516909a_1002
  • xorg-renderproto =0.11.1=h14c3975_1002
  • xorg-xextproto =7.3.0=h7f98852_1002
  • xorg-xproto =7.0.31=h7f98852_1007
  • xz =5.2.5=h7b6447c_0
  • yaml =0.2.5=h7b6447c_0
  • zipp =3.4.0=py_0
  • zlib =1.2.11=h7b6447c_3
  • zstd =1.4.5=h9ceee32_0