271-belm-high-quality-exact-inversion-sampler-of-diffusion-models
Science Score: 41.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.8%) to scientific vocabulary
Scientific Fields
- Engineering, Computer Science: 80% confidence
- Mathematics, Computer Science: 40% confidence
Last synced: 6 months ago
Repository
Basic Info
- Host: GitHub
- Owner: SZU-AdvTech-2024
- Default Branch: main
- Size: 0 Bytes
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Created about 1 year ago
· Last pushed about 1 year ago
Metadata Files
Citation
https://github.com/SZU-AdvTech-2024/271-BELM-High-quality-Exact-Inversion-sampler-of-Diffusion-Models/blob/main/
# BELM: High-quality Exact Inversion sampler of Diffusion Models

This repository is not the official implementation of the **NeurIPS 2024** paper: _"BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models"_

Keywords: Diffusion Model, Exact Inversion, ODE Solver

> **Fangyikang Wang<sup>1</sup>, Hubery Yin<sup>2</sup>, Yuejiang Dong<sup>3</sup>, Huminhao Zhu<sup>1</sup>, Chao Zhang<sup>1</sup>, Hanbin Zhao<sup>1</sup>, Hui Qian<sup>1</sup>, Chen Li<sup>2</sup>**
>
> <sup>1</sup>Zhejiang University <sup>2</sup>WeChat, Tencent Inc. <sup>3</sup>Tsinghua University

[arXiv](https://arxiv.org/abs/2410.07273) [License: MIT](https://opensource.org/licenses/MIT) [Zhihu](https://zhuanlan.zhihu.com/p/1379396199)

## What's New?

### We use a bidirectional explicit formulation to enable exact inversion

> **Schematic description** of DDIM (left) and BELM (right). DDIM uses $`\mathbf{x}_i`$ and $`\boldsymbol{\varepsilon}_\theta(\mathbf{x}_i,i)`$ to calculate $`\mathbf{x}_{i-1}`$ based on a linear relation between $`\mathbf{x}_i`$, $`\mathbf{x}_{i-1}`$ and $`\boldsymbol{\varepsilon}_\theta(\mathbf{x}_i,i)`$ (represented by the blue line). However, DDIM inversion uses $`\mathbf{x}_{i-1}`$ and $`\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i-1},i-1)`$ to calculate $`\mathbf{x}_{i}`$ based on a different linear relation (represented by the red line). This mismatch is why DDIM inversion is inexact. In contrast, BELM establishes a linear relation between $`\mathbf{x}_{i-1}`$, $`\mathbf{x}_i`$, $`\mathbf{x}_{i+1}`$ and $`\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i}, i)`$ (represented by the green line). Both BELM and its inversion are derived from this single relation, which makes the inversion exact. Specifically, BELM uses a linear combination of $`\mathbf{x}_i`$, $`\mathbf{x}_{i+1}`$ and $`\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i},i)`$ to calculate $`\mathbf{x}_{i-1}`$, and the BELM inversion uses a linear combination of $`\mathbf{x}_{i-1}`$, $`\mathbf{x}_i`$ and $`\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i},i)`$ to calculate $`\mathbf{x}_{i+1}`$. The bidirectional explicit constraint means that this linear relation does not include the derivatives at either endpoint, that is, $`\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i-1},i-1)`$ and $`\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i+1},i+1)`$.

### We introduce a generic formulation of the exact inversion samplers, BELM.

The general k-step BELM:

```math
\bar{\mathbf{x}}_{i-1} = \sum_{j=1}^{k} a_{i,j}\cdot \bar{\mathbf{x}}_{i-1+j} +\sum_{j=1}^{k-1}b_{i,j}\cdot h_{i-1+j}\cdot\bar{\boldsymbol{\varepsilon}}_\theta(\bar{\mathbf{x}}_{i-1+j},\bar{\sigma}_{i-1+j}).
```

2-step BELM:

```math
\bar{\mathbf{x}}_{i-1} = a_{i,2}\bar{\mathbf{x}}_{i+1} +a_{i,1}\bar{\mathbf{x}}_{i} + b_{i,1} h_i\bar{\boldsymbol{\varepsilon}}_\theta(\bar{\mathbf{x}}_i,\bar{\sigma}_i).
```

### We derive the optimal coefficients for BELM via LTE minimization.

> **Proposition** The local truncation error (LTE) $`\tau_i`$ of the BELM diffusion sampler, given by $`\tau_i = \bar{\mathbf{x}}(t_{i-1}) - a_{i,2}\bar{\mathbf{x}}(t_{i+1}) -a_{i,1}\bar{\mathbf{x}}(t_{i}) - b_{i,1} h_i\bar{\boldsymbol{\varepsilon}}_\theta(\bar{\mathbf{x}}(t_i),\bar{\sigma}_i)`$, can be accurate up to $`\mathcal{O}\left({(h_{i}+h_{i+1})}^3\right)`$ when the coefficients are chosen as $`a_{i,1} = \frac{h_{i+1}^2 - h_i^2}{h_{i+1}^2}`$, $`a_{i,2}=\frac{h_i^2}{h_{i+1}^2}`$, $`b_{i,1}=- \frac{h_i+h_{i+1}}{h_{i+1}}`$, where $`h_i = \frac{\sigma_i}{\alpha_i}-\frac{\sigma_{i-1}}{\alpha_{i-1}}`$.

The Optimal-BELM (O-BELM) sampler:

```math
\mathbf{x}_{i-1} = \frac{h_i^2}{h_{i+1}^2}\frac{\alpha_{i-1}}{\alpha_{i+1}}\mathbf{x}_{i+1} +\frac{h_{i+1}^2 - h_i^2}{h_{i+1}^2}\frac{\alpha_{i-1}}{\alpha_{i}}\mathbf{x}_{i} - \frac{h_i(h_i+h_{i+1})}{h_{i+1}}\alpha_{i-1}\boldsymbol{\varepsilon}_\theta(\mathbf{x}_i,i).
```

The inversion of the O-BELM diffusion sampler is:

```math
\mathbf{x}_{i+1}= \frac{h_{i+1}^2}{h_i^2}\frac{\alpha_{i+1}}{\alpha_{i-1}}\mathbf{x}_{i-1} + \frac{h_i^2-h_{i+1}^2}{h_i^2}\frac{\alpha_{i+1}}{\alpha_{i}}\mathbf{x}_{i}+\frac{h_{i+1}(h_i+h_{i+1})}{h_i}\alpha_{i+1} \boldsymbol{\varepsilon}_\theta(\mathbf{x}_i,i).
```

## Run the code

### 1) Get started

* Python 3.8.12
* CUDA 11.8
* NVIDIA V100 32GB
* Torch 2.0.0
* Torchvision 0.15.0

```shell
conda create -n belm python=3.8 -y
conda activate belm
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r p2p_requirements.txt
```

Please follow the **[diffusers](https://github.com/huggingface/diffusers)** instructions to install diffusers.

### 2) Run

First, switch to the repository root directory.

#### CIFAR10 sampling

```shell
python3 ./scripts/cifar10.py --test_num 10 --batch_size 32 --num_inference_steps 100 --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxx/ddpm_ema_cifar10
```

#### CelebA-HQ sampling

```shell
python3 ./scripts/celeba.py --test_num 10 --batch_size 32 --num_inference_steps 100 --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxx/ddpm_ema_cifar10
```

#### FID evaluation

```shell
python3 ./scripts/celeba.py --test_num 10 --batch_size 32 --num_inference_steps 100 --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxx/ddpm_ema_cifar10
```

#### Interpolation

```shell
python3 ./scripts/interpolate.py --test_num 10 --batch_size 1 --num_inference_steps 100 --save_dir YOUR/SAVE/DIR --model_id xx
```

#### Reconstruction error calculation

```shell
python3 ./scripts/reconstruction.py --test_num 10 --num_inference_steps 100 --directory WHERE/YOUR/IMAGES/ARE --sampler_type belm
```

#### Image editing

```shell
python3 ./scripts/image_editing.py --num_inference_steps 200 --freeze_step 50 --guidance 2.0 --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxxxx/stable-diffusion-v1-5 --ori_im_path images/imagenet_dog_1.jpg --ori_prompt 'A dog' --res_prompt 'A Dalmatian'
```

#### Direct Inversion image editing

```shell
python3 ./improve_edit/run_editing_p2p_one_image.py --image_path ./images/farm.png --original_prompt "a farm" --editing_prompt "an lighthouse on farm" --blended_word "farm farm" --output_path YOUR/SAVE/DIR --edit_method_list "directinversion+p2p_guidance_75_75"
```

## License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

## Citation

If our work assists your research, feel free to give us a star or cite us using:

```
@article{wang2024belm,
  title={BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models},
  author={Wang, Fangyikang and Yin, Hubery and Dong, Yuejiang and Zhu, Huminhao and Zhang, Chao and Zhao, Hanbin and Qian, Hui and Li, Chen},
  journal={arXiv preprint arXiv:2410.07273},
  year={2024}
}

@article{ju2023direct,
  title={PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code},
  author={Ju, Xuan and Zeng, Ailing and Bian, Yuxuan and Liu, Shaoteng and Xu, Qiang},
  journal={International Conference on Learning Representations ({ICLR})},
  year={2024}
}
```
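
## Sketch: checking the O-BELM round trip

The O-BELM update and its inversion stated in the Proposition section are the same linear relation solved for different unknowns, so applying one after the other should recover the starting point up to floating-point error. The snippet below is a minimal scalar sketch of that check, not code from this repository. The function names `obelm_step`, `obelm_inverse_step` and `toy_eps`, the cosine/sine toy schedule, and the `tanh` stand-in for the noise network are all assumptions made for illustration.

```python
import math


def obelm_step(x_next, x_cur, eps_cur, h_i, h_next, a_prev, a_cur, a_next):
    """Compute x_{i-1} from x_{i+1}, x_i and eps(x_i, i) (O-BELM sampler formula)."""
    r = (h_i / h_next) ** 2  # h_i^2 / h_{i+1}^2
    return (
        r * (a_prev / a_next) * x_next
        + (1.0 - r) * (a_prev / a_cur) * x_cur
        - h_i * (h_i + h_next) / h_next * a_prev * eps_cur
    )


def obelm_inverse_step(x_prev, x_cur, eps_cur, h_i, h_next, a_prev, a_cur, a_next):
    """Compute x_{i+1} from x_{i-1}, x_i and eps(x_i, i) (O-BELM inversion formula)."""
    r = (h_next / h_i) ** 2  # h_{i+1}^2 / h_i^2
    return (
        r * (a_next / a_prev) * x_prev
        + (1.0 - r) * (a_next / a_cur) * x_cur
        + h_next * (h_i + h_next) / h_i * a_next * eps_cur
    )


def toy_eps(x):
    """Hypothetical stand-in for the trained noise predictor eps_theta."""
    return math.tanh(x)


# Assumed toy schedule: alpha = cos(theta), sigma = sin(theta) at indices i-1, i, i+1.
thetas = [0.3, 0.5, 0.8]
alphas = [math.cos(t) for t in thetas]
sigmas = [math.sin(t) for t in thetas]

# h_i = sigma_i / alpha_i - sigma_{i-1} / alpha_{i-1}, as defined in the Proposition.
h_i = sigmas[1] / alphas[1] - sigmas[0] / alphas[0]
h_next = sigmas[2] / alphas[2] - sigmas[1] / alphas[1]

x2, x1 = 0.7, 0.4  # arbitrary states x_{i+1} and x_i
eps1 = toy_eps(x1)  # both directions evaluate eps at the same point x_i

x0 = obelm_step(x2, x1, eps1, h_i, h_next, *alphas)
x2_rec = obelm_inverse_step(x0, x1, eps1, h_i, h_next, *alphas)

print("round-trip error:", abs(x2_rec - x2))
```

Both functions take the noise prediction at $`\mathbf{x}_i`$ as an input rather than calling the model themselves, which makes explicit that the sampler and its inversion share one evaluation point. That shared point is what distinguishes this scheme from the DDIM case described in the schematic.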
Owner
- Name: SZU-AdvTech-2024
- Login: SZU-AdvTech-2024
- Kind: organization
- Repositories: 1
- Profile: https://github.com/SZU-AdvTech-2024
Citation (citation.txt)
@inproceedings{REPO271,
author = "Wang, Fangyikang and Yin, Hubery and Dong, Yue-Jiang and Zhu, Huminhao and Zhang, Chao and Zhao, Hanbin and Qian, Hui and Li, Chen",
booktitle = "The Thirty-eighth Annual Conference on Neural Information Processing Systems",
title = "{{BELM}: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models}",
year = "2024"
}
GitHub Events
Total
- Watch event: 1
- Push event: 3
- Create event: 3
Last Year
- Watch event: 1
- Push event: 3
- Create event: 3