brushedit

[TPAMI under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"

https://github.com/tencentarc/brushedit

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 1 committers (100.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.0%) to scientific vocabulary

Keywords

diffusion-models image-editing image-inpainting
Last synced: 6 months ago

Repository

[TPAMI under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"

Basic Info
Statistics
  • Stars: 567
  • Watchers: 7
  • Forks: 27
  • Open Issues: 11
  • Releases: 0
Topics
diffusion-models image-editing image-inpainting
Created about 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

BrushEdit

**Please check out our latest DiT-based image customization project IC-Custom, which provides powerful ID-consistent editing capabilities!**

This repository contains the implementation of "BrushEdit: All-In-One Image Inpainting and Editing".

Keywords: Image Inpainting, Image Generation, Image Editing, Diffusion Models, MLLM Agent, Instruction-based Editing

TL;DR: BrushEdit is an advanced, unified AI agent for image inpainting and editing.
Main Elements: Fully automated / Interactive editing.

Yaowei Li¹, Yuxuan Bian³, Xuan Ju³, Zhaoyang Zhang², Junhao Zhuang⁴, Ying Shan², Yuexian Zou¹, Qiang Xu³
¹Peking University  ²ARC Lab, Tencent PCG  ³The Chinese University of Hong Kong  ⁴Tsinghua University

Project Page | arXiv | Video | Hugging Face Demo | Hugging Face Model

https://github.com/user-attachments/assets/fde82f21-8b36-4584-8460-c109c195e614

4K HD Introduction Video: YouTube.


TODO

  • [X] Release the code of BrushEdit. (MLLM-driven Agent for Image Editing and Inpainting)
  • [X] Release the paper and webpage. More info: BrushEdit
  • [X] Release the BrushNetX checkpoint (a more powerful BrushNet).
  • [X] Release gradio demo.

Pipeline Overview

BrushEdit consists of four main steps: (i) editing-category classification: determine the type of editing required; (ii) identification of the primary editing object: identify the main object to be edited; (iii) acquisition of the editing mask and target caption: generate the editing mask and the corresponding target caption; (iv) image inpainting: perform the actual image editing. Steps (i) to (iii) use pre-trained MLLMs and detection models to ascertain the editing type, target object, editing mask, and target caption. Step (iv) performs the editing with a dual-branch inpainting model, an improved BrushNet, which inpaints the target areas based on the target caption and editing mask, leveraging the generative potential and background-preservation capabilities of inpainting models.
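As a rough illustration of this flow, here is a minimal Python sketch of the four steps. The callables are hypothetical placeholders, not this repository's API:

```python
# A high-level sketch of the agent pipeline described above. The four
# injected components are illustrative placeholders, not the repo's API.
def brushedit(image, instruction, mllm, detector, segmenter, inpainter):
    # (i) Classify the editing category (e.g., add, remove, replace).
    category = mllm.classify_edit_type(image, instruction)
    # (ii) Identify the primary object to be edited.
    target = mllm.find_target_object(image, instruction, category)
    # (iii) Detection + segmentation give the editing mask; the MLLM
    #       writes the target caption for the masked region.
    box = detector.detect(image, target)      # e.g., GroundingDINO
    mask = segmenter.segment(image, box)      # e.g., SAM
    caption = mllm.target_caption(image, instruction, target)
    # (iv) Dual-branch inpainting (BrushNetX) fills the masked region from
    #      the caption while preserving the unmasked background.
    return inpainter(image, mask, caption)
```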

teaser

Getting Started

Environment Requirement

BrushEdit has been implemented and tested with CUDA 11.8, PyTorch 2.0.1, and Python 3.10.6.

Clone the repo:

git clone https://github.com/TencentARC/BrushEdit.git

We recommend first creating a virtual environment with conda and installing PyTorch following the official instructions. For example:

conda create -n brushedit python=3.10.6 -y
conda activate brushedit
python -m pip install --upgrade pip
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

Then, you can install diffusers (the version included in this repo) with:

pip install -e .

After that, you can install the required packages through:

pip install -r app/requirements.txt
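If you want to confirm the environment matches the tested setup, a quick check (standard PyTorch calls, nothing BrushEdit-specific) is:

```python
# Verify the installed versions against the tested CUDA 11.8 / PyTorch 2.0.1 setup.
import torch

print(torch.__version__)          # expect 2.0.1
print(torch.version.cuda)         # expect 11.8
print(torch.cuda.is_available())  # should be True on a GPU machine
```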

Download Checkpoints

Checkpoints of BrushEdit can be downloaded using the following command.

sh app/down_load_brushedit.sh

The downloaded checkpoint folder contains:

  • BrushNetX pretrained checkpoint for Stable Diffusion v1.5 (brushnetX)
  • A pretrained Stable Diffusion v1.5 checkpoint (e.g., realisticVisionV60B1_v51VAE from Civitai). You can use `scripts/convert_original_stable_diffusion_to_diffusers.py` to process other models downloaded from Civitai; see the loading sketch after this list.
  • The pretrained GroundingDINO checkpoint from the official release.
  • The pretrained SAM checkpoint from the official release.
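As an alternative to the conversion script, recent diffusers releases can load a single-file Civitai checkpoint directly and re-save it in the folder layout shown below. A hedged sketch, with placeholder local paths:

```python
# Load a single-file Civitai checkpoint with diffusers and save it in the
# multi-folder format expected under models/base_model/ (paths are placeholders).
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "realisticVisionV60B1_v51VAE.safetensors"  # checkpoint downloaded from Civitai
)
pipe.save_pretrained("models/base_model/realisticVisionV60B1_v51VAE")
```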

The checkpoint structure should be like:

```
|-- models
    |-- base_model
        |-- realisticVisionV60B1_v51VAE
            |-- model_index.json
            |-- vae
            |-- ...
        |-- dreamshaper_8
            |-- ...
        |-- epicrealism_naturalSinRC1VAE
            |-- ...
        |-- meinamix_meinaV11
            |-- ...
        |-- ...
    |-- brushnetX
        |-- config.json
        |-- diffusion_pytorch_model.safetensors
    |-- grounding_dino
        |-- groundingdino_swint_ogc.pth
    |-- sam
        |-- sam_vit_h_4b8939.pth
    |-- vlm
        |-- llava-v1.6-mistral-7b-hf
            |-- ...
        |-- llava-v1.6-vicuna-13b-hf
            |-- ...
        |-- Qwen2-VL-7B-Instruct
            |-- ...
        |-- ...
```

We provide five base diffusion models:

  • Dreamshaper_8 is a versatile model that can generate impressive portraits and landscape images.
  • Epicrealism_naturalSinRC1VAE is a realistic-style model that excels at generating portraits.
  • HenmixReal_v5c is a model that specializes in generating realistic images of women.
  • Meinamix_meinaV11 is a model that excels at generating images in an animated style.
  • RealisticVisionV60B1_v51VAE is a highly generalized realistic style model.

The BrushNetX checkpoint represents an enhanced version of BrushNet, having been trained on a more diverse dataset to improve its editing capabilities, such as deletion and replacement.

We provide two local VLM models: Qwen2-VL-7B-Instruct and llama3-llava-next-8b-hf. We strongly recommend using GPT-4o for reasoning. After selecting GPT-4o as the VLM model, enter the API KEY and click the Submit and Verify button. If the output is success, you can use GPT-4o normally. As a second choice, we recommend the Qwen2-VL model.

You can also download more pretrained VLM models from Qwen2-VL and LLaVA-NeXT, e.g., with `hf_hub_download` or `snapshot_download` from `huggingface_hub`.
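For instance, a minimal sketch that fetches an additional VLM into the folder layout shown above (the repo id is the public Hugging Face one; the local path follows the checkpoint structure earlier):

```python
# Download an additional VLM into the expected models/vlm/ layout.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Qwen/Qwen2-VL-7B-Instruct",
    local_dir="models/vlm/Qwen2-VL-7B-Instruct",
)
```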

Running Scripts

BrushEdit Demo

You can run the demo using the script:

sh app/run_app.sh

Demo Features

demo_vis

Fundamental Features:

  • Aspect Ratio: Select the aspect ratio of the image. To prevent OOM, 1024px is the maximum resolution.
  • VLM Model: Select the VLM model. We use preloaded models to save time. To use other VLM models, download them and uncomment the relevant lines in vlm_template.py in our GitHub repo.
  • Generate Mask: Generate a mask for the area that may need to be edited, according to the input instructions.
  • Square/Circle Mask: Derive square or circle masks from the existing mask. (A coarse-grained mask leaves more room for editing imagination.)
  • Invert Mask: Invert the mask to generate a new mask.
  • Dilation/Erosion Mask: Expand or shrink the mask to include or exclude more area; see the sketch after this list.
  • Move Mask: Move the mask to a new position.
  • Generate Target Prompt: Generate a target prompt based on the input instructions.
  • Target Prompt: The description of the masked area; you can enter or modify it manually when the VLM's output does not meet expectations.
  • Blending: Blend BrushNet's output with the original input to preserve the original image details in unedited areas. (Turning this off works better for removal.)
  • Control length: The intensity of editing and inpainting.
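A minimal sketch of the invert/dilation/erosion operations above, using OpenCV and NumPy (illustrative only, not the demo's exact code):

```python
# Hedged mask transforms for binary 0/255 masks, as in the demo's controls.
import cv2
import numpy as np

def invert_mask(mask: np.ndarray) -> np.ndarray:
    return 255 - mask  # swap edited and preserved regions

def dilate_mask(mask: np.ndarray, size: int = 15) -> np.ndarray:
    kernel = np.ones((size, size), np.uint8)
    return cv2.dilate(mask, kernel, iterations=1)  # include more area

def erode_mask(mask: np.ndarray, size: int = 15) -> np.ndarray:
    kernel = np.ones((size, size), np.uint8)
    return cv2.erode(mask, kernel, iterations=1)  # exclude border pixels
```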

Advanced Features:

  • Base Model: Select the base diffusion model. We use preloaded models to save time. To use other base models, download them and register them in the analogous base-model template file in our GitHub repo.
  • Blending: Blend BrushNet's output with the original input to preserve the original image details in unedited areas; see the sketch after this list. (Turning this off works better for removal.)
  • Control length: The intensity of editing and inpainting.
  • Num samples: The number of samples to generate.
  • Negative prompt: The negative prompt for classifier-free guidance.
  • Guidance scale: The guidance scale for classifier-free guidance.
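Conceptually, the Blending option pastes the inpainted result back only inside a softened mask so unedited areas keep the original pixels. A hedged sketch, again illustrative rather than the repo's exact code:

```python
# Blend the inpainted output with the original image under a blurred mask.
import cv2
import numpy as np

def blend(original: np.ndarray, inpainted: np.ndarray,
          mask: np.ndarray, blur: int = 21) -> np.ndarray:
    # Soften the 0/255 mask edge (kernel size must be odd) so the
    # paste-back seam is less visible.
    soft = cv2.GaussianBlur(mask, (blur, blur), 0).astype(np.float32) / 255.0
    soft = soft[..., None]  # broadcast over the RGB channels
    return (inpainted * soft + original * (1.0 - soft)).astype(np.uint8)
```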

Cite Us

```
@misc{li2024brushedit,
  title={BrushEdit: All-In-One Image Inpainting and Editing},
  author={Yaowei Li and Yuxuan Bian and Xuan Ju and Zhaoyang Zhang and Junhao Zhuang and Ying Shan and Yuexian Zou and Qiang Xu},
  year={2024},
  eprint={2412.10316},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```

Acknowledgement

Our code is modified from diffusers and BrushNet; thanks to all the contributors!

Contact

For any questions, feel free to email liyaowei01@gmail.com.

Star History

Star History Chart

Owner

  • Name: ARC Lab, Tencent PCG
  • Login: TencentARC
  • Kind: organization
  • Email: arc@tencent.com

GitHub Events

Total
  • Commit comment event: 1
  • Issues event: 41
  • Watch event: 539
  • Issue comment event: 46
  • Push event: 20
  • Public event: 2
  • Fork event: 27
Last Year
  • Commit comment event: 1
  • Issues event: 41
  • Watch event: 539
  • Issue comment event: 46
  • Push event: 20
  • Public event: 2
  • Fork event: 27

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 20
  • Total Committers: 1
  • Avg Commits per committer: 20.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 20
  • Committers: 1
  • Avg Commits per committer: 20.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
liyaowei-stu y****l@s****n 20
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 24
  • Total pull requests: 0
  • Average time to close issues: 5 days
  • Average time to close pull requests: N/A
  • Total issue authors: 19
  • Total pull request authors: 0
  • Average comments per issue: 1.75
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 24
  • Pull requests: 0
  • Average time to close issues: 5 days
  • Average time to close pull requests: N/A
  • Issue authors: 19
  • Pull request authors: 0
  • Average comments per issue: 1.75
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mrlihellohorld (3)
  • robbxu (2)
  • gecade (2)
  • 1091492188 (2)
  • yincangshiwei (1)
  • kada0720 (1)
  • rishipandey125 (1)
  • juxingyiwan (1)
  • shellin-star (1)
  • bengen-y (1)
  • hdjsjyl (1)
  • Twinkle-ce (1)
  • shikasensei-dev (1)
  • Jandown (1)
  • Fuuuuuuge (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Dependencies

app/requirements.txt pypi
  • Pillow ==9.5.0
  • accelerate ==0.26.0
  • clip *
  • datasets ==3.1.0
  • fastapi ==0.112.4
  • ftfy ==6.1.1
  • gradio ==4.38.1
  • hpsv2 *
  • huggingface_hub ==0.23.2
  • image-reward *
  • imgaug ==0.4.0
  • open-clip-torch *
  • openai *
  • opencv-python ==4.8.1.78
  • qwen_vl_utils *
  • segment_anything *
  • tensorboard *
  • transformers ==4.46.3
examples/advanced_diffusion_training/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • peft ==0.7.0
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/brushnet/requirements.txt pypi
  • Pillow ==9.5.0
  • accelerate ==0.20.3
  • clip *
  • datasets *
  • ftfy *
  • gradio ==4.44.1
  • hpsv2 *
  • image-reward *
  • imgaug *
  • open-clip-torch *
  • opencv-python *
  • segment_anything *
  • tensorboard *
  • torchmetrics *
  • torchvision *
  • transformers >=4.25.1
examples/consistency_distillation/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • webdataset *
examples/controlnet/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/controlnet/requirements_flax.txt pypi
  • Jinja2 *
  • datasets *
  • flax *
  • ftfy *
  • optax *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers >=4.25.1
examples/controlnet/requirements_sdxl.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • wandb *
examples/custom_diffusion/requirements.txt pypi
  • Jinja2 *
  • accelerate *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/dreambooth/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • peft ==0.7.0
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/dreambooth/requirements_flax.txt pypi
  • Jinja2 *
  • flax *
  • ftfy *
  • optax *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers >=4.25.1
examples/dreambooth/requirements_sdxl.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • peft ==0.7.0
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/instruct_pix2pix/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/kandinsky2_2/text_to_image/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/colossalai/requirement.txt pypi
  • Jinja2 *
  • diffusers *
  • ftfy *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers *
examples/research_projects/consistency_training/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/diffusion_dpo/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • peft *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • wandb *
examples/research_projects/dreambooth_inpaint/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • diffusers ==0.9.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.21.0
examples/research_projects/intel_opts/textual_inversion/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • intel_extension_for_pytorch >=1.13
  • tensorboard *
  • torchvision *
  • transformers >=4.21.0
examples/research_projects/intel_opts/textual_inversion_dfq/requirements.txt pypi
  • accelerate *
  • ftfy *
  • modelcards *
  • neural-compressor *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.0
examples/research_projects/lora/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/multi_subject_dreambooth/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/multi_subject_dreambooth_inpainting/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets >=2.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • wandb >=0.16.1
examples/research_projects/multi_token_textual_inversion/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/multi_token_textual_inversion/requirements_flax.txt pypi
  • Jinja2 *
  • flax *
  • ftfy *
  • optax *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/onnxruntime/text_to_image/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • modelcards *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/onnxruntime/textual_inversion/requirements.txt pypi
  • accelerate >=0.16.0
  • ftfy *
  • modelcards *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/onnxruntime/unconditional_image_generation/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • tensorboard *
  • torchvision *
examples/research_projects/realfill/requirements.txt pypi
  • Jinja2 ==3.1.3
  • accelerate ==0.23.0
  • diffusers ==0.20.1
  • ftfy ==6.1.1
  • peft ==0.5.0
  • tensorboard ==2.14.0
  • torch ==2.0.1
  • torchvision >=0.16
  • transformers ==4.36.0
examples/t2i_adapter/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • safetensors *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • wandb *
examples/text_to_image/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • peft ==0.7.0
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/text_to_image/requirements_flax.txt pypi
  • Jinja2 *
  • datasets *
  • flax *
  • ftfy *
  • optax *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers >=4.25.1
examples/text_to_image/requirements_sdxl.txt pypi
  • Jinja2 *
  • accelerate >=0.22.0
  • datasets *
  • ftfy *
  • peft ==0.7.0
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/textual_inversion/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/textual_inversion/requirements_flax.txt pypi
  • Jinja2 *
  • flax *
  • ftfy *
  • optax *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers >=4.25.1
examples/unconditional_image_generation/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • torchvision *
examples/wuerstchen/text_to_image/requirements.txt pypi
  • accelerate >=0.16.0
  • bitsandbytes *
  • deepspeed *
  • peft >=0.6.0
  • torchvision *
  • transformers >=4.25.1
  • wandb *
pyproject.toml pypi
setup.py pypi
  • deps *
docker/diffusers-flax-cpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-flax-tpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-onnxruntime-cpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-onnxruntime-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
docker/diffusers-pytorch-compile-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
docker/diffusers-pytorch-cpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-pytorch-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
docker/diffusers-pytorch-xformers-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build