finetune-sd-inpainting-with-diffuser
Finetune the controlnet+stable diffusion model using diffuser
https://github.com/wuyujack/finetune-sd-inpainting-with-diffuser
Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references: not found
- ○ Academic publication links: not found
- ○ Academic email domains: not found
- ○ Institutional organization owner: not found
- ○ JOSS paper metadata: not found
- ○ Scientific vocabulary similarity: low (9.0%)
Repository
Finetune the controlnet+stable diffusion model using diffuser
Basic Info
- Host: GitHub
- Owner: wuyujack
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 17.2 MB
Statistics
- Stars: 10
- Watchers: 1
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Some code I implemented for the course project of CS496 Deep Generative Models. The main addition compared to diffusers is support for finetuning the ControlNet + Stable Diffusion model for virtual try-on tasks, which includes extending the input dimension of the Stable Diffusion model and fully tuning the whole Stable Diffusion model together with ControlNet.
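A minimal sketch of the input-dimension extension described above, assuming the usual Stable Diffusion inpainting layout (4 noisy latent channels + 4 masked-image latent channels + 1 mask channel = 9). This is not the repo's actual code; the function name `extend_conv_in` is hypothetical.

```python
import torch

def extend_conv_in(conv: torch.nn.Conv2d, new_in_channels: int) -> torch.nn.Conv2d:
    """Return a wider input convolution whose first channels reuse the
    pretrained weights; the extra channels start at zero, so on inputs whose
    extra channels are zero the output matches the original layer."""
    new_conv = torch.nn.Conv2d(
        new_in_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
    )
    with torch.no_grad():
        new_conv.weight.zero_()
        new_conv.weight[:, : conv.in_channels] = conv.weight  # keep pretrained weights
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv

# Applied to a diffusers UNet2DConditionModel this would look like:
#   unet.conv_in = extend_conv_in(unet.conv_in, 9)
#   unet.register_to_config(in_channels=9)
```

Zero-initializing the new channels keeps the extended model's initial behavior identical to the pretrained one, which is the usual trick when widening a pretrained input layer before finetuning.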
The code has not been fully organized and will not be actively maintained in the future. I release it here to provide an example of how the existing DreamBooth inpainting code in diffusers can be adapted to finetune ControlNet + Stable Diffusion, and how such a training pipeline can be developed with minimal effort.
As an early exploration, the finetuning results are not good for virtual try-on; the reasons are discussed in our blog post on Image-guided VITON with diffusion models. Since this was only a course final project developed in three days, we focused on quickly validating our idea rather than pursuing the state-of-the-art results reported in the existing VITON literature, so we did not invest much effort in dataset selection, image preprocessing, hyperparameter tuning, or changing the overall methodology and network architecture.
To use the code, please refer to the /example/controlnet/ folder. The training commands are the .sh files prefixed with run_, e.g., run_stable_diffusion_controlnet_inpaint.sh.
I use the VITON-HD dataset by default and have done some postprocessing for training; you can download the post-processed dataset from here.
For the environment configuration, please refer to environment.yml.
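Assuming a standard conda workflow, the setup and training steps above might be run as follows (the environment name is a placeholder; only the file and script names come from the text):

```shell
# Hypothetical setup sequence; check environment.yml for the actual
# environment name before activating.
conda env create -f environment.yml
conda activate <env-name>
cd example/controlnet
bash run_stable_diffusion_controlnet_inpaint.sh
```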
Owner
- Name: wuyujack (Mingfu Liang)
- Login: wuyujack
- Kind: user
- Location: Evanston
- Company: Northwestern University
- Website: https://mingfuliang.com/
- Repositories: 2
- Profile: https://github.com/wuyujack
Ph.D. Candidate at ECE department, Northwestern University.
Citation (CITATION.cff)
cff-version: 1.2.0
title: 'Diffusers: State-of-the-art diffusion models'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Patrick
    family-names: von Platen
  - given-names: Suraj
    family-names: Patil
  - given-names: Anton
    family-names: Lozhkov
  - given-names: Pedro
    family-names: Cuenca
  - given-names: Nathan
    family-names: Lambert
  - given-names: Kashif
    family-names: Rasul
  - given-names: Mishig
    family-names: Davaadorj
  - given-names: Thomas
    family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
abstract: >-
  Diffusers provides pretrained diffusion models across
  multiple modalities, such as vision and audio, and serves
  as a modular toolbox for inference and training of
  diffusion models.
keywords:
  - deep-learning
  - pytorch
  - image-generation
  - diffusion
  - text2image
  - image2image
  - score-based-generative-modeling
  - stable-diffusion
license: Apache-2.0
version: 0.12.1
GitHub Events
Total
- Watch event: 5
- Fork event: 1
Last Year
- Watch event: 5
- Fork event: 1
Dependencies
- actions/cache v2 composite
- actions/checkout v3 composite
- docker/build-push-action v3 composite
- docker/login-action v2 composite
- ./.github/actions/setup-miniconda * composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- ./.github/actions/setup-miniconda * composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- actions/checkout v2 composite
- actions/setup-python v1 composite
- actions/checkout v3 composite
- crate-ci/typos v1.12.4 composite
- ubuntu 20.04 build
- ubuntu 20.04 build
- ubuntu 20.04 build
- nvidia/cuda 11.6.2-cudnn8-devel-ubuntu20.04 build
- ubuntu 20.04 build
- nvidia/cuda 11.7.1-cudnn8-runtime-ubuntu20.04 build
- absl-py ==1.4.0
- accelerate ==0.19.0
- aiohttp ==3.8.4
- aiosignal ==1.3.1
- async-timeout ==4.0.2
- attrs ==23.1.0
- bitsandbytes ==0.39.0
- cachetools ==5.3.1
- cchardet ==2.1.7
- chardet ==5.1.0
- charset-normalizer ==3.1.0
- cmake ==3.26.3
- datasets ==2.12.0
- diffusers ==0.17.0.dev0
- dill ==0.3.6
- filelock ==3.12.0
- frozenlist ==1.3.3
- fsspec ==2023.5.0
- ftfy ==6.1.1
- google-auth ==2.19.0
- google-auth-oauthlib ==1.0.0
- grpcio ==1.54.2
- huggingface-hub ==0.14.1
- importlib-metadata ==6.6.0
- lit ==16.0.5
- markdown ==3.4.3
- markupsafe ==2.1.2
- mpmath ==1.3.0
- multidict ==6.0.4
- multiprocess ==0.70.14
- mypy-extensions ==1.0.0
- networkx ==3.1
- nvidia-cublas-cu11 ==11.10.3.66
- nvidia-cuda-cupti-cu11 ==11.7.101
- nvidia-cuda-nvrtc-cu11 ==11.7.99
- nvidia-cuda-runtime-cu11 ==11.7.99
- nvidia-cudnn-cu11 ==8.5.0.96
- nvidia-cufft-cu11 ==10.9.0.58
- nvidia-curand-cu11 ==10.2.10.91
- nvidia-cusolver-cu11 ==11.4.0.1
- nvidia-cusparse-cu11 ==11.7.4.91
- nvidia-nccl-cu11 ==2.14.3
- nvidia-nvtx-cu11 ==11.7.91
- oauthlib ==3.2.2
- packaging ==23.1
- pandas ==2.0.2
- pillow ==9.5.0
- protobuf ==4.23.2
- psutil ==5.9.5
- pyarrow ==12.0.0
- pyasn1 ==0.5.0
- pyasn1-modules ==0.3.0
- pyre-extensions ==0.0.29
- pytz ==2023.3
- pyyaml ==6.0
- regex ==2023.5.5
- requests ==2.31.0
- requests-oauthlib ==1.3.1
- responses ==0.18.0
- rsa ==4.9
- safetensors ==0.3.1
- sympy ==1.12
- tensorboard ==2.13.0
- tensorboard-data-server ==0.7.0
- tokenizers ==0.13.3
- torch ==2.0.1
- torchvision ==0.15.2
- tqdm ==4.65.0
- transformers ==4.29.2
- triton ==2.0.0
- typing-extensions ==4.6.2
- typing-inspect ==0.9.0
- tzdata ==2023.3
- urllib3 ==1.26.16
- wcwidth ==0.2.6
- werkzeug ==2.3.4
- xformers ==0.0.20
- xxhash ==3.2.0
- yarl ==1.9.2
- zipp ==3.15.0
- accelerate >=0.16.0
- datasets *
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- datasets *
- flax *
- ftfy *
- optax *
- tensorboard *
- torch *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- accelerate *
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- accelerate >=0.16.0
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- flax *
- ftfy *
- optax *
- tensorboard *
- torch *
- torchvision *
- transformers >=4.25.1
- accelerate >=0.16.0
- datasets *
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- diffusers *
- ftfy *
- tensorboard *
- torch *
- torchvision *
- transformers *
- Jinja2 *
- accelerate >=0.16.0
- diffusers ==0.9.0
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.21.0
- Jinja2 *
- accelerate >=0.16.0
- ftfy *
- intel_extension_for_pytorch >=1.13
- tensorboard *
- torchvision *
- transformers >=4.21.0
- accelerate *
- ftfy *
- modelcards *
- neural-compressor *
- tensorboard *
- torchvision *
- transformers >=4.25.0
- Jinja2 *
- accelerate >=0.16.0
- datasets *
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- accelerate >=0.16.0
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- flax *
- ftfy *
- optax *
- tensorboard *
- torch *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- accelerate >=0.16.0
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- accelerate >=0.16.0
- datasets *
- ftfy *
- modelcards *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- accelerate >=0.16.0
- ftfy *
- modelcards *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- accelerate >=0.16.0
- datasets *
- tensorboard *
- torchvision *
- Jinja2 *
- accelerate >=0.16.0
- datasets *
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- datasets *
- flax *
- ftfy *
- optax *
- tensorboard *
- torch *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- accelerate >=0.16.0
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- Jinja2 *
- flax *
- ftfy *
- optax *
- tensorboard *
- torch *
- torchvision *
- transformers >=4.25.1
- accelerate >=0.16.0
- datasets *
- torchvision *
- deps *