Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (6.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: pinht126
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 3.71 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Differences from Stable Diffusion v3
Applying the Masking Algorithm to Stable Diffusion v3
We apply the head-wise masking technique to Stable Diffusion v3, aiming to preserve its generation performance while enabling the creation of complex degraded images through the masking mechanism.
Applying Clean Image Condition to Preserve Scene Image
We duplicate the branch that receives the input image and inject clean-image information by summing it with the output of a zero-convolution layer, allowing the model to preserve the original scene content.
However, we remove the AdaLN-Zero modulation from the clean-image input path, since the clean-image information should not be influenced by the class conditioning.
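The clean-image injection described above can be sketched as a ControlNet-style zero-convolution add. The module name and channel layout below are illustrative, not taken from the repository; the key property is that a zero-initialized convolution contributes nothing at the start of training, so the pretrained behavior is preserved:

```python
import torch
import torch.nn as nn

class CleanImageInjection(nn.Module):
    """Hypothetical sketch of the clean-image conditioning path:
    clean-image features pass through a zero-initialized 1x1 convolution
    and are summed with the main branch, so the injection starts as a
    no-op and is learned during fine-tuning."""

    def __init__(self, channels):
        super().__init__()
        # Zero-convolution: weights and bias both start at zero.
        self.zero_conv = nn.Conv2d(channels, channels, kernel_size=1)
        nn.init.zeros_(self.zero_conv.weight)
        nn.init.zeros_(self.zero_conv.bias)

    def forward(self, hidden_states, clean_image_features):
        # Sum the zero-conv output with the main branch output.
        return hidden_states + self.zero_conv(clean_image_features)
```

At initialization the module returns `hidden_states` unchanged, which is exactly why zero-convolutions are used for this kind of conditioning.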
Results
Haze and rain both increase overall image brightness, so when the two conditions are applied simultaneously the rain effect can become overly blurred.
Set the masking ratio to a fractional (float) value, rather than exactly 0 or 1, when generating multi-degradation images.
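One plausible reading of the fractional masking ratio is a per-head scale on the attention outputs; the helper below is a hypothetical sketch (the repository's actual mask shape and application point are not shown in this README). A value of 1 or 0 keeps or drops a head entirely, while a fractional value blends its contribution, which is what mixes degradations:

```python
import torch

def apply_headwise_mask(head_outputs, mask):
    """Hypothetical head-wise masking. `head_outputs` has shape
    (batch, heads, tokens, dim); `mask` holds one scalar per head.
    Fractional mask values partially attenuate a head instead of
    switching it fully on or off."""
    # Broadcast the per-head scalars over batch, tokens, and dim.
    return head_outputs * mask.view(1, -1, 1, 1)
```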
Adjust the initial noise using the noise equation:
latents = α * noise + β * input (α + β = 1)
As the noise term shrinks, the degradation condition becomes weaker. We therefore let the sum of α and β exceed 1. Modified initial noise equation: latents = α * noise + β * input (α + β > 1)
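The modified initial-noise equation can be written as a small helper. Here `noise` and `input_latents` stand for the latent-space tensors, and the default α, β values are illustrative, chosen only to satisfy α + β > 1:

```python
import torch

def init_latents(noise, input_latents, alpha=0.8, beta=0.4):
    """Initial-noise mixing: latents = alpha * noise + beta * input.

    With alpha + beta = 1 this is a plain interpolation; the variant
    described above deliberately uses alpha + beta > 1 so the noise
    term (and hence the degradation condition) is not weakened when
    input content is mixed in."""
    assert alpha + beta > 1, "the modified equation requires alpha + beta > 1"
    return alpha * noise + beta * input_latents
```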
Results
Problem: The generated results tend to preserve the overall color structure of the initial input image.
Frequency-Domain Analysis
Step-wise generation results are presented for the haze, rain, and haze&rain classes.
It is observed that haze, being a low-frequency degradation, is generated in the early steps of the diffusion model, whereas rain, which has high-frequency characteristics, is generated in the later steps.
[Qian et al., "Boosting Diffusion Models with Moving Average Sampling in Frequency Domain," CVPR 2024]
Qian et al. stated that “Diffusion models at the denoising process first focus on the recovery of low-frequency components in the earlier timesteps and gradually shift to recovering high-frequency details in the later timesteps.”
-> Degradation-specific details (rain) should therefore be generated in the later stages of the denoising process.
Applying the focal frequency loss to incorporate frequency-domain information
[Jiang, Liming, et al. "Focal frequency loss for image reconstruction and synthesis." Proceedings of the IEEE/CVF international conference on computer vision. 2021.]
Jiang et al. use a frequency-domain loss instead of a pixel-based loss when training GANs or VAEs, to better learn high-frequency details.
We train the model to learn the degradation details (high-frequency components).
Since frequency components become more important in the later stages of the backward process (i.e., at smaller timesteps), we multiply the focal-frequency loss by a weighting factor of (1 - T / 1000) to assign greater importance when T is small.
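A minimal sketch of the timestep-weighted frequency loss described above, assuming a plain FFT-magnitude error in place of the spectrum-dependent focal weighting of Jiang et al. (the function name and signature are illustrative):

```python
import torch

def weighted_focal_frequency_loss(pred, target, t, total_steps=1000):
    """Simplified stand-in for the focal frequency loss, scaled by the
    timestep weight (1 - t / total_steps) so the frequency term matters
    most at small t, i.e. late in the backward (denoising) process where
    high-frequency degradation detail such as rain is formed."""
    # Compare the 2D spectra of prediction and target.
    pred_freq = torch.fft.fft2(pred)
    target_freq = torch.fft.fft2(target)
    freq_err = (pred_freq - target_freq).abs().pow(2).mean()
    # Larger weight at small timesteps, zero weight at t = total_steps.
    weight = 1.0 - t / total_steps
    return weight * freq_err
```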
Results
Artificial noise is suppressed, resulting in the effective generation of images containing a mixture of rain and haze degradations.
The method produces visually convincing results in specific style-mixing scenarios.
Installation
We recommend installing 🤗 Diffusers in a virtual environment from PyPI or Conda. For more details about installing PyTorch and Flax, please refer to their official documentation.
PyTorch
With pip (official package):
bash
pip install --upgrade diffusers[torch]
With conda (maintained by the community):
sh
conda install -c conda-forge diffusers
Flax
With pip (official package):
bash
pip install --upgrade diffusers[flax]
Apple Silicon (M1/M2) support
Please refer to the How to use Stable Diffusion in Apple Silicon guide.
Quickstart
Generating outputs is super easy with 🤗 Diffusers. To generate an image from text, use the from_pretrained method to load any pretrained diffusion model (browse the Hub for 30,000+ checkpoints):
```python
from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipeline.to("cuda")
pipeline("An image of a squirrel in Picasso style").images[0]
```
You can also dig into the models and schedulers toolbox to build your own diffusion system:
```python
from diffusers import DDPMScheduler, UNet2DModel
from PIL import Image
import torch

scheduler = DDPMScheduler.from_pretrained("google/ddpm-cat-256")
model = UNet2DModel.from_pretrained("google/ddpm-cat-256").to("cuda")
scheduler.set_timesteps(50)

sample_size = model.config.sample_size
noise = torch.randn((1, 3, sample_size, sample_size), device="cuda")
input = noise

for t in scheduler.timesteps:
    with torch.no_grad():
        noisy_residual = model(input, t).sample
    prev_noisy_sample = scheduler.step(noisy_residual, t, input).prev_sample
    input = prev_noisy_sample

image = (input / 2 + 0.5).clamp(0, 1)
image = image.cpu().permute(0, 2, 3, 1).numpy()[0]
image = Image.fromarray((image * 255).round().astype("uint8"))
image
```
Check out the Quickstart to launch your diffusion journey today!
MDSD3
Owner
- Name: Jiwon Heo
- Login: pinht126
- Kind: user
- Repositories: 1
- Profile: https://github.com/pinht126
Citation (CITATION.cff)
cff-version: 1.2.0
title: 'Diffusers: State-of-the-art diffusion models'
message: >-
If you use this software, please cite it using the
metadata from this file.
type: software
authors:
- given-names: Patrick
family-names: von Platen
- given-names: Suraj
family-names: Patil
- given-names: Anton
family-names: Lozhkov
- given-names: Pedro
family-names: Cuenca
- given-names: Nathan
family-names: Lambert
- given-names: Kashif
family-names: Rasul
- given-names: Mishig
family-names: Davaadorj
- given-names: Dhruv
family-names: Nair
- given-names: Sayak
family-names: Paul
- given-names: Steven
family-names: Liu
- given-names: William
family-names: Berman
- given-names: Yiyi
family-names: Xu
- given-names: Thomas
family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
abstract: >-
Diffusers provides pretrained diffusion models across
multiple modalities, such as vision and audio, and serves
as a modular toolbox for inference and training of
diffusion models.
keywords:
- deep-learning
- pytorch
- image-generation
- hacktoberfest
- diffusion
- text2image
- image2image
- score-based-generative-modeling
- stable-diffusion
- stable-diffusion-diffusers
license: Apache-2.0
version: 0.12.1
GitHub Events
Total
- Watch event: 1
- Push event: 5
Last Year
- Watch event: 1
- Push event: 5
Dependencies
- ubuntu 20.04 build
- ubuntu 20.04 build
- ubuntu 20.04 build
- ubuntu 20.04 build
- nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
- nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
- ubuntu 20.04 build
- nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
- nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
- deps *
- GitPython <3.1.19
- Jinja2 *
- Pillow *
- accelerate >=0.31.0
- compel ==0.1.8
- datasets *
- filelock *
- flax >=0.4.1
- hf-doc-builder >=0.3.0
- huggingface-hub >=0.23.2
- importlib_metadata *
- invisible-watermark >=0.2.0
- isort >=5.5.4
- jax >=0.4.1
- jaxlib >=0.4.1
- k-diffusion >=0.0.12
- librosa *
- numpy *
- parameterized *
- peft >=0.6.0
- protobuf <4,>=3.20.3
- pytest *
- pytest-timeout *
- pytest-xdist *
- regex *
- requests *
- requests-mock ==1.10.0
- ruff ==0.1.5
- safetensors >=0.3.1
- scipy *
- sentencepiece *
- tensorboard *
- torch >=1.4
- torchvision *
- transformers >=4.41.2
- urllib3 <=2.0.0