diffusers

diffusers 0.33.1 for TPU t2v i2v migration

https://github.com/shungcp/diffusers

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (2.2%) to scientific vocabulary
Last synced: 6 months ago

Repository

diffusers 0.33.1 for TPU t2v i2v migration

Basic Info
  • Host: GitHub
  • Owner: shungcp
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 7.7 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 6
  • Open Issues: 1
  • Releases: 0
Created 9 months ago · Last pushed 8 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

Original readme moved to README_original.md

install

Install dependencies; set up a virtual environment first if required.

```sh
sh -ex setup-dep.sh
```

To run:

```sh
python wan_tx_splash_attn.py
```

Progress:

(Jun 17)

```
┏━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Device ┃ Memory usage         ┃ Duty cycle ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ 0      │ 2.09 GiB / 31.25 GiB │ 0.00%      │
│ 1      │ 2.08 GiB / 31.25 GiB │ 0.00%      │
│ 2      │ 2.08 GiB / 31.25 GiB │ 0.00%      │
│ 3      │ 2.08 GiB / 31.25 GiB │ 0.00%      │
│ 4      │ 2.08 GiB / 31.25 GiB │ 0.00%      │
│ 5      │ 2.08 GiB / 31.25 GiB │ 0.00%      │
│ 6      │ 2.08 GiB / 31.25 GiB │ 0.00%      │
│ 7      │ 2.08 GiB / 31.25 GiB │ 0.00%      │
```

sizes:

wan 1.3B:

```
text_encoder  12.537574768066406 G
transformer    2.64891254901886 G
vae            0.23635575734078884 G
```

wan 14B:

```
text_encoder  12.537574768066406 G
transformer   26.66874897480011 G
vae            0.23635575734078884 G
```
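The figures above follow directly from parameter count times bytes per parameter; a quick sketch (hypothetical helper, assuming bfloat16 weights and GiB-style units):

```python
def param_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GiB; bfloat16 stores 2 bytes per parameter."""
    return num_params * bytes_per_param / 2**30

# A 14B-parameter transformer in bfloat16 lands near the ~26.7 G reported above.
print(param_gib(14_000_000_000))  # ≈ 26.08
```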

Shapes of weights for 1.3B model:

vae

```
encoder.conv_in.weight : # (torch.Size([96, 3, 3, 3, 3]), torch.bfloat16)
encoder.conv_in.bias : # (torch.Size([96]), torch.bfloat16)
encoder.down_blocks.*.norm*.gamma : # (torch.Size([384, 1, 1, 1]), torch.bfloat16)
encoder.down_blocks.*.conv*.weight : # (torch.Size([384, 384, 3, 3, 3]), torch.bfloat16)
encoder.down_blocks.*.conv*.bias : # (torch.Size([384]), torch.bfloat16)
encoder.down_blocks.*.resample.*.weight : # (torch.Size([384, 384, 3, 3]), torch.bfloat16)
encoder.down_blocks.*.resample.*.bias : # (torch.Size([384]), torch.bfloat16)
encoder.down_blocks.*.conv_shortcut.weight : # (torch.Size([384, 192, 1, 1, 1]), torch.bfloat16)
encoder.down_blocks.*.conv_shortcut.bias : # (torch.Size([384]), torch.bfloat16)
encoder.down_blocks.*.time_conv.weight : # (torch.Size([384, 384, 3, 1, 1]), torch.bfloat16)
encoder.down_blocks.*.time_conv.bias : # (torch.Size([384]), torch.bfloat16)
encoder.mid_block.attentions.*.norm.gamma : # (torch.Size([384, 1, 1]), torch.bfloat16)
encoder.mid_block.attentions.*.to_qkv.weight : # (torch.Size([1152, 384, 1, 1]), torch.bfloat16)
encoder.mid_block.attentions.*.to_qkv.bias : # (torch.Size([1152]), torch.bfloat16)
encoder.mid_block.attentions.*.proj.weight : # (torch.Size([384, 384, 1, 1]), torch.bfloat16)
encoder.mid_block.attentions.*.proj.bias : # (torch.Size([384]), torch.bfloat16)
encoder.mid_block.resnets.*.norm*.gamma : # (torch.Size([384, 1, 1, 1]), torch.bfloat16)
encoder.mid_block.resnets.*.conv*.weight : # (torch.Size([384, 384, 3, 3, 3]), torch.bfloat16)
encoder.mid_block.resnets.*.conv*.bias : # (torch.Size([384]), torch.bfloat16)
encoder.norm_out.gamma : # (torch.Size([384, 1, 1, 1]), torch.bfloat16)
encoder.conv_out.weight : # (torch.Size([32, 384, 3, 3, 3]), torch.bfloat16)
encoder.conv_out.bias : # (torch.Size([32]), torch.bfloat16)
quant_conv.weight : # (torch.Size([32, 32, 1, 1, 1]), torch.bfloat16)
quant_conv.bias : # (torch.Size([32]), torch.bfloat16)
post_quant_conv.weight : # (torch.Size([16, 16, 1, 1, 1]), torch.bfloat16)
post_quant_conv.bias : # (torch.Size([16]), torch.bfloat16)
decoder.conv_in.weight : # (torch.Size([384, 16, 3, 3, 3]), torch.bfloat16)
decoder.conv_in.bias : # (torch.Size([384]), torch.bfloat16)
decoder.mid_block.attentions.*.norm.gamma : # (torch.Size([384, 1, 1]), torch.bfloat16)
decoder.mid_block.attentions.*.to_qkv.weight : # (torch.Size([1152, 384, 1, 1]), torch.bfloat16)
decoder.mid_block.attentions.*.to_qkv.bias : # (torch.Size([1152]), torch.bfloat16)
decoder.mid_block.attentions.*.proj.weight : # (torch.Size([384, 384, 1, 1]), torch.bfloat16)
decoder.mid_block.attentions.*.proj.bias : # (torch.Size([384]), torch.bfloat16)
decoder.mid_block.resnets.*.norm*.gamma : # (torch.Size([384, 1, 1, 1]), torch.bfloat16)
decoder.mid_block.resnets.*.conv*.weight : # (torch.Size([384, 384, 3, 3, 3]), torch.bfloat16)
decoder.mid_block.resnets.*.conv*.bias : # (torch.Size([384]), torch.bfloat16)
decoder.up_blocks.*.resnets.*.norm*.gamma : # (torch.Size([96, 1, 1, 1]), torch.bfloat16)
decoder.up_blocks.*.resnets.*.conv*.weight : # (torch.Size([96, 96, 3, 3, 3]), torch.bfloat16)
decoder.up_blocks.*.resnets.*.conv*.bias : # (torch.Size([96]), torch.bfloat16)
decoder.up_blocks.*.upsamplers.*.resample.*.weight : # (torch.Size([96, 192, 3, 3]), torch.bfloat16)
decoder.up_blocks.*.upsamplers.*.resample.*.bias : # (torch.Size([96]), torch.bfloat16)
decoder.up_blocks.*.upsamplers.*.time_conv.weight : # (torch.Size([768, 384, 3, 1, 1]), torch.bfloat16)
decoder.up_blocks.*.upsamplers.*.time_conv.bias : # (torch.Size([768]), torch.bfloat16)
decoder.up_blocks.*.resnets.*.conv_shortcut.weight : # (torch.Size([384, 192, 1, 1, 1]), torch.bfloat16)
decoder.up_blocks.*.resnets.*.conv_shortcut.bias : # (torch.Size([384]), torch.bfloat16)
decoder.norm_out.gamma : # (torch.Size([96, 1, 1, 1]), torch.bfloat16)
decoder.conv_out.weight : # (torch.Size([3, 96, 3, 3, 3]), torch.bfloat16)
decoder.conv_out.bias : # (torch.Size([3]), torch.bfloat16)
```

transformer

```
scale_shift_table : # (torch.Size([1, 2, 1536]), torch.float32)
patch_embedding.weight : # (torch.Size([1536, 16, 1, 2, 2]), torch.bfloat16)
patch_embedding.bias : # (torch.Size([1536]), torch.bfloat16)
condition_embedder.time_embedder.linear_*.weight : # (torch.Size([1536, 1536]), torch.float32)
condition_embedder.time_embedder.linear_*.bias : # (torch.Size([1536]), torch.float32)
condition_embedder.time_proj.weight : # (torch.Size([9216, 1536]), torch.bfloat16)
condition_embedder.time_proj.bias : # (torch.Size([9216]), torch.bfloat16)
condition_embedder.text_embedder.linear_*.weight : # (torch.Size([1536, 1536]), torch.bfloat16)
condition_embedder.text_embedder.linear_*.bias : # (torch.Size([1536]), torch.bfloat16)
blocks.*.scale_shift_table : # (torch.Size([1, 6, 1536]), torch.float32)
blocks.*.attn*.norm_q.weight : # (torch.Size([1536]), torch.bfloat16)
blocks.*.attn*.norm_k.weight : # (torch.Size([1536]), torch.bfloat16)
blocks.*.attn*.to_q.weight : # (torch.Size([1536, 1536]), torch.bfloat16)
blocks.*.attn*.to_q.bias : # (torch.Size([1536]), torch.bfloat16)
blocks.*.attn*.to_k.weight : # (torch.Size([1536, 1536]), torch.bfloat16)
blocks.*.attn*.to_k.bias : # (torch.Size([1536]), torch.bfloat16)
blocks.*.attn*.to_v.weight : # (torch.Size([1536, 1536]), torch.bfloat16)
blocks.*.attn*.to_v.bias : # (torch.Size([1536]), torch.bfloat16)
blocks.*.attn*.to_out.*.weight : # (torch.Size([1536, 1536]), torch.bfloat16)
blocks.*.attn*.to_out.*.bias : # (torch.Size([1536]), torch.bfloat16)
blocks.*.norm*.weight : # (torch.Size([1536]), torch.float32)
blocks.*.norm*.bias : # (torch.Size([1536]), torch.float32)
blocks.*.ffn.net.*.proj.weight : # (torch.Size([8960, 1536]), torch.bfloat16)
blocks.*.ffn.net.*.proj.bias : # (torch.Size([8960]), torch.bfloat16)
blocks.*.ffn.net.*.weight : # (torch.Size([1536, 8960]), torch.bfloat16)
blocks.*.ffn.net.*.bias : # (torch.Size([1536]), torch.bfloat16)
proj_out.weight : # (torch.Size([64, 1536]), torch.bfloat16)
proj_out.bias : # (torch.Size([64]), torch.bfloat16)
```

text encoder

```
shared.weight : # (torch.Size([256384, 4096]), torch.bfloat16)
encoder.block.*.layer.*.SelfAttention.q.weight : # (torch.Size([4096, 4096]), torch.bfloat16)
encoder.block.*.layer.*.SelfAttention.k.weight : # (torch.Size([4096, 4096]), torch.bfloat16)
encoder.block.*.layer.*.SelfAttention.v.weight : # (torch.Size([4096, 4096]), torch.bfloat16)
encoder.block.*.layer.*.SelfAttention.o.weight : # (torch.Size([4096, 4096]), torch.bfloat16)
encoder.block.*.layer.*.SelfAttention.relative_attention_bias.weight : # (torch.Size([32, 64]), torch.bfloat16)
encoder.block.*.layer.*.layer_norm.weight : # (torch.Size([4096]), torch.bfloat16)
encoder.block.*.layer.*.DenseReluDense.wi_*.weight : # (torch.Size([10240, 4096]), torch.bfloat16)
encoder.block.*.layer.*.DenseReluDense.wo.weight : # (torch.Size([4096, 10240]), torch.bfloat16)
encoder.final_layer_norm.weight : # (torch.Size([4096]), torch.bfloat16)
```
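The wildcard listings above can be produced by collapsing numeric indices (block/layer numbers) in state-dict keys; a minimal sketch (hypothetical helper, not the repo's actual script):

```python
import re

def summarize_shapes(named_shapes):
    """Collapse numeric indices to '*' and keep one representative
    (shape, dtype) per pattern, mirroring the listings above."""
    patterns = {}
    for name, shape_dtype in named_shapes.items():
        pattern = re.sub(r"\d+", "*", name)
        patterns.setdefault(pattern, shape_dtype)
    return patterns

# Example: two transformer blocks collapse to one wildcard entry.
d = {
    "blocks.0.attn1.to_q.weight": ((1536, 1536), "torch.bfloat16"),
    "blocks.1.attn1.to_q.weight": ((1536, 1536), "torch.bfloat16"),
}
print(summarize_shapes(d))
```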

Add a unit test for the flash attention inconsistency issue

```sh
python compare_fa_sharding_consistency.py
```

```
...
  FINAL CONCLUSION
==============================
...
```

wan_tx_splash_attn.py combines the JAX Pallas splash attention with the maxdiffusion VAE decoder.

On v6e-8:

```
(venv)$ python wan_tx_splash_attn.py
Load and port Wan 2.1 VAE on tpu
Loading checkpoint shards: 100%|██████████| 12/12 [00:01<00:00, 10.11it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  4.60it/s]
Loading pipeline components...: 100%|██████████| 5/5 [00:03<00:00,  1.63it/s]
`loss_type=None` was set in the config but it is unrecognised. Using the default loss: `ForCausalLMLoss`.
Number of devices is:, 8
text_encoder 12.537574768066406 G
transformer 26.66874897480011 G
vae (JAX VAE) - size calculation not implemented
return lax_numpy.astype(self, dtype, copy=copy, device=device)
100%|██████████| 50/50 [08:16<00:00,  9.94s/it]
numpy shape: (720, 1280, 3, 81)
100%|██████████| 50/50 [06:29<00:00,  7.80s/it]
Iteration 0: 418.294946s
DONE
```
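The decoded output is logged as a (height, width, channels, frames) array; for frame-major export the time axis moves to the front. A sketch, assuming the logged layout:

```python
import numpy as np

# Decoded VAE output as logged: (height, width, channels, frames).
frames_last = np.zeros((720, 1280, 3, 81), dtype=np.uint8)

# Frame-major layout expected by most video writers: (frames, H, W, C).
video = np.moveaxis(frames_last, -1, 0)
print(video.shape)  # (81, 720, 1280, 3)
```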

support flash attention

Flash attention is currently supported and generates correct 81-frame videos with the normal 14B model. Flash attention avoids materializing the huge attention weights, which would otherwise cause OOM.

The 1.3B model cannot use flash attention yet, since its kv_head = 12 is not divisible across 8 TPUs. Flash attention is also disabled for the VAE for now, since the VAE has kv_head = 1.
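The divisibility constraint above can be stated as a simple check (hypothetical helper; the head counts are the ones discussed here):

```python
def kv_heads_shardable(num_kv_heads: int, num_devices: int) -> bool:
    """Tensor-parallel flash attention shards KV heads across devices,
    so the head count must divide evenly by the device count."""
    return num_kv_heads % num_devices == 0

print(kv_heads_shardable(12, 8))  # 1.3B model on 8 TPUs -> False
print(kv_heads_shardable(1, 8))   # VAE attention -> False
```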

Modifying the flash attention block size to 2048 brings the runtime to 528s.

multi-host run on v6e-16

  1. Create a TPU VM with v6e-16.
    1. This creates 4 hosts with a 4x4 chip mesh.
    2. All of the commands below use gcloud to distribute to all workers.

Remember to replace the placeholder variables.

```sh
# Set up env.
export PROJECT_ID=<project_id>
export TPU_NAME=<tpu_name>
export ZONE=
export ACCELERATOR_TYPE=v6e-16
export RUNTIME_VERSION=v2-alpha-tpuv6e

export ACCOUNT=
export GITHUB_BRANCH=<branch_name>
export GITHUB_ADDRESS=<github_repo_address>

run() {
  local command=$1
  local worker=${2:-all}
  gcloud compute tpus tpu-vm ssh --zone "${ZONE}" "${ACCOUNT}@${TPU_NAME}" \
    --project "${PROJECT_ID}" --worker=${worker} --command="$command"
}

SETUP_COMMAND="\
set -x && \
sudo apt update && \
sudo apt install -y python3.10-venv && \
python -m venv venv && \
source venv/bin/activate && \
git clone -b ${GITHUB_BRANCH} ${GITHUB_ADDRESS} || true && \
cd diffusers && \
sh -ex setup-dep.sh \
"

# Only needed the first time.
run "${SETUP_COMMAND}"

RUN_COMMAND="\
set -x && \
source ~/venv/bin/activate && \
killall -9 python || true && \
sleep 10 && \
export JAX_COMPILATION_CACHE_DIR=/dev/shm/jax_cache && \
export JAX_PERSISTENT_CACHE_MIN_ENTRY_SIZE_BYTES=-1 && \
export JAX_PERSISTENT_CACHE_MIN_COMPILE_TIME_SECS=0 && \
export JAX_PERSISTENT_CACHE_ENABLE_XLA_CACHES='xla_gpu_per_fusion_autotune_cache_dir' && \
export HF_HUB_CACHE=/dev/shm/hf_cache && \
cd diffusers && \
git fetch && git reset --hard origin/${GITHUB_BRANCH} && \
nohup python wan_tx.py > wan_tx.log 2>&1 & \
"
run "${RUN_COMMAND}"
```

Then ssh into a VM to collect the log in wan_tx.log and the generated video.

Add DP support

v6e-16 needs to use DP in order to divide head_dim=40 evenly.

test using flash attention:
* v6e-8 with dp=2, tp=4: 528s -> 490s
* v6e-16 with dp=2, tp=8: 358s

With wan_tx_splash_attn: DP is not supported on v6e-8 for now; the VAE will OOM.
* v6e-16 with dp=2, tp=8: 257s
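Why a valid config must combine DP and TP can be seen from a mesh sanity check (hypothetical helper, assuming the 40 attention heads noted above):

```python
def validate_mesh(num_devices: int, dp: int, tp: int, num_heads: int) -> None:
    """Check that a (dp, tp) layout covers the whole slice and that
    tensor parallelism divides the attention heads evenly."""
    if dp * tp != num_devices:
        raise ValueError("dp * tp must equal the device count")
    if num_heads % tp != 0:
        raise ValueError("heads must divide evenly across tp")

validate_mesh(16, dp=2, tp=8, num_heads=40)  # ok: 40 % 8 == 0
# validate_mesh(16, dp=1, tp=16, num_heads=40) would fail: 40 % 16 != 0
```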

Add SP support

test using flash attention wan_tx:
* v6e-8 with dp=1, tp=4, sp=2: 519s
* v6e-8 with dp=2, tp=2, sp=2: VAE OOM
* v6e-16 with dp=2, tp=4, sp=2: 319s

test with wan_tx_splash_attn:
* v6e-16 with dp=2, tp=4, sp=2: VAE OOM

Modify maxdiffusion to reduce memory usage

To utilize SP with the maxdiffusion VAE, the peak memory usage needs to be reduced.
The modification is in https://github.com/yuyanpeng-google/maxdiffusion/tree/wan2.1-dev.

with wan_tx_splash_attn.py:
* v6e-8 with dp=2, sp=1, tp=4: 397s
* v6e-16 with dp=2, sp=2, tp=4: 215s


Add VAE sharding

  • v6e-8, dp=2, sp=1, tp=4
    • python wan_tx_splash_attn.py --use_dp --sp_num=1
    • 100%|█████████████| 50/50 [06:06<00:00, 7.32s/it]
    • Iteration 0: 376.032197s
  • v6e-16, dp=2, sp=2, tp=4
    • python wan_tx_splash_attn.py --use_dp --sp_num=2
    • 100%|██████████| 50/50 [03:16<00:00, 3.93s/it]
    • Iteration 0: 205.504582s
  • v6e-32, dp=2, sp=2, tp=8
    • python wan_tx_splash_attn.py --use_dp --sp_num=2
    • 100%|██████████| 50/50 [02:06<00:00, 2.53s/it]
    • Iteration 0: 134.893512s

VAE is consuming about 10s now.

Tune defaults and add bq_size, bkv_size as arguments

  • v6e-16, dp=2, sp=2, tp=4
    • python wan_tx_splash_attn.py --use_dp --sp_num=2 --bq_size 1512 --bkv_size 1024
    • 100%|██████████| 50/50 [03:07<00:00, 3.76s/it]
    • Iteration 0: BKV_SIZE=1024, BQ_SIZE=1512: 196.695673s

Adjust sharding to use FSDP on the sequence and remesh to heads for self-attention, keeping SP on cross-attention

This avoids an all-reduce over the long sequence. The optimal block size may change; it has not been swept yet.
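A minimal JAX sketch of the remesh idea (illustrative only; it emulates devices on CPU, and the axis names and tensor shapes are assumptions, not the repo's actual code):

```python
import os
# Emulate 8 devices on CPU so the sketch runs anywhere.
os.environ.setdefault("XLA_FLAGS", "--xla_force_host_platform_device_count=8")

import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = np.array(jax.devices()[:8]).reshape(2, 4)
mesh = Mesh(devices, axis_names=("dp", "tp"))

# FSDP/SP-style layout: (batch, seq, heads, head_dim), sharded on the sequence.
x = jnp.zeros((2, 1024, 8, 64), dtype=jnp.bfloat16)
seq_sharded = jax.device_put(x, NamedSharding(mesh, P("dp", "tp", None, None)))

# Remesh to head sharding for self-attention: each device then holds the full
# sequence for a subset of heads, avoiding an all-reduce over the long sequence.
head_sharded = jax.device_put(seq_sharded, NamedSharding(mesh, P("dp", None, "tp", None)))
```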

  • v6e-16, dp=2, sp=1, tp=8
    • python wan_tx_splash_attn.py --use_dp --sp_num=1 --bq_size 1512 --bkv_size 1024
    • 100%|██████████| 50/50 [02:55<00:00, 3.50s/it]
    • Iteration 0: BKV_SIZE=1024, BQ_SIZE=1512: 184.074076s

Sweep best block size and use TP again

TP is about 1.5s faster than FSDP on v6e-16. Pass --use_fsdp to use FSDP instead.

  • v6e-16 with the best block size:
    • 100%|██████████| 50/50 [02:42<00:00, 3.24s/it]
    • Iteration 0: BKV_COMPUTE_SIZE=1024, BKV_SIZE=2048, BQ_SIZE=3024: 171.043314s
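A sweep like the one above can be as simple as timing one run per candidate block-size pair (a hypothetical harness; the repo's actual sweep may differ):

```python
import itertools
import time

def sweep_block_sizes(run_once, bq_candidates, bkv_candidates):
    """Time `run_once(bq, bkv)` for each block-size pair and return
    (best_seconds, best_bq, best_bkv)."""
    best = None
    for bq, bkv in itertools.product(bq_candidates, bkv_candidates):
        start = time.perf_counter()
        run_once(bq, bkv)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best[0]:
            best = (elapsed, bq, bkv)
    return best

# Usage (generate_video is a stand-in for one timed generation run):
# sweep_block_sizes(lambda bq, bkv: generate_video(bq, bkv), [1512, 3024], [1024, 2048])
```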

Fix incorrect sharding of the attention value

  • python wan_tx_splash_attn.py --use_k_smooth=False
  • v6e-16
    • 100%|██████████| 50/50 [02:13<00:00, 2.68s/it]
    • Iteration 0: BKV_COMPUTE_SIZE=1024, BKV_SIZE=2048, BQ_SIZE=3024: 143.149646s

Owner

  • Login: shungcp
  • Kind: user
  • Location: Beijing
  • Company: Google Cloud

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'Diffusers: State-of-the-art diffusion models'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Patrick
    family-names: von Platen
  - given-names: Suraj
    family-names: Patil
  - given-names: Anton
    family-names: Lozhkov
  - given-names: Pedro
    family-names: Cuenca
  - given-names: Nathan
    family-names: Lambert
  - given-names: Kashif
    family-names: Rasul
  - given-names: Mishig
    family-names: Davaadorj
  - given-names: Dhruv
    family-names: Nair
  - given-names: Sayak
    family-names: Paul
  - given-names: Steven
    family-names: Liu
  - given-names: William
    family-names: Berman
  - given-names: Yiyi
    family-names: Xu
  - given-names: Thomas
    family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
abstract: >-
  Diffusers provides pretrained diffusion models across
  multiple modalities, such as vision and audio, and serves
  as a modular toolbox for inference and training of
  diffusion models.
keywords:
  - deep-learning
  - pytorch
  - image-generation
  - hacktoberfest
  - diffusion
  - text2image
  - image2image
  - score-based-generative-modeling
  - stable-diffusion
  - stable-diffusion-diffusers
license: Apache-2.0
version: 0.12.1

GitHub Events

Total
  • Member event: 2
  • Issue comment event: 1
  • Push event: 33
  • Pull request event: 31
  • Fork event: 4
  • Create event: 2
Last Year
  • Member event: 2
  • Issue comment event: 1
  • Push event: 33
  • Pull request event: 31
  • Fork event: 4
  • Create event: 2

Dependencies

.github/actions/setup-miniconda/action.yml actions
  • actions/cache v2 composite
.github/workflows/benchmark.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v4 composite
.github/workflows/build_docker_images.yml actions
  • actions/checkout v3 composite
  • docker/build-push-action v3 composite
  • docker/login-action v2 composite
  • docker/setup-buildx-action v1 composite
  • huggingface/hf-workflows/.github/actions/post-slack main composite
  • jitterbit/get-changed-files v1 composite
.github/workflows/build_documentation.yml actions
.github/workflows/build_pr_documentation.yml actions
.github/workflows/mirror_community_pipeline.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/nightly_tests.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v4 composite
.github/workflows/notify_slack_about_release.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/pr_dependency_test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/pr_flax_dependency_test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/pr_style_bot.yml actions
.github/workflows/pr_test_fetcher.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v3 composite
  • actions/upload-artifact v4 composite
.github/workflows/pr_tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v4 composite
.github/workflows/pr_tests_gpu.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v4 composite
.github/workflows/pr_torch_dependency_test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/push_tests.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v4 composite
.github/workflows/push_tests_fast.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v4 composite
.github/workflows/push_tests_mps.yml actions
  • ./.github/actions/setup-miniconda * composite
  • actions/checkout v3 composite
  • actions/upload-artifact v4 composite
.github/workflows/pypi_publish.yaml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/release_tests_fast.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v4 composite
.github/workflows/run_tests_from_a_pr.yml actions
  • actions/checkout v4 composite
.github/workflows/ssh-pr-runner.yml actions
  • actions/checkout v3 composite
  • huggingface/tailscale-action main composite
.github/workflows/ssh-runner.yml actions
  • actions/checkout v3 composite
  • huggingface/tailscale-action main composite
.github/workflows/stale.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
.github/workflows/trufflehog.yml actions
  • actions/checkout v4 composite
  • trufflesecurity/trufflehog main composite
.github/workflows/typos.yml actions
  • actions/checkout v3 composite
  • crate-ci/typos v1.12.4 composite
.github/workflows/update_metadata.yml actions
  • actions/checkout v3 composite
.github/workflows/upload_pr_documentation.yml actions
docker/diffusers-doc-builder/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-flax-cpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-flax-tpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-onnxruntime-cpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-onnxruntime-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
docker/diffusers-pytorch-compile-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
docker/diffusers-pytorch-cpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-pytorch-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
docker/diffusers-pytorch-minimum-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
docker/diffusers-pytorch-xformers-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
examples/advanced_diffusion_training/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.31.0
  • ftfy *
  • peft >=0.11.1
  • sentencepiece *
  • tensorboard *
  • torchvision *
  • transformers >=4.41.2
examples/advanced_diffusion_training/requirements_flux.txt pypi
  • Jinja2 *
  • accelerate >=0.31.0
  • ftfy *
  • peft >=0.11.1
  • sentencepiece *
  • tensorboard *
  • torchvision *
  • transformers >=4.41.2
examples/cogvideo/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.31.0
  • decord >=0.6.0
  • ftfy *
  • imageio-ffmpeg *
  • peft >=0.11.1
  • sentencepiece *
  • tensorboard *
  • torchvision *
  • transformers >=4.41.2
examples/cogview4-control/requirements.txt pypi
  • accelerate ==1.2.0
  • peft >=0.14.0
  • torch *
  • torchvision *
  • transformers ==4.47.0
  • wandb *
examples/consistency_distillation/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • webdataset *
examples/controlnet/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/controlnet/requirements_flax.txt pypi
  • Jinja2 *
  • datasets *
  • flax *
  • ftfy *
  • optax *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers >=4.25.1
examples/controlnet/requirements_flux.txt pypi
  • Jinja2 *
  • SentencePiece *
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • wandb *
examples/controlnet/requirements_sd3.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • wandb *
examples/controlnet/requirements_sdxl.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • wandb *
examples/custom_diffusion/requirements.txt pypi
  • Jinja2 *
  • accelerate *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/dreambooth/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • peft ==0.7.0
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/dreambooth/requirements_flax.txt pypi
  • Jinja2 *
  • flax *
  • ftfy *
  • optax *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers >=4.25.1
examples/dreambooth/requirements_flux.txt pypi
  • Jinja2 *
  • accelerate >=0.31.0
  • ftfy *
  • peft >=0.11.1
  • sentencepiece *
  • tensorboard *
  • torchvision *
  • transformers >=4.41.2
examples/dreambooth/requirements_sana.txt pypi
  • Jinja2 *
  • accelerate >=1.0.0
  • ftfy *
  • peft >=0.14.0
  • sentencepiece *
  • tensorboard *
  • torchvision *
  • transformers >=4.47.0
examples/dreambooth/requirements_sd3.txt pypi
  • Jinja2 *
  • accelerate >=0.31.0
  • ftfy *
  • peft ==0.11.1
  • sentencepiece *
  • tensorboard *
  • torchvision *
  • transformers >=4.41.2
examples/dreambooth/requirements_sdxl.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • peft ==0.7.0
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/flux-control/requirements.txt pypi
  • accelerate ==1.2.0
  • peft >=0.14.0
  • torch *
  • torchvision *
  • transformers ==4.47.0
  • wandb *
examples/instruct_pix2pix/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/kandinsky2_2/text_to_image/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/model_search/requirements.txt pypi
  • huggingface-hub >=0.26.2
examples/research_projects/autoencoderkl/requirements.txt pypi
  • Pillow *
  • accelerate >=0.16.0
  • bitsandbytes *
  • datasets *
  • huggingface_hub *
  • lpips *
  • numpy *
  • packaging *
  • taming_transformers *
  • torch *
  • torchvision *
  • tqdm *
  • transformers *
  • wandb *
  • xformers *
examples/research_projects/colossalai/requirement.txt pypi
  • Jinja2 *
  • diffusers *
  • ftfy *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers *
examples/research_projects/consistency_training/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/diffusion_dpo/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • peft *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • wandb *
examples/research_projects/diffusion_orpo/requirements.txt pypi
  • accelerate *
  • datasets *
  • peft *
  • torchvision *
  • transformers *
  • wandb *
  • webdataset *
examples/research_projects/dreambooth_inpaint/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • diffusers ==0.9.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.21.0
examples/research_projects/gligen/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • diffusers *
  • fairscale *
  • ftfy *
  • scipy *
  • tensorboard *
  • timm *
  • torchvision *
  • transformers >=4.25.1
  • wandb *
examples/research_projects/intel_opts/textual_inversion/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • intel_extension_for_pytorch >=1.13
  • tensorboard *
  • torchvision *
  • transformers >=4.21.0
examples/research_projects/intel_opts/textual_inversion_dfq/requirements.txt pypi
  • accelerate *
  • ftfy *
  • modelcards *
  • neural-compressor *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.0
examples/research_projects/ip_adapter/requirements.txt pypi
  • accelerate *
  • ip_adapter *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/lora/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/multi_subject_dreambooth/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/multi_subject_dreambooth_inpainting/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets >=2.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • wandb >=0.16.1
examples/research_projects/multi_token_textual_inversion/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/multi_token_textual_inversion/requirements_flax.txt pypi
  • Jinja2 *
  • flax *
  • ftfy *
  • optax *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/onnxruntime/text_to_image/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • modelcards *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/onnxruntime/textual_inversion/requirements.txt pypi
  • accelerate >=0.16.0
  • ftfy *
  • modelcards *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/onnxruntime/unconditional_image_generation/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • tensorboard *
  • torchvision *
examples/research_projects/pixart/requirements.txt pypi
  • SentencePiece *
  • controlnet-aux *
  • datasets *
  • torchvision *
  • transformers *
examples/research_projects/pytorch_xla/training/text_to_image/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets >=2.19.1
  • ftfy *
  • peft ==0.7.0
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/research_projects/realfill/requirements.txt pypi
  • Jinja2 ==3.1.6
  • accelerate ==0.23.0
  • diffusers ==0.20.1
  • ftfy ==6.1.1
  • peft ==0.5.0
  • tensorboard ==2.14.0
  • torch ==2.2.0
  • torchvision >=0.16
  • transformers ==4.38.0
examples/research_projects/wuerstchen/text_to_image/requirements.txt pypi
  • accelerate >=0.16.0
  • bitsandbytes *
  • deepspeed *
  • peft >=0.6.0
  • torchvision *
  • transformers >=4.25.1
  • wandb *
examples/server/requirements.in pypi
  • aiohttp *
  • fastapi *
  • prometheus-fastapi-instrumentator >=7.0.0
  • prometheus_client >=0.18.0
  • py-consul *
  • sentencepiece *
  • torch *
  • transformers ==4.46.1
  • uvicorn *
examples/server/requirements.txt pypi
  • aiohappyeyeballs ==2.4.3
  • aiohttp ==3.10.10
  • aiosignal ==1.3.1
  • annotated-types ==0.7.0
  • anyio ==4.6.2.post1
  • attrs ==24.2.0
  • certifi ==2024.8.30
  • charset-normalizer ==3.4.0
  • click ==8.1.7
  • fastapi ==0.115.3
  • filelock ==3.16.1
  • frozenlist ==1.5.0
  • fsspec ==2024.10.0
  • h11 ==0.14.0
  • huggingface-hub ==0.26.1
  • idna ==3.10
  • jinja2 ==3.1.4
  • markupsafe ==3.0.2
  • mpmath ==1.3.0
  • multidict ==6.1.0
  • networkx ==3.4.2
  • numpy ==2.1.2
  • packaging ==24.1
  • prometheus-client ==0.21.0
  • prometheus-fastapi-instrumentator ==7.0.0
  • propcache ==0.2.0
  • py-consul ==1.5.3
  • pydantic ==2.9.2
  • pydantic-core ==2.23.4
  • pyyaml ==6.0.2
  • regex ==2024.9.11
  • requests ==2.32.3
  • safetensors ==0.4.5
  • sentencepiece ==0.2.0
  • sniffio ==1.3.1
  • starlette ==0.41.0
  • sympy ==1.13.3
  • tokenizers ==0.20.1
  • torch ==2.4.1
  • tqdm ==4.66.5
  • transformers ==4.46.1
  • typing-extensions ==4.12.2
  • urllib3 ==2.2.3
  • uvicorn ==0.32.0
  • yarl ==1.16.0
examples/t2i_adapter/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • safetensors *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
  • wandb *
examples/text_to_image/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets >=2.19.1
  • ftfy *
  • peft ==0.7.0
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/text_to_image/requirements_flax.txt pypi
  • Jinja2 *
  • datasets *
  • flax *
  • ftfy *
  • optax *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers >=4.25.1
examples/text_to_image/requirements_sdxl.txt pypi
  • Jinja2 *
  • accelerate >=0.22.0
  • datasets *
  • ftfy *
  • peft ==0.7.0
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/textual_inversion/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
examples/textual_inversion/requirements_flax.txt pypi
  • Jinja2 *
  • flax *
  • ftfy *
  • optax *
  • tensorboard *
  • torch *
  • torchvision *
  • transformers >=4.25.1
examples/unconditional_image_generation/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • torchvision *
examples/vqgan/requirements.txt pypi
  • accelerate >=0.16.0
  • datasets *
  • numpy *
  • tensorboard *
  • timm *
  • torchvision *
  • tqdm *
  • transformers >=4.25.1
pyproject.toml pypi
setup.py pypi
  • deps *