121-implicit-style-content-separation-using-b-lora
https://github.com/szu-advtech-2024/121-implicit-style-content-separation-using-b-lora
Science Score: 41.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: SZU-AdvTech-2024
- Default Branch: main
- Size: 0 Bytes
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Citation
https://github.com/SZU-AdvTech-2024/121-Implicit-Style-Content-Separation-using-B-LoRA/blob/main/
# Implicit Style-Content Separation using B-LoRA

[arXiv](https://arxiv.org/abs/2403.14572) [Colab](https://colab.research.google.com/github/yardenfren1996/B-LoRA/blob/main/B_LoRA_inference.ipynb) [Hugging Face Space](https://huggingface.co/spaces/Yardenfren/B-LoRA)

This repository contains the official implementation of the B-LoRA method, which enables implicit style-content separation of a single input image for various image stylization tasks. B-LoRA combines Stable Diffusion XL (SDXL) with Low-Rank Adaptation (LoRA) to disentangle the style and content components of an image, enabling applications such as image style transfer, text-based image stylization, and consistent style generation.

## 21.5.2024: Important Update

Issues with newer versions of diffusers and PEFT caused the fine-tuning process to converge more slowly than desired. In the meantime, we have uploaded the original training script that we used in the paper. Please note that we used an earlier version of diffusers (0.25.0) and did not use PEFT.

## Getting Started

### Prerequisites

- Python 3.11.6+
- PyTorch 2.1.1+
- Other dependencies (specified in `requirements.txt`)

### Installation

1. Clone this repository:

   ```
   git clone https://github.com/yardenfren1996/B-LoRA.git
   cd B-LoRA
   ```

2. Install the required dependencies:

   ```
   pip install -r requirements.txt
   ```

   (For Windows 10, see [here](https://github.com/yardenfren1996/B-LoRA/issues/6).)

### Usage

1. **Training B-LoRAs**

   To train the B-LoRAs for a given input image, run:

   ```
   accelerate launch train_dreambooth_b-lora_sdxl.py \
     --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
     --instance_data_dir=" " \
     --output_dir=" " \
     --instance_prompt=" " \
     --resolution=1024 \
     --rank=64 \
     --train_batch_size=1 \
     --learning_rate=5e-5 \
     --lr_scheduler="constant" \
     --lr_warmup_steps=0 \
     --max_train_steps=1000 \
     --checkpointing_steps=500 \
     --seed="0" \
     --gradient_checkpointing \
     --use_8bit_adam \
     --mixed_precision="fp16"
   ```

   This will optimize the B-LoRA weights for the content and style and store them in `output_dir`. You need to supply your own values for `instance_data_dir`, `output_dir`, and `instance_prompt` (in our paper we use `A [v]` as the instance prompt).

2. **Inference**

   For image stylization based on a reference style image (1), run:

   ```
   python inference.py --prompt="A in style" --content_B_LoRA="" --style_B_LoRA=" " --output_path=" "
   ```

   This will generate new images with the content of the first B-LoRA and the style of the second B-LoRA. Note that you need to replace `c` and `s` in the prompt according to the optimization prompt.

   For text-based image stylization (2), run:

   ```
   python inference.py --prompt="A made of gold" --content_B_LoRA=" " --output_path=" "
   ```

   This will generate new images with the content of the given B-LoRA and the style specified by the text prompt.

   For consistent style generation (3), run:

   ```
   python inference.py --prompt="A backpack in style" --style_B_LoRA="" --output_path=" "
   ```

   This will generate new images with the specified content and the style of the given B-LoRA.

   Additional parameters that you can pass to `inference.py` include:

   1. `--content_alpha`, `--style_alpha` for controlling the strength of the adapters.
   2. `--num_images_per_prompt` for specifying the number of output images.
(For A1111 and ComfyUI, see this [issue](https://github.com/yardenfren1996/B-LoRA/issues/7).)

## Citation

If you use B-LoRA in your research, please cite the following paper:

```bibtex
@misc{frenkel2024implicit,
  title={Implicit Style-Content Separation using B-LoRA},
  author={Yarden Frenkel and Yael Vinker and Ariel Shamir and Daniel Cohen-Or},
  year={2024},
  eprint={2403.14572},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```

## License

This project is licensed under the [MIT License](LICENSE).

## Contact

If you have any questions or suggestions, please feel free to open an issue or contact the authors at [yardenfren@gmail.com](mailto:yardenfren@gmail.com).
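
## Scripting the inference commands

The three inference modes described under Usage differ only in which B-LoRA flags are passed to `inference.py`. The sketch below is not part of the repository; it is a minimal helper, assuming `inference.py` accepts the flags exactly as listed in the README (`--prompt`, `--content_B_LoRA`, `--style_B_LoRA`, `--output_path`, `--content_alpha`, `--style_alpha`, `--num_images_per_prompt`) and that the alpha flags take numeric values. The function name `build_inference_command`, the paths, and the `[s]` token in the prompt are hypothetical placeholders; use your own paths and the token from your training prompt.

```python
import shlex
from typing import List, Optional


def build_inference_command(
    prompt: str,
    output_path: str,
    content_b_lora: Optional[str] = None,
    style_b_lora: Optional[str] = None,
    content_alpha: Optional[float] = None,
    style_alpha: Optional[float] = None,
    num_images_per_prompt: Optional[int] = None,
) -> List[str]:
    """Assemble an argv list for inference.py from the README's flags.

    Assumption of this sketch: at least one B-LoRA is supplied, since every
    mode shown in the README passes a content B-LoRA, a style B-LoRA, or both.
    """
    if content_b_lora is None and style_b_lora is None:
        raise ValueError("pass content_b_lora, style_b_lora, or both")

    argv = ["python", "inference.py", f"--prompt={prompt}"]
    if content_b_lora is not None:
        argv.append(f"--content_B_LoRA={content_b_lora}")
    if style_b_lora is not None:
        argv.append(f"--style_B_LoRA={style_b_lora}")
    argv.append(f"--output_path={output_path}")

    # Optional flags are only emitted when explicitly set, so the script's
    # own defaults apply otherwise.
    if content_alpha is not None:
        argv.append(f"--content_alpha={content_alpha}")
    if style_alpha is not None:
        argv.append(f"--style_alpha={style_alpha}")
    if num_images_per_prompt is not None:
        argv.append(f"--num_images_per_prompt={num_images_per_prompt}")
    return argv


if __name__ == "__main__":
    # Consistent style generation (mode 3): only a style B-LoRA is given.
    # All paths and the "[s]" token are hypothetical placeholders.
    cmd = build_inference_command(
        prompt="A backpack in [s] style",
        output_path="outputs/backpack",
        style_b_lora="loras/style_b_lora",
        num_images_per_prompt=2,
    )
    # Print a shell-safe command line; pass `cmd` to subprocess.run to execute.
    print(shlex.join(cmd))
```

Building an argv list rather than a single shell string avoids quoting mistakes in prompts that contain spaces or quotes; `shlex.join` is used only to display the command.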
Owner
- Name: SZU-AdvTech-2024
- Login: SZU-AdvTech-2024
- Kind: organization
- Repositories: 1
- Profile: https://github.com/SZU-AdvTech-2024
Citation (citation.txt)
@inproceedings{REPO121,
author = "Frenkel, Yarden and Vinker, Yael and Shamir, Ariel and Cohen-Or, Daniel",
booktitle = "European Conference on Computer Vision",
organization = "Springer",
pages = "181--198",
title = "{Implicit style-content separation using b-lora}",
year = "2025"
}
GitHub Events
Total
- Push event: 2
- Create event: 3
Last Year
- Push event: 2
- Create event: 3