break-for-make

https://github.com/ictmcg/break-for-make

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓
CITATION.cff file
Found CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 1 DOI reference(s) in README
✓
Academic publication links
Links to: arxiv.org
○
Academic email domains
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (13.2%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: ICTMCG
License: apache-2.0
Language: Python
Default Branch: main
Size: 18.6 MB

Statistics

Stars: 3
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created about 1 year ago · Last pushed 12 months ago

Metadata Files

Readme Contributing License Code of conduct Citation

B4M: Breaking Low-Rank Adapter for Making Content-Style Customization [ACM TOG 2025]

Yu Xu¹,², Fan Tang¹, Juan Cao¹, Yuxin Zhang³, Oliver Deussen⁴, Weiming Dong³, Jintao Li¹, Tong-Yee Lee⁵
¹Institute of Computing Technology, Chinese Academy of Sciences, ²University of Chinese Academy of Sciences, ³Institute of Automation, Chinese Academy of Sciences, ⁴University of Konstanz, ⁵National Cheng Kung University

Abstract:
Personalized generation paradigms empower designers to customize visual intellectual properties with the help of textual descriptions by adapting pre-trained text-to-image models on a few images. Recent studies focus on simultaneously customizing content and detailed visual style in images but often struggle with entangling the two. In this study, we reconsider the customization of content and style concepts from the perspective of parameter space construction. Unlike existing methods that utilize a shared parameter space for content and style learning, we propose a novel framework that separates the parameter space to facilitate individual learning of content and style by introducing "partly learnable projection" (PLP) matrices to separate the original adapters into divided sub-parameter spaces. A "break-for-make" customization learning pipeline based on PLP is proposed: we first "break" the original adapters into "up projection" and "down projection" for content and style concepts under an orthogonal prior and then "make" the entity parameter space by reconstructing the content and style PLP matrices, using Riemannian precondition to adaptively balance content and style learning. Experiments on various styles, including textures, materials, and artistic style, show that our method outperforms state-of-the-art single/multiple concept learning pipelines regarding content-style-prompt alignment.
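To make the "Riemannian precondition" idea in the abstract more concrete, here is a minimal NumPy sketch of one preconditioned gradient step on a generic low-rank pair. It is not taken from the B4M code and does not model the PLP matrices. The factorization `delta_W = B @ A`, the function name `preconditioned_step`, and the `damping` parameter are assumptions made for illustration. The sketch follows the commonly used form in which each factor's gradient is rescaled by the inverse Gram matrix of the other factor.

```python
import numpy as np


def preconditioned_step(A, B, grad_A, grad_B, lr=0.1, damping=1e-6):
    """One preconditioned update for a low-rank pair with delta_W = B @ A.

    A has shape (r, k) and B has shape (d, r). This is a generic sketch,
    not the update rule used in the B4M repository.
    """
    r = A.shape[0]
    eye = damping * np.eye(r)
    # Rescale grad_A by (B^T B + damping * I)^{-1} from the left.
    scaled_grad_A = np.linalg.solve(B.T @ B + eye, grad_A)
    # Rescale grad_B by (A A^T + damping * I)^{-1} from the right.
    # The Gram matrix is symmetric, so we solve on the transpose.
    scaled_grad_B = np.linalg.solve(A @ A.T + eye, grad_B.T).T
    return A - lr * scaled_grad_A, B - lr * scaled_grad_B


def reconstruction_gradients(A, B, target):
    """Gradients of 0.5 * ||B @ A - target||_F^2 with respect to A and B."""
    residual = B @ A - target
    return B.T @ residual, residual @ A.T
```

A typical loop would call `reconstruction_gradients` (or any other loss gradient) and pass the result to `preconditioned_step`. The rescaling makes the step size less sensitive to the relative scale of the two factors, which is one reason such preconditioning is used to balance jointly trained adapters.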

More of our results

Environment Dependencies

Our code is built on Hugging Face Diffusers (0.22.0). Please follow the sdxl instructions for environment setup.

Install B4M

First clone this repo, and then run:

```bash
cd B4M
pip install -e .
```

Training Instructions

The training process consists of two stages.

Stage One: Train Content and Style Separately

Train the content model
Run the following script:

```bash
bash code/train_content.sh
```

Train the style model
Run the following script:

```bash
bash code/train_style.sh
```

Note: In both scripts, please replace any dataset paths, output directories, and other file paths with your own.

Stage Two: Joint Training with Riemannian Precondition

After completing the first stage, run the following script to start the second-stage training:

```bash
bash code/train_second_stage.sh
```

As before, make sure to update the paths in the script to fit your environment.
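The order of the two stages can be sketched as a small Python driver. This is a minimal sketch and not part of the repository: `STAGES` and `run_training` are hypothetical names, and it assumes the three scripts are launched from the repository root after their paths have been edited.

```python
import subprocess

# Script paths as given in this README; the driver itself is illustrative.
STAGES = [
    ("stage one: content", "code/train_content.sh"),
    ("stage one: style", "code/train_style.sh"),
    ("stage two: joint training", "code/train_second_stage.sh"),
]


def run_training(dry_run=True):
    """Run the training scripts in order and return the commands used.

    With dry_run=True nothing is executed; the commands are only printed.
    """
    commands = []
    for label, script in STAGES:
        command = ["bash", script]
        commands.append(command)
        print(f"[{label}] {' '.join(command)}")
        if not dry_run:
            # check=True aborts on failure, so stage two never starts
            # from incomplete stage-one outputs.
            subprocess.run(command, check=True)
    return commands
```

Calling `run_training()` prints the planned commands without running them; `run_training(dry_run=False)` executes them one after another.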

Inference

After training is complete, you can run inference using:

```bash
python infer.py
```

Make sure to configure the model path and input settings inside the script as needed.

LoRA Corresponding to Examples Shown in the Paper

We provide LoRA checkpoints that correspond to the examples presented in the paper. You can download them here:

Download LoRA Checkpoints from Google Drive

The corresponding reference images, prompts, and checkpoints are as follows:

| Content Reference | Style Reference | Prompt | LoRA Checkpoint |
|-------------------|-----------------|--------|-----------------|
| teddybear.jpg | paper.jpg | "an image of snq teddybear made from paper cutout art style" | teddybearpaper |
| teddybear.jpg | yarn.jpg | "an image of snq teddybear in w@z yarn art style" | teddybearyarn |
| dog1.jpg | sticker.jpg | "a snq dog in w@z sticker style" | dog1_sticker |
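For scripted experiments it can be convenient to keep these rows in a lookup structure. The sketch below copies the values from the table; the dictionary layout and the helper names `prompt_for` and `checkpoints_for_content` are assumptions for illustration, and they are not read by `infer.py`.

```python
# Rows copied from the table above; the layout is illustrative only.
PAPER_EXAMPLES = {
    "teddybearpaper": {
        "content": "teddybear.jpg",
        "style": "paper.jpg",
        "prompt": "an image of snq teddybear made from paper cutout art style",
    },
    "teddybearyarn": {
        "content": "teddybear.jpg",
        "style": "yarn.jpg",
        "prompt": "an image of snq teddybear in w@z yarn art style",
    },
    "dog1_sticker": {
        "content": "dog1.jpg",
        "style": "sticker.jpg",
        "prompt": "a snq dog in w@z sticker style",
    },
}


def prompt_for(checkpoint):
    """Return the prompt listed for a given LoRA checkpoint name."""
    return PAPER_EXAMPLES[checkpoint]["prompt"]


def checkpoints_for_content(content_reference):
    """List checkpoint names that share a content reference image."""
    return [
        name
        for name, row in PAPER_EXAMPLES.items()
        if row["content"] == content_reference
    ]
```

The prompt string returned by `prompt_for` can then be pasted into the input settings of the inference script.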

We are continuously uploading additional model checkpoints. Please stay tuned.

Citation

If you make use of our work, please cite our paper:

@article{xu2025b4m,
  title={B4M: Breaking Low-Rank Adapter for Making Content-Style Customization},
  author={Xu, Yu and Tang, Fan and Cao, Juan and Zhang, Yuxin and Deussen, Oliver and Dong, Weiming and Li, Jintao and Lee, Tong-Yee},
  journal={ACM Transactions on Graphics},
  volume={44},
  number={2},
  pages={1--17},
  year={2025},
  publisher={ACM New York, NY},
  doi={10.1145/3728461}
}

Owner

Name: Media Synthesis and Forensics Lab
Login: ICTMCG
Kind: organization
Location: Beijing, China

Repositories: 23
Profile: https://github.com/ICTMCG

Media Synthesis and Forensics Lab, Institute of Computing Technology, Chinese Academy of Sciences. Our official account on WeChat: ICTMCG.

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'Diffusers: State-of-the-art diffusion models'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Patrick
    family-names: von Platen
  - given-names: Suraj
    family-names: Patil
  - given-names: Anton
    family-names: Lozhkov
  - given-names: Pedro
    family-names: Cuenca
  - given-names: Nathan
    family-names: Lambert
  - given-names: Kashif
    family-names: Rasul
  - given-names: Mishig
    family-names: Davaadorj
  - given-names: Thomas
    family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
abstract: >-
  Diffusers provides pretrained diffusion models across
  multiple modalities, such as vision and audio, and serves
  as a modular toolbox for inference and training of
  diffusion models.
keywords:
  - deep-learning
  - pytorch
  - image-generation
  - diffusion
  - text2image
  - image2image
  - score-based-generative-modeling
  - stable-diffusion
license: Apache-2.0
version: 0.12.1
