Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.2%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: ICTMCG
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 18.6 MB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 11 months ago · Last pushed 8 months ago
Metadata Files
  • Readme
  • Contributing
  • License
  • Code of conduct
  • Citation

README.md

B4M: Breaking Low-Rank Adapter for Making Content-Style Customization [ACM TOG 2025]

Yu Xu¹,², Fan Tang¹, Juan Cao¹, Yuxin Zhang³, Oliver Deussen⁴, Weiming Dong³, Jintao Li¹, Tong-Yee Lee⁵
¹Institute of Computing Technology, Chinese Academy of Sciences; ²University of Chinese Academy of Sciences; ³Institute of Automation, Chinese Academy of Sciences; ⁴University of Konstanz; ⁵National Cheng Kung University

Abstract:
Personalized generation paradigms empower designers to customize visual intellectual properties with the help of textual descriptions by adapting pre-trained text-to-image models on a few images. Recent studies focus on simultaneously customizing content and detailed visual style in images but often struggle with entangling the two. In this study, we reconsider the customization of content and style concepts from the perspective of parameter space construction. Unlike existing methods that utilize a shared parameter space for content and style learning, we propose a novel framework that separates the parameter space to facilitate individual learning of content and style by introducing "partly learnable projection" (PLP) matrices to separate the original adapters into divided sub-parameter spaces. A "break-for-make" customization learning pipeline based on PLP is proposed: we first "break" the original adapters into "up projection" and "down projection" for content and style concepts under an orthogonal prior and then "make" the entity parameter space by reconstructing the content and style PLP matrices, using Riemannian precondition to adaptively balance content and style learning. Experiments on various styles, including textures, materials, and artistic style, show that our method outperforms state-of-the-art single/multiple concept learning pipelines regarding content-style-prompt alignment.
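For readers unfamiliar with the term, Riemannian preconditioning of a low-rank adapter is commonly written as a gradient step on each factor, rescaled by the Gram matrix of the other factor. The block below is a generic sketch of that idea and is not taken from the paper; the symbols B, A, W_0, η, δ, and the loss 𝓛 are notation assumed here for illustration only.

```latex
% Generic low-rank adapter (assumed notation): W = W_0 + B A,
% with B \in \mathbb{R}^{m \times r} and A \in \mathbb{R}^{r \times n}.
\begin{aligned}
B &\leftarrow B - \eta \, \nabla_B \mathcal{L} \, \left(A A^{\top} + \delta I_r\right)^{-1}, \\
A &\leftarrow A - \eta \, \left(B^{\top} B + \delta I_r\right)^{-1} \nabla_A \mathcal{L}.
\end{aligned}
```

Here η is the learning rate and δ > 0 is a small damping term that keeps the r × r inverses well defined. How B4M applies preconditioning to its content and style PLP matrices is described in the paper.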

More of our results

Environment Dependencies

Our code is built on Hugging Face Diffusers (0.22.0). Please follow the sdxl instructions for environment setup.

Install B4M

First clone this repo, and then run:

```bash
cd B4M
pip install -e .
```
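After installation, a quick check like the one below can confirm which Diffusers version is active in the environment. This is a minimal sketch using only the standard library. It assumes the package is registered under the distribution name `diffusers`, and the helper names `installed_version` and `matches_expected` are hypothetical, not part of this repository.

```python
from importlib import metadata

EXPECTED_DIFFUSERS = "0.22.0"  # version named in this README


def installed_version(package="diffusers"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_expected(found, expected=EXPECTED_DIFFUSERS):
    """True for an exact match or a suffixed build such as '0.22.0.dev0'."""
    if found is None:
        return False
    base = found.split("+")[0]  # drop any local build tag, e.g. '+cu118'
    return base == expected or base.startswith(expected + ".")


if __name__ == "__main__":
    found = installed_version()
    status = "OK" if matches_expected(found) else "differs from the README"
    print(f"diffusers: {found} ({status})")
```

A mismatch is not necessarily an error, but it is worth checking before debugging training failures.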

Training Instructions

The training process has two stages.

Stage One: Train Content and Style Separately

  • Train the content model
    Run the following script:

```bash
bash code/train_content.sh
```

  • Train the style model
    Run the following script:

```bash
bash code/train_style.sh
```

Note: In both scripts, please replace any dataset paths, output directories, and other file paths with your own.

Stage Two: Joint Training with Riemannian Precondition

After completing the first stage, run the following script to start the second-stage training:

```bash
bash code/train_second_stage.sh
```

As before, make sure to update the paths in the script to fit your environment.
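The two stages can also be chained from a small driver script. The sketch below is not part of this repository: the `STAGES` list, `plan`, and `run` are hypothetical names, and only the three script paths come from this README. It defaults to a dry run that prints the commands without executing them.

```python
import subprocess
from pathlib import Path

# Script paths as given in this README. The content and style runs are
# independent of each other, but both must finish before the second stage.
STAGES = [
    ("content", ["bash", "code/train_content.sh"]),
    ("style", ["bash", "code/train_style.sh"]),
    ("second_stage", ["bash", "code/train_second_stage.sh"]),
]


def plan(start_at="content"):
    """Return the commands to run, beginning at the named stage."""
    names = [name for name, _ in STAGES]
    if start_at not in names:
        raise ValueError(f"unknown stage {start_at!r}; expected one of {names}")
    return [cmd for _, cmd in STAGES[names.index(start_at):]]


def run(repo_root=".", start_at="content", dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in plan(start_at):
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a stage exits non-zero,
            # so later stages never start after a failed one.
            subprocess.run(cmd, cwd=Path(repo_root), check=True)


if __name__ == "__main__":
    run(dry_run=True)  # switch to dry_run=False once the script paths are edited
```

Passing `start_at="second_stage"` resumes from joint training when the first-stage outputs already exist.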

Inference

After training is complete, you can run inference using:

```bash
python infer.py
```

Make sure to configure the model path and input settings inside the script as needed.

LoRA Corresponding to Examples Shown in the Paper

We provide LoRA checkpoints that correspond to the examples presented in the paper. You can download them here:

Download LoRA Checkpoints from Google Drive

The corresponding reference images, prompts, and checkpoints are as follows:

| Content Reference | Style Reference | Prompt | LoRA Checkpoint |
|-------------------|-----------------|--------|-----------------|
| teddybear.jpg | paper.jpg | "an image of snq teddybear made from paper cutout art style" | teddybearpaper |
| teddybear.jpg | yarn.jpg | "an image of snq teddybear in w@z yarn art style" | teddybearyarn |
| dog1.jpg | sticker.jpg | "a snq dog in w@z sticker style" | dog1_sticker |
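For scripting, the rows of the table can be kept as plain data and checked for their placeholder tokens. This is a minimal sketch: the rows are copied from the table, while the reading of `snq` as the content placeholder and `w@z` as the style placeholder is an assumption inferred from the prompts. The helper names `find_example` and `placeholder_report` are hypothetical.

```python
# Rows copied from the table in this README.
EXAMPLES = [
    {"content": "teddybear.jpg", "style": "paper.jpg",
     "prompt": "an image of snq teddybear made from paper cutout art style",
     "checkpoint": "teddybearpaper"},
    {"content": "teddybear.jpg", "style": "yarn.jpg",
     "prompt": "an image of snq teddybear in w@z yarn art style",
     "checkpoint": "teddybearyarn"},
    {"content": "dog1.jpg", "style": "sticker.jpg",
     "prompt": "a snq dog in w@z sticker style",
     "checkpoint": "dog1_sticker"},
]

CONTENT_TOKEN = "snq"  # assumption: placeholder for the learned content concept
STYLE_TOKEN = "w@z"    # assumption: placeholder for the learned style concept


def find_example(checkpoint):
    """Look up a table row by its checkpoint name."""
    for row in EXAMPLES:
        if row["checkpoint"] == checkpoint:
            return row
    raise KeyError(checkpoint)


def placeholder_report(prompt):
    """Report which placeholder tokens appear as whole words in the prompt."""
    words = prompt.split()
    return {"content": CONTENT_TOKEN in words, "style": STYLE_TOKEN in words}


if __name__ == "__main__":
    for row in EXAMPLES:
        print(row["checkpoint"], placeholder_report(row["prompt"]))
```

Run as written, the report shows that the first prompt in the table contains `snq` but not `w@z`. This README does not state whether that is intentional.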

We are continuously uploading additional model checkpoints. Please stay tuned.

Citation

If you make use of our work, please cite our paper:

```bibtex
@article{xu2025b4m,
  title={B4M: Breaking Low-Rank Adapter for Making Content-Style Customization},
  author={Xu, Yu and Tang, Fan and Cao, Juan and Zhang, Yuxin and Deussen, Oliver and Dong, Weiming and Li, Jintao and Lee, Tong-Yee},
  journal={ACM Transactions on Graphics},
  volume={44},
  number={2},
  pages={1--17},
  year={2025},
  publisher={ACM New York, NY},
  doi={10.1145/3728461}
}
```

Owner

  • Name: Media Synthesis and Forensics Lab
  • Login: ICTMCG
  • Kind: organization
  • Location: Beijing, China

Media Synthesis and Forensics Lab, Institute of Computing Technology, Chinese Academy of Sciences. Our official account on WeChat: ICTMCG.

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'Diffusers: State-of-the-art diffusion models'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Patrick
    family-names: von Platen
  - given-names: Suraj
    family-names: Patil
  - given-names: Anton
    family-names: Lozhkov
  - given-names: Pedro
    family-names: Cuenca
  - given-names: Nathan
    family-names: Lambert
  - given-names: Kashif
    family-names: Rasul
  - given-names: Mishig
    family-names: Davaadorj
  - given-names: Thomas
    family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
abstract: >-
  Diffusers provides pretrained diffusion models across
  multiple modalities, such as vision and audio, and serves
  as a modular toolbox for inference and training of
  diffusion models.
keywords:
  - deep-learning
  - pytorch
  - image-generation
  - diffusion
  - text2image
  - image2image
  - score-based-generative-modeling
  - stable-diffusion
license: Apache-2.0
version: 0.12.1

GitHub Events

Total
  • Watch event: 4
  • Member event: 1
  • Push event: 25
  • Create event: 2
Last Year
  • Watch event: 4
  • Member event: 1
  • Push event: 25
  • Create event: 2