https://github.com/bytedance/id-patch

Official implementation of the CVPR 2025 paper "ID-Patch: Robust ID Association for Group Photo Personalization". This work proposes ID-Patch, a fast and robust method that links identity features to 2D positions via visual patches and embeddings.

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, scholar.google
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.3%) to scientific vocabulary

Keywords

custom-image cvpr2025 diffusion-models group-photos image-generation personalization sdxl
Last synced: 5 months ago

Repository

Official implementation of the CVPR 2025 paper "ID-Patch: Robust ID Association for Group Photo Personalization". This work proposes ID-Patch, a fast and robust method that links identity features to 2D positions via visual patches and embeddings.

Basic Info
Statistics
  • Stars: 21
  • Watchers: 3
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Topics
custom-image cvpr2025 diffusion-models group-photos image-generation personalization sdxl
Created about 1 year ago · Last pushed 10 months ago
Metadata Files
Readme License

README.md

[CVPR 2025] ID-Patch: Robust ID Association for Group Photo Personalization

Yimeng Zhang1,2,*, Tiancheng Zhi1, Jing Liu1, Shen Sang1, Liming Jiang1, Qing Yan1, Sijia Liu2, Linjie Luo1
  1ByteDance Inc., 2Michigan State University
  *Work done during internship at ByteDance.


ID-Patch: Build Identity-to-Position Association

To address ID leakage and the linear increase in generation time with the number of identities, we propose ID-Patch, a novel method for robust identity-to-position association. From the same facial features, we generate both an ID patch—placed on the conditional image for precise spatial control—and ID embeddings, which are fused with text embeddings to enhance identity resemblance.
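The data flow described above can be illustrated with a toy NumPy sketch. It is not the released model: the linear projections (`w_patch`, `w_embed`), all dimensions, the face centers, and the use of plain concatenation as the "fusion" step are assumptions chosen only to make the idea concrete. Every function name here is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_id_patch(face_feat, w_patch, patch_size):
    """Project one face feature vector to a small RGB patch (hypothetical linear head)."""
    patch = np.tanh(face_feat @ w_patch)
    return patch.reshape(patch_size, patch_size, 3)

def make_id_tokens(face_feat, w_embed, num_tokens, dim):
    """Project the same face feature vector to a few embedding tokens (hypothetical)."""
    return (face_feat @ w_embed).reshape(num_tokens, dim)

def paste_patch(cond_image, patch, center_xy):
    """Return a copy of the conditional image with the patch centered at (x, y)."""
    height, width, _ = cond_image.shape
    p = patch.shape[0]
    x, y = center_xy
    # Clamp so the patch always lies fully inside the image.
    top = min(max(y - p // 2, 0), height - p)
    left = min(max(x - p // 2, 0), width - p)
    out = cond_image.copy()
    out[top:top + p, left:left + p] = patch
    return out

# Assumed toy sizes; the real model's dimensions are not stated in this README.
feat_dim, patch_size, num_tokens, dim = 512, 8, 4, 64
w_patch = rng.normal(size=(feat_dim, patch_size * patch_size * 3)) * 0.01
w_embed = rng.normal(size=(feat_dim, num_tokens * dim)) * 0.01

cond = np.zeros((128, 128, 3), dtype=np.float32)   # stand-in for the pose image
text_emb = rng.normal(size=(77, dim))              # stand-in for text embeddings
face_feats = rng.normal(size=(2, feat_dim))        # one feature vector per identity
centers = [(32, 40), (96, 40)]                     # assumed face positions (x, y)

id_tokens = []
for feat, center in zip(face_feats, centers):
    cond = paste_patch(cond, make_id_patch(feat, w_patch, patch_size), center)
    id_tokens.append(make_id_tokens(feat, w_embed, num_tokens, dim))

# "Fusion" is modeled here as simple concatenation along the token axis.
fused = np.concatenate([text_emb] + id_tokens, axis=0)
print(cond.shape, fused.shape)
```

In this sketch all identities share one conditional image and one fused embedding sequence, which is one way to read the claim that generation time need not grow linearly with the number of identities.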

Environment Setup

Note: Python 3.9 and CUDA 12.2 are required.

```shell
conda create -n idp python=3.9
conda activate idp
pip install -r requirements.txt
```

Download the models from https://huggingface.co/ByteDance/ID-Patch and put them under the models/ folder.

```shell
git lfs install
git clone https://huggingface.co/ByteDance/ID-Patch
```

Demo

```shell
python demo.py
```

| Argument | Description |
|----------|-------------|
| --pose_image_path | Path to the pose image used for conditioning the generation. Default: data/poses/example_pose.png |
| --subject_dir | Directory containing subject identity images. Each image should represent one person. Default: data/subjects |
| --subjects | Comma-separated list of subject image filenames (e.g., exp_man.jpg,exp_woman.jpg). The order corresponds to their placement from left to right in the generated image. |
| --prompt | Text prompt describing the scene to be generated. This guides the overall content and style of the output image. |
| --base_model_path | Path to the base diffusion model to be used for generation. Default: RunDiffusion/Juggernaut-X-v10 |
| --output_dir | Directory where the generated images will be saved. Default: results |
| --output_name | Filename prefix for the generated image(s). Default: exp_result |
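As an illustration of how the documented arguments combine, the following sketch assembles a `demo.py` command line in Python. The helper `build_demo_command` and the example prompt are hypothetical; only the flag names, defaults, and example filenames come from the argument descriptions above. The snippet prints the command rather than running it.

```python
import shlex

def build_demo_command(subjects, prompt,
                       pose_image_path="data/poses/example_pose.png",
                       subject_dir="data/subjects",
                       output_dir="results",
                       output_name="exp_result"):
    """Assemble an argv list for demo.py (hypothetical helper, not part of the repo)."""
    return [
        "python", "demo.py",
        "--pose_image_path", pose_image_path,
        "--subject_dir", subject_dir,
        # Order matters: subjects are placed left to right in the output image.
        "--subjects", ",".join(subjects),
        "--prompt", prompt,
        "--output_dir", output_dir,
        "--output_name", output_name,
    ]

cmd = build_demo_command(
    ["exp_man.jpg", "exp_woman.jpg"],
    "a man and a woman hiking in the mountains",  # made-up example prompt
)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` from the repository root would launch the demo, assuming the environment and models are already set up.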

Disclaimer

Our released HuggingFace model differs from the paper’s version due to training on a different dataset.

License

```
Copyright 2024 Bytedance Ltd. and/or its affiliates

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```

Citation

If you find this code useful for your research, please cite us via the BibTeX below.

```BibTeX
@InProceedings{zhang2025idpatch,
    author    = {Zhang, Yimeng and Zhi, Tiancheng and Liu, Jing and Sang, Shen and Jiang, Liming and Yan, Qing and Liu, Sijia and Luo, Linjie},
    title     = {ID-Patch: Robust ID Association for Group Photo Personalization},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2025}
}
```

Owner

  • Name: Bytedance Inc.
  • Login: bytedance
  • Kind: organization
  • Location: Singapore

GitHub Events

Total
  • Issues event: 4
  • Watch event: 51
  • Issue comment event: 1
  • Push event: 2
  • Public event: 1
  • Pull request event: 2
  • Fork event: 3
Last Year
  • Issues event: 4
  • Watch event: 51
  • Issue comment event: 1
  • Push event: 2
  • Public event: 1
  • Pull request event: 2
  • Fork event: 3

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 3
  • Total pull requests: 1
  • Average time to close issues: about 2 hours
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 1
  • Average comments per issue: 0.33
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 1
  • Average time to close issues: about 2 hours
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 0.33
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • sborse3 (2)
  • jtj01 (1)
Pull Request Authors
  • damon-demon (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • accelerate ==0.34.2
  • diffusers ==0.30.3
  • facexlib ==0.3.0
  • insightface ==0.7.3
  • mtcnn ==0.1.1
  • numpy ==1.26.4
  • onnxruntime ==1.19.2
  • opencv-python ==4.8.0.74
  • rtmlib ==0.0.13
  • tensorflow ==2.17.0
  • torch ==2.1.2
  • transformers ==4.44.2
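
Because every dependency is pinned to an exact version, a quick check of the active environment can catch mismatches before running the demo. The sketch below uses the standard-library `importlib.metadata`; the helper `check_pins` is hypothetical, and stripping a local version suffix (such as the `+cu121` that some torch wheels report) is an assumption about how the comparison should behave.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the requirements.txt listing above.
PINNED = {
    "accelerate": "0.34.2",
    "diffusers": "0.30.3",
    "facexlib": "0.3.0",
    "insightface": "0.7.3",
    "mtcnn": "0.1.1",
    "numpy": "1.26.4",
    "onnxruntime": "1.19.2",
    "opencv-python": "4.8.0.74",
    "rtmlib": "0.0.13",
    "tensorflow": "2.17.0",
    "torch": "2.1.2",
    "transformers": "4.44.2",
}

def check_pins(pinned, get_version=version):
    """Return {package: (wanted, found)} for every missing or mismatched package."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        # Ignore local build tags such as "+cu121" when comparing.
        base = found.split("+")[0] if found is not None else None
        if base != wanted:
            problems[name] = (wanted, found)
    return problems

for name, (wanted, found) in check_pins(PINNED).items():
    print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

An empty output means every listed package is installed at its pinned version.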