https://github.com/blurgyy/compass

[ICCV 2025] Enhancing spatial understanding in text-to-Image diffusion models

Keywords

diffusion generation spatial-understanding t2i text-to-image


Repository


Basic Info

Host: GitHub
Owner: blurgyy
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://compass.blurgy.xyz
Size: 721 KB

Statistics

Stars: 54
Watchers: 5
Forks: 5
Open Issues: 2
Releases: 0

Created over 1 year ago · Last pushed 10 months ago


CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models

[Project Page] [arXiv]

Gaoyang Zhang, Bingtao Fu, Qingnan Fan, Qi Zhang, Runxing Liu, Hong Gu, Huaqi Zhang, Xinguo Liu
ICCV 2025

TL;DR

CoMPaSS enhances the spatial understanding of existing text-to-image diffusion models, enabling them to generate images that faithfully reflect spatial configurations specified in the text prompt.


Setting up Environment

We manage our Python environment with `uv` and provide a script, `setup_env.sh`, that sets it up. Running the script creates a `.venv/` subdirectory in the project root. Once the environment is set up, activate it with `source .venv/bin/activate`:

```bash
# install requirements into .venv/
bash ./setup_env.sh

# activate the environment
source .venv/bin/activate
```

Trying out CoMPaSS

> [!NOTE]
> For training, SCOP and TENOR are both required.
> For generating images from text, only TENOR and the reference weights are needed.

Reference Weights

We provide the reference weights used to report all metrics in our paper on Hugging Face 🤗. We recommend starting with the FLUX.1-dev weights: they are a rank-16 LoRA, only 50 MB in size.

| Model | Link |
|:-----:|:-----:|
| FLUX.1-dev | https://huggingface.co/blurgy/CoMPaSS-FLUX.1 |
| SD1.4 | https://huggingface.co/blurgy/CoMPaSS-SD1.4 |
| SD1.5 | https://huggingface.co/blurgy/CoMPaSS-SD1.5 |
| SD2.1 | https://huggingface.co/blurgy/CoMPaSS-SD2.1 |
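As a convenience, the weights listed above could be fetched programmatically. The following is a minimal sketch, not part of the repository: it assumes the `huggingface_hub` package is installed, and the helper names (`repo_id_for`, `download_reference_weights`) and the `weights/` destination directory are hypothetical. The file layout inside each weight repository is not described here, so the sketch simply downloads the whole repository; the TENOR instructions may expect a different location.

```python
from pathlib import Path

# Hugging Face repository IDs, taken from the links in the table above.
REFERENCE_WEIGHTS = {
    "FLUX.1-dev": "blurgy/CoMPaSS-FLUX.1",
    "SD1.4": "blurgy/CoMPaSS-SD1.4",
    "SD1.5": "blurgy/CoMPaSS-SD1.5",
    "SD2.1": "blurgy/CoMPaSS-SD2.1",
}


def repo_id_for(model: str) -> str:
    """Return the Hugging Face repo ID for a base model name (hypothetical helper)."""
    try:
        return REFERENCE_WEIGHTS[model]
    except KeyError:
        known = ", ".join(sorted(REFERENCE_WEIGHTS))
        raise ValueError(f"unknown model {model!r}; expected one of: {known}") from None


def download_reference_weights(model: str, dest: str = "weights") -> Path:
    """Download all files of the chosen weight repo into dest/<model>/.

    Assumption: huggingface_hub is installed. It is imported lazily so that
    the lookup helper above works without it.
    """
    from huggingface_hub import snapshot_download

    local_dir = Path(dest) / model
    snapshot_download(repo_id=repo_id_for(model), local_dir=local_dir)
    return local_dir
```

Calling `download_reference_weights("FLUX.1-dev")` would place the files under `weights/FLUX.1-dev/`; an unrecognized model name raises `ValueError` before any network request is made.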

The SCOP dataset

We provide full instructions for replicating the SCOP dataset (28,028 object pairs across 15,426 images) in the `SCOP` directory. Check out its README to get started.

The TENOR Module

We provide both training and inference instructions for using our TENOR module in the `TENOR` directory. MMDiT-based models (e.g., FLUX.1-dev) and UNet-based models (e.g., SD1.5) are both supported. Check out their respective instructions to get started:

- Instructions for FLUX.1-dev
- Instructions for SD1.4, SD1.5, and SD2.1

Citation

@inproceedings{zhang2025compass,
  title={CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models},
  author={Zhang, Gaoyang and Fu, Bingtao and Fan, Qingnan and Zhang, Qi and Liu, Runxing and Gu, Hong and Zhang, Huaqi and Liu, Xinguo},
  booktitle={ICCV},
  year={2025}
}

Owner

Name: Gaoyang Zhang
Login: blurgyy
Kind: user
Company: Zhejiang University

Repositories: 6
Profile: https://github.com/blurgyy

GitHub Events

Total

Issues event: 5
Watch event: 59
Issue comment event: 5
Push event: 4
Fork event: 2
Create event: 2

Last Year

Issues event: 5
Watch event: 59
Issue comment event: 5
Push event: 4
Fork event: 2
Create event: 2

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 3
Total pull requests: 0
Average time to close issues: 8 months
Average time to close pull requests: N/A
Total issue authors: 3
Total pull request authors: 0
Average comments per issue: 0.67
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 3
Pull requests: 0
Average time to close issues: 8 months
Average time to close pull requests: N/A
Issue authors: 3
Pull request authors: 0
Average comments per issue: 0.67
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
