scene-representation-diffusion-model

Linear probe found representations of scene attributes in a text-to-image diffusion model

https://github.com/yc015/scene-representation-diffusion-model

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.2%) to scientific vocabulary

Keywords

explainability image-editing interpretability scene stable-diffusion
Last synced: 6 months ago

Repository

Linear probe found representations of scene attributes in a text-to-image diffusion model

Basic Info
Statistics
  • Stars: 34
  • Watchers: 7
  • Forks: 4
  • Open Issues: 0
  • Releases: 0
Topics
explainability image-editing interpretability scene stable-diffusion
Created over 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model

Linear probes found controllable representations of scene attributes in a text-to-image diffusion model

Project page of "Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model"
Paper arXiv link: https://arxiv.org/abs/2306.05720
[NeurIPS link] [Poster link]
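
To make the probing idea concrete, here is a minimal sketch of the kind of linear probe involved (illustrative only, not the repository's actual code): it assumes intermediate U-Net activations cached as (N, C, H, W) tensors and trains a per-pixel linear map to predict a binary foreground mask.

import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    """Per-pixel linear map from activation channels to foreground logits."""
    def __init__(self, in_channels: int):
        super().__init__()
        # A 1x1 convolution is a linear classifier applied at every spatial location.
        self.proj = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, acts: torch.Tensor) -> torch.Tensor:
        # acts: (N, C, H, W) cached intermediate activations.
        return self.proj(acts)

def train_probe(probe, acts, masks, epochs=100, lr=1e-3):
    # acts: (N, C, H, W) activations; masks: (N, 1, H, W) binary targets as floats.
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(probe(acts), masks)
        loss.backward()
        opt.step()
    return probe

# Example with dummy data (real activations would come from the U-Net):
# probe = train_probe(LinearProbe(1280), torch.randn(8, 1280, 16, 16),
#                     torch.rand(8, 1, 16, 16).round())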

How to generate a short video of a moving foreground object using a pretrained text-to-image generative model?

See application_of_intervention.ipynb for how to use our intervention technique to generate a short video of moving objects.

Some examples:

The GIFs are sampled using the original text-to-image diffusion model without fine-tuning. All frames are generated using the same prompt, random seed (initial latent vectors), and model. We edited the intermediate activations of the latent diffusion model while it generated the images so that its internal representation of the foreground matches our reference mask. See the notebook for implementation details.
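
For orientation, the sketch below shows one way such an intervention can be implemented with a PyTorch forward hook (hypothetical layer path and optimization settings; application_of_intervention.ipynb is the authoritative reference). The hook edits a block's output activations so that a frozen linear probe's predicted foreground mask moves toward a per-frame reference mask.

import torch
import torch.nn.functional as F

def make_intervention_hook(probe, reference_mask, steps=20, lr=0.1):
    # probe: frozen linear probe mapping activations to foreground logits.
    # reference_mask: (1, 1, H, W) float target mask for the current frame.
    def hook(module, inputs, output):
        # Runs at every denoising step: optimize a copy of the activations so the
        # probe's prediction matches the reference mask, then return the edit.
        acts = output.detach().clone().requires_grad_(True)
        opt = torch.optim.Adam([acts], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            pred = probe(acts)
            target = F.interpolate(reference_mask, size=pred.shape[-2:])
            loss = F.binary_cross_entropy_with_logits(pred, target)
            loss.backward()
            opt.step()
        return acts.detach()  # returned value replaces the block's output
    return hook

# Usage (hypothetical attribute path into the U-Net):
# handle = unet.mid_block.register_forward_hook(make_intervention_hook(probe, mask))
# ... run the usual denoising loop with a fixed prompt and seed ...
# handle.remove()

Generating the frames with the same prompt and seed while sliding the reference mask across frames produces the moving-object effect described above.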

Probe Weights:

Unzip probe_checkpoints.zip to obtain all the probe weights we trained. The weights in the unzipped folder are sufficient to run all experiments shown in the paper.
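
As a usage illustration only (the file name and channel count below are hypothetical; inspect the unzipped folder for the real ones), loading a checkpoint could look like this, assuming the weights are ordinary PyTorch state dicts for a probe shaped like the LinearProbe sketch above:

import torch

# Hypothetical path and channel count; replace with the actual files from
# probe_checkpoints.zip and the matching probe architecture.
probe = LinearProbe(in_channels=1280)
state = torch.load("probe_checkpoints/foreground_probe.pt", map_location="cpu")
probe.load_state_dict(state)
probe.eval()

# Apply the probe to cached activations (dummy tensor here, for shapes only).
cached_acts = torch.randn(1, 1280, 16, 16)
with torch.no_grad():
    mask = probe(cached_acts).sigmoid() > 0.5  # boolean foreground mask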

Citation

If you find the source code of this repo helpful, please cite

@article{chen2023beyond,
  title={Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model},
  author={Chen, Yida and Vi{\'e}gas, Fernanda and Wattenberg, Martin},
  journal={arXiv preprint arXiv:2306.05720},
  year={2023}
}

Owner

  • Name: Yida Chen
  • Login: yc015
  • Kind: user
  • Location: Cambridge, MA
  • Company: Harvard University

Ph.D. student at Harvard University.

GitHub Events

Total
  • Watch event: 1
  • Fork event: 2
Last Year
  • Watch event: 1
  • Fork event: 2