scene-representation-diffusion-model

Linear probe found representations of scene attributes in a text-to-image diffusion model

https://github.com/yc015/scene-representation-diffusion-model

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.2%) to scientific vocabulary

Keywords

explainability image-editing interpretability scene stable-diffusion
Last synced: 6 months ago

Repository

Linear probe found representations of scene attributes in a text-to-image diffusion model

Basic Info
Statistics
  • Stars: 34
  • Watchers: 7
  • Forks: 4
  • Open Issues: 0
  • Releases: 0
Topics
explainability image-editing interpretability scene stable-diffusion
Created over 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model

Linear probes found controllable representations of scene attributes in a text-to-image diffusion model

Project page of "Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model"
Paper arXiv link: https://arxiv.org/abs/2306.05720
[NeurIPS link] [Poster link]
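
To make the probing idea concrete, here is a minimal sketch of the kind of linear probe involved (illustrative only, not the repository's actual code): it assumes intermediate U-Net activations cached as (N, C, H, W) tensors and trains a per-pixel linear map to predict a binary foreground mask.

import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    """Per-pixel linear map from activation channels to foreground logits."""
    def __init__(self, in_channels: int):
        super().__init__()
        # A 1x1 convolution is a linear classifier applied at every spatial location.
        self.proj = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, acts: torch.Tensor) -> torch.Tensor:
        # acts: (N, C, H, W) cached intermediate activations.
        return self.proj(acts)

def train_probe(probe, acts, masks, epochs=100, lr=1e-3):
    # acts: (N, C, H, W) activations; masks: (N, 1, H, W) binary targets as floats.
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(probe(acts), masks)
        loss.backward()
        opt.step()
    return probe

# Example with dummy data (real activations would come from the U-Net):
# probe = train_probe(LinearProbe(1280), torch.randn(8, 1280, 16, 16),
#                     torch.rand(8, 1, 16, 16).round())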

How to generate a short video of a moving foreground object using a pretrained text-to-image generative model?

See application_of_intervention.ipynb for how to use our intervention technique to generate a short video of moving objects.

Some examples:

The GIFs are sampled using the original text-to-image diffusion model without fine-tuning. All frames are generated using the same prompt, random seed (initial latent vectors), and model. We edited the intermediate activations of the latent diffusion model while it generated the images so that its internal representation of the foreground matches our reference mask. See the notebook for implementation details.
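
For orientation, the sketch below shows one way such an intervention can be implemented with a PyTorch forward hook (hypothetical layer path and optimization settings; application_of_intervention.ipynb is the authoritative reference). The hook edits a block's output activations so that a frozen linear probe's predicted foreground mask moves toward a per-frame reference mask.

import torch
import torch.nn.functional as F

def make_intervention_hook(probe, reference_mask, steps=20, lr=0.1):
    # probe: frozen linear probe mapping activations to foreground logits.
    # reference_mask: (1, 1, H, W) float target mask for the current frame.
    def hook(module, inputs, output):
        # Runs at every denoising step: optimize a copy of the activations so the
        # probe's prediction matches the reference mask, then return the edit.
        acts = output.detach().clone().requires_grad_(True)
        opt = torch.optim.Adam([acts], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            pred = probe(acts)
            target = F.interpolate(reference_mask, size=pred.shape[-2:])
            loss = F.binary_cross_entropy_with_logits(pred, target)
            loss.backward()
            opt.step()
        return acts.detach()  # returned value replaces the block's output
    return hook

# Usage (hypothetical attribute path into the U-Net):
# handle = unet.mid_block.register_forward_hook(make_intervention_hook(probe, mask))
# ... run the usual denoising loop with a fixed prompt and seed ...
# handle.remove()

Generating the frames with the same prompt and seed while sliding the reference mask across frames produces the moving-object effect described above.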

Probe Weights:

Unzip probe_checkpoints.zip to obtain all the probe weights we trained. The weights in the unzipped folder are sufficient to run all experiments shown in the paper.
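
As a usage illustration only (the file name and channel count below are hypothetical; inspect the unzipped folder for the real ones), loading a checkpoint could look like this, assuming the weights are ordinary PyTorch state dicts for a probe shaped like the LinearProbe sketch above:

import torch

# Hypothetical path and channel count; replace with the actual files from
# probe_checkpoints.zip and the matching probe architecture.
probe = LinearProbe(in_channels=1280)
state = torch.load("probe_checkpoints/foreground_probe.pt", map_location="cpu")
probe.load_state_dict(state)
probe.eval()

# Apply the probe to cached activations (dummy tensor here, for shapes only).
cached_acts = torch.randn(1, 1280, 16, 16)
with torch.no_grad():
    mask = probe(cached_acts).sigmoid() > 0.5  # boolean foreground mask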

Citation

If you find the source code of this repo helpful, please cite

@article{chen2023beyond,
  title={Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model},
  author={Chen, Yida and Vi{\'e}gas, Fernanda and Wattenberg, Martin},
  journal={arXiv preprint arXiv:2306.05720},
  year={2023}
}

Owner

  • Name: Yida Chen
  • Login: yc015
  • Kind: user
  • Location: Cambridge, MA
  • Company: Harvard University

Ph.D. student at Harvard University.

GitHub Events

Total
  • Watch event: 1
  • Fork event: 2
Last Year
  • Watch event: 1
  • Fork event: 2