bestprompts

Best Prompts for Text-to-Image Models

https://github.com/toloka/bestprompts

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.9%) to scientific vocabulary
Last synced: 6 months ago

Repository

Best Prompts for Text-to-Image Models

Basic Info
  • Host: GitHub
  • Owner: Toloka
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 7.25 MB
Statistics
  • Stars: 18
  • Watchers: 5
  • Forks: 6
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation

README.md

Best Prompts for Text-to-Image Models and How to Find Them

This repository contains the code and data for the paper Best Prompts for Text-to-Image Models and How to Find Them.

Code

To run the prompt optimization, you need to create a class that gets image generation queries and returns images generated with Stable Diffusion. We use the following interface:

```python
class DiffusionApi:
    def generate(self, prompt, steps=50, scale=7.5, seed=0, height=512, width=512):
        pass
```

Here you pass the prompt string, the number of steps, guidance scale, seed number, and shape of the image, and the generate function returns a Pillow Image object.
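One possible way to fill in this interface is sketched below. It assumes the Hugging Face diffusers library, the `CompVis/stable-diffusion-v1-4` checkpoint, and a CUDA device; none of these choices are stated in the README, and the constructor arguments are hypothetical. Any backend that returns a Pillow Image from `generate` should work equally well.

```python
class DiffusionApi:
    """Sketch of the interface above backed by diffusers (an assumption, not the authors' setup)."""

    def __init__(self, model_id="CompVis/stable-diffusion-v1-4", device="cuda"):
        # Imported lazily so the class can be defined without torch/diffusers installed.
        import torch
        from diffusers import StableDiffusionPipeline

        self._torch = torch
        self._device = device
        # model_id and device are illustrative defaults, not taken from the repository.
        self._pipe = StableDiffusionPipeline.from_pretrained(model_id).to(device)

    def generate(self, prompt, steps=50, scale=7.5, seed=0, height=512, width=512):
        # A seeded generator makes the output reproducible for a given prompt.
        generator = self._torch.Generator(device=self._device).manual_seed(seed)
        result = self._pipe(
            prompt,
            num_inference_steps=steps,
            guidance_scale=scale,
            generator=generator,
            height=height,
            width=width,
        )
        # The pipeline returns a list of PIL images; one prompt yields one image.
        return result.images[0]
```

Constructing the class downloads the model weights, so it is best created once and reused across calls.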

To run the genetic optimization, use the optimize.py script, which takes the following arguments:
  • --toloka-token: Toloka API token
  • --aws-access-key-id: AWS access key ID. We use AWS to store generated images in a bucket and obtain direct links that are embedded into the annotation tasks
  • --aws-secret-access-key: AWS secret access key
  • --endpoint-url: Base URL at which your images will be stored, i.e., the URL of your AWS bucket
  • --bucket: Name of the AWS bucket
  • --base-pool-id: ID of a configured Toloka pool that will be cloned on every optimization iteration
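As a rough illustration, the invocation could be assembled from Python as shown below. The flag names come from the list above; the helper name `build_optimize_command`, the placeholder values, and the assumption that each flag takes its value as the next argument are not from the repository.

```python
import shlex
import sys


def build_optimize_command(toloka_token, aws_access_key_id, aws_secret_access_key,
                           endpoint_url, bucket, base_pool_id):
    """Return the argument list for running optimize.py (hypothetical helper)."""
    return [
        sys.executable, "optimize.py",
        "--toloka-token", toloka_token,
        "--aws-access-key-id", aws_access_key_id,
        "--aws-secret-access-key", aws_secret_access_key,
        "--endpoint-url", endpoint_url,
        "--bucket", bucket,
        "--base-pool-id", str(base_pool_id),
    ]


# Placeholder values only; substitute your own credentials and pool ID.
command = build_optimize_command(
    toloka_token="TOLOKA_TOKEN",
    aws_access_key_id="AWS_KEY_ID",
    aws_secret_access_key="AWS_SECRET",
    endpoint_url="https://example.com",
    bucket="my-bucket",
    base_pool_id=12345,
)
print(shlex.join(command))
# To actually launch the script: subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if a token contains special characters.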

Data

  • annotation.csv contains the results of the pairwise comparisons. It has five columns: prompt_id is the ID of an image description, left_uid is the UID of the four left images (more details below), right_uid is the UID of the four right images, worker is the worker's ID, and label is the worker's preference (left or right)
  • uid_to_keywords.csv maps each image UID to the keywords it was generated with
  • prompts.csv contains the image descriptions. The index of a prompt is its prompt_id in annotation.csv
  • keywords.csv contains keywords and their occurrences in the Stable Diffusion Discord
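A minimal sketch of how the pairwise comparisons might be tallied is shown below. It assumes annotation.csv has a header row with exactly the column names listed above; the sample rows, the UIDs in them, and the helper name `count_wins` are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Made-up rows in the documented column layout; the real data lives in annotation.csv.
SAMPLE = """prompt_id,left_uid,right_uid,worker,label
0,uid_a,uid_b,w1,left
0,uid_a,uid_b,w2,right
0,uid_a,uid_b,w3,left
1,uid_c,uid_a,w1,right
"""


def count_wins(csv_file):
    """Count how often each (prompt_id, uid) pair was preferred by a worker."""
    wins = Counter()
    for row in csv.DictReader(csv_file):
        if row["label"] == "left":
            winner = row["left_uid"]
        elif row["label"] == "right":
            winner = row["right_uid"]
        else:
            continue  # skip rows with an unexpected label
        wins[(row["prompt_id"], winner)] += 1
    return wins


wins = count_wins(io.StringIO(SAMPLE))
for (prompt_id, uid), n in sorted(wins.items()):
    print(prompt_id, uid, n)
```

For the real file, replace `io.StringIO(SAMPLE)` with `open("annotation.csv", newline="")`. Raw win counts are only a starting point; they do not account for how often each UID was shown.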

To obtain generated images, you need to use https://storage.yandexcloud.net/diffusion/ as a base URL and append {UID}_{0-3}.png to it. For example, https://storage.yandexcloud.net/diffusion/0000298d546d4a6299774ca323fa7f34_0.png
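The URL scheme above can be sketched as a small helper. The function name `image_urls` is hypothetical; the base URL and the `{UID}_{0-3}.png` pattern are taken from the text.

```python
BASE_URL = "https://storage.yandexcloud.net/diffusion/"


def image_urls(uid):
    """Return the URLs of the four images generated for one UID."""
    return [f"{BASE_URL}{uid}_{i}.png" for i in range(4)]


urls = image_urls("0000298d546d4a6299774ca323fa7f34")
for url in urls:
    print(url)
```

Each left_uid or right_uid in annotation.csv expands to four such URLs, one per generated image.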

Cite

  • Pavlichenko, N., Ustalov, D.: Best Prompts for Text-to-Image Models and How to Find Them. In: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 2067–2071. Association for Computing Machinery, Taipei, Taiwan (2023). https://doi.org/10.1145/3539618.3592000

```bibtex
@inproceedings{Pavlichenko:23,
  author       = {Pavlichenko, Nikita and Ustalov, Dmitry},
  title        = {{Best Prompts for Text-to-Image Models and How to Find Them}},
  year         = {2023},
  booktitle    = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  series       = {SIGIR '23},
  pages        = {2067--2071},
  address      = {Taipei, Taiwan},
  publisher    = {Association for Computing Machinery},
  doi          = {10.1145/3539618.3592000},
  isbn         = {978-1-4503-9408-6},
  eprint       = {2209.11711},
  eprinttype   = {arxiv},
  eprintclass  = {cs.HC},
  language     = {english},
}
```

Owner

  • Name: Toloka
  • Login: Toloka
  • Kind: organization
  • Email: github@toloka.ai

Data labeling platform for ML

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Best Prompts for Text-to-Image Models and How to Find Them
message: If you use this software, please cite the paper from preferred-citation.
type: software
authors:
  - given-names: Nikita
    family-names: Pavlichenko
    email: pavlichenko@toloka.ai
    affiliation: Toloka
    orcid: 'https://orcid.org/0000-0002-7330-393X'
  - given-names: Dmitry
    family-names: Ustalov
    email: dustalov@toloka.ai
    affiliation: Toloka
    orcid: 'https://orcid.org/0000-0002-9979-2188'
identifiers:
  - type: doi
    value: 10.1145/3539618.3592000
  - type: url
    value: 'https://github.com/Toloka/BestPrompts'
  - type: other
    value: 'arXiv:2209.11711'
repository-code: 'https://github.com/Toloka/BestPrompts'
abstract: >-
  Advancements in text-guided diffusion models have allowed for the creation
  of visually appealing images similar to those created by professional
  artists. The effectiveness of these models depends on the composition
  of the textual description, known as the prompt, and its accompanying
  keywords. Evaluating aesthetics computationally is difficult, so human
  input is necessary to determine the ideal prompt formulation and keyword
  combination. In this study, we propose a human-in-the-loop method for
  discovering the most effective combination of prompt keywords using
  a genetic algorithm. Our approach demonstrates how this can lead to
  an improvement in the visual appeal of images generated from the same
  description.
keywords:
  - text-to-image
  - genetic algorithm
  - data labeling
  - crowdsourcing
license: Apache-2.0
preferred-citation:
  type: conference-paper
  authors:
  - given-names: Nikita
    family-names: Pavlichenko
    email: pavlichenko@toloka.ai
    affiliation: Toloka
    orcid: 'https://orcid.org/0000-0002-7330-393X'
  - given-names: Dmitry
    family-names: Ustalov
    email: dustalov@toloka.ai
    affiliation: Toloka
    orcid: 'https://orcid.org/0000-0002-9979-2188'
  title: "Best Prompts for Text-to-Image Models and How to Find Them"
  year: 2023
  collection-title: "Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval"
  conference:
    name: "46th International ACM SIGIR Conference on Research and Development in Information Retrieval"
    date-start: 2023-07-23
    date-end: 2023-07-27
  doi: "10.1145/3539618.3592000"
  isbn: "978-1-4503-9408-6"
date-released: 2022-12-06

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1