Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to: arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.9%) to scientific vocabulary
Repository
Best Prompts for Text-to-Image Models
Basic Info
Statistics
- Stars: 18
- Watchers: 5
- Forks: 6
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Best Prompts for Text-to-Image Models and How to Find Them
This repository contains the code and data for the paper "Best Prompts for Text-to-Image Models and How to Find Them".
Code
To run the prompt optimization, you need to create a class that gets image generation queries and returns images generated with Stable Diffusion. We use the following interface:
```python
class DiffusionApi:
    def generate(self, prompt, steps=50, scale=7.5, seed=0, height=512, width=512):
        """Return a Pillow Image generated for the given prompt."""
        pass
```
Here you pass the prompt string, the number of steps, guidance scale, seed number, and shape of the image, and the generate function returns a Pillow Image object.
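As an illustration of how this interface might be called, the sketch below requests several images for one prompt by varying only the seed. `EchoApi` and `generate_batch` are hypothetical names that do not come from the repository, and the choice of seeds 0 to 3 for a batch of four images is an assumption. `EchoApi` records its arguments instead of rendering, so the example runs without Stable Diffusion or Pillow.

```python
class EchoApi:
    """Stand-in for a real DiffusionApi (hypothetical): returns its arguments
    as a dict instead of a Pillow Image, so no model is required."""

    def generate(self, prompt, steps=50, scale=7.5, seed=0, height=512, width=512):
        return {
            "prompt": prompt,
            "steps": steps,
            "scale": scale,
            "seed": seed,
            "height": height,
            "width": width,
        }


def generate_batch(api, prompt, n_images=4, **kwargs):
    """Call api.generate once per seed in range(n_images).

    Using consecutive seeds starting at 0 is an assumption made for this sketch.
    """
    return [api.generate(prompt, seed=seed, **kwargs) for seed in range(n_images)]


# The prompt text here is made up for illustration.
batch = generate_batch(EchoApi(), "a lighthouse at dawn", steps=30)
```

Any object with a compatible `generate` method can be passed as `api`, so a real Stable Diffusion wrapper can replace `EchoApi` without changing `generate_batch`.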
To run the genetic optimization, use the optimize.py script, which takes the following arguments:
* --toloka-token. Toloka API token
* --aws-access-key-id. AWS access key ID. Generated images are stored in an AWS bucket so that direct links to them can be embedded into annotation tasks
* --aws-secret-access-key. AWS secret access key
* --endpoint-url. Base URL at which your images will be stored, in other words, the URL of your AWS bucket
* --bucket. Name of the AWS bucket
* --base-pool-id. ID of a configured Toloka pool that is cloned on every optimization iteration
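The argument list above could be declared with `argparse` roughly as follows. This is a sketch, not the parser from optimize.py: the flag names come from the list, while marking every flag as required, treating all values as strings, and the `build_parser` helper name are assumptions. The values passed in the usage lines are placeholders.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented optimize.py flags (sketch)."""
    parser = argparse.ArgumentParser(
        description="Hypothetical sketch of the optimize.py command line"
    )
    parser.add_argument("--toloka-token", required=True, help="Toloka API token")
    parser.add_argument("--aws-access-key-id", required=True, help="AWS access key ID")
    parser.add_argument("--aws-secret-access-key", required=True, help="AWS secret access key")
    parser.add_argument("--endpoint-url", required=True, help="URL of the AWS bucket")
    parser.add_argument("--bucket", required=True, help="Name of the AWS bucket")
    parser.add_argument("--base-pool-id", required=True, help="ID of the Toloka pool to clone")
    return parser


# Placeholder values only; real credentials should never be hard-coded.
args = build_parser().parse_args([
    "--toloka-token", "TOKEN",
    "--aws-access-key-id", "KEY_ID",
    "--aws-secret-access-key", "SECRET",
    "--endpoint-url", "https://example.com",
    "--bucket", "my-bucket",
    "--base-pool-id", "12345",
])
```

`argparse` converts dashes in flag names to underscores, so the pool ID is available as `args.base_pool_id`.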
Data
- annotation.csv contains the results of the pairwise comparisons. It has five columns: prompt_id is the ID of an image description, left_uid is the UID of the four left images (more details below), right_uid is the UID of the four right images, worker is the worker's ID, and label is the worker's preference (left or right)
- uid_to_keywords.csv contains a mapping from each image UID to the keywords it was generated with
- prompts.csv contains image descriptions. The index of a prompt is its prompt_id in annotation.csv
- keywords.csv contains keywords and their occurrences in the Stable Diffusion Discord
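A minimal sketch of how the pairwise comparisons might be tallied is shown below. It assumes that annotation.csv is comma-separated, has a header row with the five column names listed above, and stores the label as the literal strings "left" or "right"; none of this has been checked against the actual file. The `count_wins` helper and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_wins(rows):
    """Count how many times each image-set UID was preferred by a worker."""
    wins = Counter()
    for row in rows:
        label = row["label"].strip().lower()
        if label == "left":
            wins[row["left_uid"]] += 1
        elif label == "right":
            wins[row["right_uid"]] += 1
    return wins


# Invented sample rows in the assumed layout, not real annotation data.
sample = io.StringIO(
    "prompt_id,left_uid,right_uid,worker,label\n"
    "0,aaa,bbb,w1,left\n"
    "0,aaa,bbb,w2,right\n"
    "0,aaa,bbb,w3,left\n"
)
wins = count_wins(csv.DictReader(sample))
```

To run this on the real data, replace `sample` with a file opened via `open("annotation.csv", newline="")`.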
To obtain generated images, you need to use https://storage.yandexcloud.net/diffusion/ as a base URL and append {UID}_{0-3}.png to it. For example,
https://storage.yandexcloud.net/diffusion/0000298d546d4a6299774ca323fa7f34_0.png
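The URL scheme described above can be expressed as a small helper. The base URL, the `{UID}_{0-3}.png` pattern, and the example UID come from the text; the `image_urls` function name is hypothetical.

```python
BASE_URL = "https://storage.yandexcloud.net/diffusion/"


def image_urls(uid, n_images=4):
    """Return the URLs of the images stored for one UID (suffixes _0 to _3)."""
    return [f"{BASE_URL}{uid}_{i}.png" for i in range(n_images)]


urls = image_urls("0000298d546d4a6299774ca323fa7f34")
```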
Cite
- Pavlichenko, N., Ustalov, D.: Best Prompts for Text-to-Image Models and How to Find Them. In: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 2067–2071. Association for Computing Machinery, Taipei, Taiwan (2023). https://doi.org/10.1145/3539618.3592000
@inproceedings{Pavlichenko:23,
author = {Pavlichenko, Nikita and Ustalov, Dmitry},
title = {{Best Prompts for Text-to-Image Models and How to Find Them}},
year = {2023},
booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
series = {SIGIR '23},
pages = {2067--2071},
address = {Taipei, Taiwan},
publisher = {Association for Computing Machinery},
doi = {10.1145/3539618.3592000},
isbn = {978-1-4503-9408-6},
eprint = {2209.11711},
eprinttype = {arxiv},
eprintclass = {cs.HC},
language = {english},
}
Owner
- Name: Toloka
- Login: Toloka
- Kind: organization
- Email: github@toloka.ai
- Website: https://toloka.ai
- Repositories: 11
- Profile: https://github.com/Toloka
Data labeling platform for ML
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: Best Prompts for Text-to-Image Models and How to Find Them
message: If you use this software, please cite the paper from preferred-citation.
type: software
authors:
  - given-names: Nikita
    family-names: Pavlichenko
    email: pavlichenko@toloka.ai
    affiliation: Toloka
    orcid: 'https://orcid.org/0000-0002-7330-393X'
  - given-names: Dmitry
    family-names: Ustalov
    email: dustalov@toloka.ai
    affiliation: Toloka
    orcid: 'https://orcid.org/0000-0002-9979-2188'
identifiers:
  - type: doi
    value: 10.1145/3539618.3592000
  - type: url
    value: 'https://github.com/Toloka/BestPrompts'
  - type: other
    value: 'arXiv:2209.11711'
repository-code: 'https://github.com/Toloka/BestPrompts'
abstract: >-
  Advancements in text-guided diffusion models have allowed for the creation
  of visually appealing images similar to those created by professional
  artists. The effectiveness of these models depends on the composition
  of the textual description, known as the prompt, and its accompanying
  keywords. Evaluating aesthetics computationally is difficult, so human
  input is necessary to determine the ideal prompt formulation and keyword
  combination. In this study, we propose a human-in-the-loop method for
  discovering the most effective combination of prompt keywords using
  a genetic algorithm. Our approach demonstrates how this can lead to
  an improvement in the visual appeal of images generated from the same
  description.
keywords:
  - text-to-image
  - genetic algorithm
  - data labeling
  - crowdsourcing
license: Apache-2.0
preferred-citation:
  type: conference-paper
  authors:
    - given-names: Nikita
      family-names: Pavlichenko
      email: pavlichenko@toloka.ai
      affiliation: Toloka
      orcid: 'https://orcid.org/0000-0002-7330-393X'
    - given-names: Dmitry
      family-names: Ustalov
      email: dustalov@toloka.ai
      affiliation: Toloka
      orcid: 'https://orcid.org/0000-0002-9979-2188'
  title: "Best Prompts for Text-to-Image Models and How to Find Them"
  year: 2023
  collection-title: "Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval"
  conference:
    name: "46th International ACM SIGIR Conference on Research and Development in Information Retrieval"
    date-start: 2023-07-23
    date-end: 2023-07-27
  doi: "10.1145/3539618.3592000"
  isbn: "978-1-4503-9408-6"
date-released: 2022-12-06
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1