evaluation_generated_images
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 3 DOI reference(s) in README
- ✓ Academic publication links: arxiv.org, zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: kapilw25
- Language: Python
- Default Branch: main
- Size: 88.9 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
🖼️ Evaluation of Text-to-Image Generation Models
This project benchmarks and compares multiple state-of-the-art text-to-image generation models using the DeepFashion MultiModal Dataset.
🔧 Setup
bash
pip install -r requirements.txt
🚀 Usage Guide
1. Generate Images from Text Prompts
bash
python text2image_generation.py
- Generates images for each listed model using base and metadata-enhanced prompts.
- Saves images to image_generated/.
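The difference between a base prompt and a metadata-enhanced prompt can be illustrated with a minimal sketch. The function name `build_prompts`, the metadata keys (`fabric`, `pattern`, `color`), and the formatting of the enhanced prompt are all hypothetical, chosen only for illustration. The actual logic in text2image_generation.py may differ.

```python
def build_prompts(caption, metadata):
    """Return (base_prompt, enhanced_prompt) for one dataset item.

    Hypothetical helper: the real script may combine caption and metadata differently.
    """
    base = caption.strip()
    # Keep only non-empty metadata values, in sorted key order for reproducibility.
    extras = [f"{key}: {value}" for key, value in sorted(metadata.items()) if value]
    enhanced = f"{base} ({', '.join(extras)})" if extras else base
    return base, enhanced


# Toy item with made-up attribute values (not taken from the dataset).
base, enhanced = build_prompts(
    "A woman wearing a long-sleeve shirt.",
    {"fabric": "cotton", "pattern": "striped", "color": ""},
)
print(base)
print(enhanced)
```

Each prompt would then be passed to every model under test, so that the effect of the added metadata can be compared per model.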
2. Evaluate Model Performance
bash
python evaluation_pipeline.py
- Computes metrics like CLIP Score, LPIPS, FID, MRR, Recall@3, and Weighted Score.
- Saves evaluation results to results/.
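As a rough illustration of the two retrieval metrics listed above, the sketch below computes MRR and Recall@3 from a similarity matrix, assuming each generated image (a query) has exactly one ground-truth image and that candidates are ranked by descending similarity. The helper names and the toy similarity values are assumptions for this example, not code from evaluation_pipeline.py.

```python
def rank_of_match(similarities, true_index):
    """1-based rank of the true item when candidates are sorted by descending similarity."""
    order = sorted(range(len(similarities)), key=lambda i: similarities[i], reverse=True)
    return order.index(true_index) + 1


def mrr(ranks):
    """Mean reciprocal rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def recall_at_k(ranks, k=3):
    """Fraction of queries whose single correct match appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy similarity rows: row i is query i, and its ground truth is candidate i.
sims = [
    [0.9, 0.2, 0.1, 0.3],  # true candidate 0 is ranked 1st
    [0.5, 0.4, 0.8, 0.1],  # true candidate 1 is ranked 3rd
    [0.7, 0.6, 0.2, 0.5],  # true candidate 2 is ranked 4th
]
ranks = [rank_of_match(row, i) for i, row in enumerate(sims)]
print("ranks:", ranks)
print("MRR:", round(mrr(ranks), 4))
print("Recall@3:", round(recall_at_k(ranks, k=3), 4))
```

In practice the similarity rows would come from an embedding model (for example CLIP image embeddings), which is omitted here to keep the sketch self-contained.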
3. Launch Interactive Visualization Dashboard
bash
streamlit run visualization_app.py
- View and compare model performance using visual graphs.
- Explore system design, generated vs. ground-truth image comparisons, and model architecture insights.
Demo App

System Architecture
Evaluation Results

Citation
If you use this repository, models, or evaluation metrics in your research or applications, please cite:
bibtex
@misc{wanaskar2025multimodalbenchmarkingrecommendationtexttoimage,
title={Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models},
author={Kapil Wanaskar and Gaytri Jena and Magdalini Eirinaki},
year={2025},
eprint={2505.04650},
archivePrefix={arXiv},
primaryClass={cs.GR},
url={https://arxiv.org/abs/2505.04650}
}
You can also use the CITATION.cff file in this repository for automated citation support (e.g., GitHub, Zenodo).
Disclaimer
- Precision@3 is not reported. In a one-to-one matching scenario, where each generated image corresponds to exactly one ground-truth image, a query can contribute at most one hit to its top 3. Precision@3 is therefore just Recall@3 divided by 3 (a hit scores 1/3, a miss scores 0) and carries no additional information. We report Recall@3 and MRR to measure retrieval performance without redundant metrics.
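The redundancy can be checked numerically. The sketch below assumes exactly one relevant image per query and uses made-up ranks. The function names are illustrative and not taken from the repository.

```python
def precision_at_k(ranks, k=3):
    """Mean Precision@k when each query has exactly one relevant item."""
    # A top-k hit contributes 1 relevant item out of k retrieved; a miss contributes 0.
    return sum((1 if r <= k else 0) / k for r in ranks) / len(ranks)


def recall_at_k(ranks, k=3):
    """Mean Recall@k when each query has exactly one relevant item."""
    # With one relevant item, per-query recall is 1 on a hit and 0 on a miss.
    return sum(1 if r <= k else 0 for r in ranks) / len(ranks)


# Hypothetical 1-based ranks of the correct match for four queries.
ranks = [1, 3, 4, 2]
p = precision_at_k(ranks, k=3)
r = recall_at_k(ranks, k=3)
print("Precision@3:", p)
print("Recall@3:", r)
print("Precision@3 == Recall@3 / 3:", abs(p - r / 3) < 1e-12)
```

Because the two values differ only by the constant factor 1/k, ranking models by either metric gives the same order.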
Owner
- Name: Kapil Wanaskar
- Login: kapilw25
- Kind: user
- Location: India
- Website: https://www.linkedin.com/in/kapil-wanaskar-06507483/
- Repositories: 1
- Profile: https://github.com/kapilw25
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this codebase, please cite the following work."
title: "Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models"
authors:
  - family-names: Wanaskar
    given-names: Kapil
  - family-names: Jena
    given-names: Gaytri
  - family-names: Eirinaki
    given-names: Magdalini
date-released: 2025-05-06
version: "1.0"
doi: 10.5281/zenodo.15385124
url: https://arxiv.org/abs/2505.04650
repository-code: https://github.com/kapilw25/Evaluation_generated_images
license: MIT
type: software
GitHub Events
Total
- Release event: 1
- Watch event: 1
- Push event: 63
- Create event: 3
Last Year
- Release event: 1
- Watch event: 1
- Push event: 63
- Create event: 3
Dependencies
- Pillow >=8.0.0
- accelerate ==1.0.1
- bitsandbytes ==0.42.0
- diffusers ==0.33.0.dev0
- dvc *
- huggingface-hub ==0.27.0
- protobuf ==5.29.3
- sentencepiece ==0.2.0
- torch >=1.12.0
- transformers >=4.26.0