Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.7%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: kapilw25
  • Language: Python
  • Default Branch: main
  • Size: 88.9 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created about 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme Citation

README.md

🖼️ Evaluation of Text-to-Image Generation Models

This project benchmarks and compares multiple state-of-the-art text-to-image generation models using the DeepFashion MultiModal Dataset.


🔧 Setup

    pip install -r requirements.txt


🚀 Usage Guide

1. Generate Images from Text Prompts

    python text2image_generation.py

  • Generates images for each listed model using base and metadata-enhanced prompts.
  • Saves images to image_generated/.
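The README does not show how the base and metadata-enhanced prompts differ, so the following is only a minimal sketch of one plausible approach. The function name `build_prompts` and the metadata keys (`color`, `fabric`) are hypothetical and are not taken from `text2image_generation.py`.

```python
from typing import Dict, Tuple


def build_prompts(caption: str, metadata: Dict[str, str]) -> Tuple[str, str]:
    """Return (base_prompt, enhanced_prompt) for one dataset item.

    Hypothetical helper: the enhanced prompt appends the non-empty
    metadata attributes to the plain caption.
    """
    base = caption.strip()
    # Keep only attributes that have a value, sorted by key for stable output.
    details = [f"{key}: {value}" for key, value in sorted(metadata.items()) if value]
    enhanced = f"{base} ({', '.join(details)})" if details else base
    return base, enhanced


base, enhanced = build_prompts(
    "A woman in a long dress",
    {"fabric": "cotton", "color": "red"},  # assumed attribute names
)
print(base)
print(enhanced)
```

Each model would then be run once per prompt variant, so that the effect of the added metadata can be compared on the same item.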

2. Evaluate Model Performance

    python evaluation_pipeline.py

  • Computes metrics like CLIP Score, LPIPS, FID, MRR, Recall@3, and Weighted Score.
  • Saves evaluation results to results/.
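As an illustration of the two retrieval metrics listed above, here is a minimal sketch of MRR and Recall@k computed from a similarity matrix. It assumes that row `i` holds the similarities of generated image `i` to every ground-truth image and that the correct match for row `i` is column `i`. The function names are illustrative and may not match what `evaluation_pipeline.py` does.

```python
from typing import Sequence, Tuple


def rank_of_match(similarities: Sequence[float], true_index: int) -> int:
    """1-based rank of the ground-truth item under descending similarity."""
    target = similarities[true_index]
    # Every candidate scoring strictly higher pushes the match down one place.
    return 1 + sum(1 for s in similarities if s > target)


def mrr_and_recall_at_k(
    sim_matrix: Sequence[Sequence[float]], k: int = 3
) -> Tuple[float, float]:
    """Mean reciprocal rank and Recall@k, assuming the match for row i is column i."""
    ranks = [rank_of_match(row, i) for i, row in enumerate(sim_matrix)]
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    recall = sum(1 for r in ranks if r <= k) / len(ranks)
    return mrr, recall


# Toy similarities: the matches land at ranks 1, 2, 4 and 1.
sim = [
    [0.9, 0.1, 0.2, 0.3],
    [0.8, 0.5, 0.1, 0.2],
    [0.9, 0.8, 0.1, 0.7],
    [0.1, 0.2, 0.3, 0.6],
]
mrr, recall_at_3 = mrr_and_recall_at_k(sim, k=3)
print(f"MRR={mrr:.4f}  Recall@3={recall_at_3:.2f}")  # MRR=0.6875  Recall@3=0.75
```

Ties are resolved in favour of the ground-truth item here, which is a simplification.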

3. Launch Interactive Visualization Dashboard

    streamlit run visualization_app.py

  • View and compare model performance using visual graphs.
  • Explore system design, generated vs. ground-truth image comparisons, and model architecture insights.

Demo App

[Screenshots: demo1, demo2]

System Architecture

[Figure: System Architecture]

Evaluation Results

[Figure: evaluation_results]

Citation

If you use this repository, models, or evaluation metrics in your research or applications, please cite:

    @misc{wanaskar2025multimodalbenchmarkingrecommendationtexttoimage,
      title={Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models},
      author={Kapil Wanaskar and Gaytri Jena and Magdalini Eirinaki},
      year={2025},
      eprint={2505.04650},
      archivePrefix={arXiv},
      primaryClass={cs.GR},
      url={https://arxiv.org/abs/2505.04650}
    }

You can also use the CITATION.cff file in this repository for automated citation support (e.g., GitHub, Zenodo).

Disclaimer

  • Precision@3 is not reported. Matching is one-to-one: each generated image corresponds to exactly one ground-truth image, so every query has a single relevant item. A top-3 hit therefore counts as 1 for Recall@3 and as 1/3 for Precision@3, which makes Precision@3 simply Recall@3 divided by 3. Because it carries no additional information, we report only Recall@3 and MRR as retrieval metrics.
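The redundancy can be made explicit with a short derivation. The symbols $N$ (number of generated images) and $h_i \in \{0, 1\}$ (1 if the ground-truth image for query $i$ appears in the top 3) are introduced here for illustration and do not come from the repository.

```latex
\mathrm{Recall@3} = \frac{1}{N}\sum_{i=1}^{N} \frac{h_i}{1},
\qquad
\mathrm{Precision@3} = \frac{1}{N}\sum_{i=1}^{N} \frac{h_i}{3}
= \frac{1}{3}\,\mathrm{Recall@3}
```

With exactly one relevant item per query, the two metrics differ only by the constant factor 1/3, so ranking models by one is equivalent to ranking them by the other.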

Owner

  • Name: Kapil Wanaskar
  • Login: kapilw25
  • Kind: user
  • Location: India

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this codebase, please cite the following work."
title: "Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models"
authors:
  - family-names: Wanaskar
    given-names: Kapil
  - family-names: Jena
    given-names: Gaytri
  - family-names: Eirinaki
    given-names: Magdalini
date-released: 2025-05-06
version: "1.0"
doi: 10.5281/zenodo.15385124
url: https://arxiv.org/abs/2505.04650
repository-code: https://github.com/kapilw25/Evaluation_generated_images
license: MIT
type: software

GitHub Events

Total
  • Release event: 1
  • Watch event: 1
  • Push event: 63
  • Create event: 3
Last Year
  • Release event: 1
  • Watch event: 1
  • Push event: 63
  • Create event: 3

Dependencies

requirements.txt pypi
  • Pillow >=8.0.0
  • accelerate ==1.0.1
  • bitsandbytes ==0.42.0
  • diffusers ==0.33.0.dev0
  • dvc *
  • huggingface-hub ==0.27.0
  • protobuf ==5.29.3
  • sentencepiece ==0.2.0
  • torch >=1.12.0
  • transformers >=4.26.0