https://github.com/ahwang16/grounded-intuition-gpt-vision

Resources for Grounded Intuition of GPT-Vision's Abilities with Scientific Images

Science Score: 20.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: arxiv.org)
✓ Committers with academic emails (1 of 1 committers, 100.0%, from academic institutions)
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 10.7%, to scientific vocabulary)

Keywords

cv gpt-4 grounded-theory hci images llms nlp qualitative-analysis thematic-analysis vision-language

Last synced: 10 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: ahwang16
Language: Jupyter Notebook
Default Branch: master
Homepage:
Size: 8.58 MB

Statistics

Stars: 4
Watchers: 1
Forks: 0
Open Issues: 1
Releases: 0

Topics

cv gpt-4 grounded-theory hci images llms nlp qualitative-analysis thematic-analysis vision-language

Created over 2 years ago · Last pushed over 2 years ago

Metadata Files

Readme

README.md

Grounded Intuition of GPT-Vision's Abilities with Scientific Images

Overview

This is the GitHub repository for my recent article, Grounded Intuition of GPT-Vision's Abilities with Scientific Images.

A Colab notebook for running GPT-Vision through the API is now available.

This paper contributes:

an in-depth qualitative analysis of the passages GPT-Vision generates for images from scientific papers,
a formalized procedure for qualitative analysis based on grounded theory and thematic analysis in social science/HCI literature, and
our images and generated passages for further research and reproducibility.

We used two prompts to generate passages for each image:

Write alt text to describe this <type>.
Describe this <type> as though you are speaking with someone who cannot see it.

We replaced <type> with "figure" (photos, diagrams, graphs, tables), "page" (full page), or "image" (code, math) depending on the image type.
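The substitution described above can be sketched in a few lines of Python. This is only an illustration, not code from the repository: the names `PROMPTS`, `TYPE_WORD`, and `build_prompts` are hypothetical, and the category keys are assumptions based on the image types listed in this README. The short prompt names `alt` and `desc` are borrowed from the file suffixes used for the generated passages.

```python
# The two prompt templates from this README, keyed by a short prompt name.
PROMPTS = {
    "alt": "Write alt text to describe this {type}.",
    "desc": "Describe this {type} as though you are speaking with someone who cannot see it.",
}

# Assumed mapping from image category to the word substituted for <type>.
# The category keys are hypothetical labels for the types named in the README.
TYPE_WORD = {
    "photo": "figure",
    "diagram": "figure",
    "graph": "figure",
    "table": "figure",
    "page": "page",
    "code": "image",
    "math": "image",
}


def build_prompts(category):
    """Return both prompts with <type> filled in for the given image category."""
    word = TYPE_WORD[category]
    return {name: template.format(type=word) for name, template in PROMPTS.items()}


if __name__ == "__main__":
    for name, prompt in build_prompts("photo").items():
        print(f"{name}: {prompt}")
```

Keeping the templates and the category mapping in separate dictionaries means a new image category only needs one new entry in `TYPE_WORD`.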

The images can be found in the images directory. Each file is named with the following convention:

<type>_<id>_<short-description>.png

with decimals in image IDs replaced by hyphens. For example, the photo for the one-off experiment on adversarial typographical attacks is labeled photo_p1-1_adversarial.png.
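As a rough sketch, a file name following this convention could be split back into its parts as shown below. The function name `parse_image_name` is hypothetical and not part of the repository. The sketch assumes that hyphens inside the ID always stand for decimal points and that the type and ID never contain underscores.

```python
from pathlib import Path


def parse_image_name(filename):
    """Split '<type>_<id>_<short-description>.png' into its three parts."""
    stem = Path(filename).stem  # drop the directory and the .png extension
    # Split on the first two underscores only, so the description may contain more.
    image_type, image_id, description = stem.split("_", 2)
    return {
        "type": image_type,
        # Assumption: hyphens in the ID replace decimal points.
        "id": image_id.replace("-", "."),
        "description": description,
    }


if __name__ == "__main__":
    print(parse_image_name("photo_p1-1_adversarial.png"))
```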

The generated passages for each prompt and image are located in the generated_passages directory and follow a similar naming convention, with the prompt name appended at the end. For example, the passages generated for photo_p1-1_adversarial.png can be found in photo_p1-1_adversarial_alt.png and photo_p1-1_adversarial_desc.png.
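For illustration, the file names in generated_passages can be derived from an image file name as in the sketch below. The helper `passage_names` is hypothetical. It assumes the prompt names are `alt` and `desc` and that the extension is carried over unchanged, as in the example names given above. Check the directory itself before relying on either assumption.

```python
import os


def passage_names(image_filename, prompt_names=("alt", "desc")):
    """Derive generated-passage file names by appending each prompt name to the stem."""
    stem, ext = os.path.splitext(image_filename)
    # Assumption: the extension is reused as-is, following the README's example names.
    return [f"{stem}_{name}{ext}" for name in prompt_names]


if __name__ == "__main__":
    for name in passage_names("photo_p1-1_adversarial.png"):
        print(os.path.join("generated_passages", name))
```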

We're in the news!

As OpenAI's Multimodal API Launches Broadly, Research Shows It's Still Flawed, TechCrunch
ChatGPT-Maker OpenAI Hosts its First Big Tech Showcase as the AI Startup Faces Growing Competition, Associated Press

Suggested citation

If you would like to cite the paper or repository, you can use

@misc{hwang_grounded_2023,
  title={Grounded Intuition of GPT-Vision's Abilities with Scientific Images},
  author={Alyssa Hwang and Andrew Head and Chris Callison-Burch},
  year={2023},
  eprint={2311.02069},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

Owner

Name: Alyssa Hwang
Login: ahwang16
Kind: user
Company: University of Pennsylvania

Website: alyssahwang.com
Repositories: 2
Profile: https://github.com/ahwang16

Committers

Last synced: about 1 year ago

All Time

Total Commits: 12
Total Committers: 1
Avg Commits per committer: 12.0
Development Distribution Score (DDS): 0.0

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Alyssa Hwang	a**6@s**u	12

Committer Domains (Top 20 + Academic)

seas.upenn.edu: 1

Issues and Pull Requests

Last synced: about 1 year ago

All Time

Total issues: 1
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 1
Total pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
