https://github.com/capjamesg/cv-book-svg

Turn an image of a bookshelf into an interactive SVG.

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
○
.zenodo.json file
○
DOI references
○
Academic publication links
○
Committers with academic emails
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (13.9%) to scientific vocabulary

Keywords

books computer-vision image-analysis library personal-library

Last synced: 5 months ago · JSON representation

Repository

Turn an image of a bookshelf into an interactive SVG.

Basic Info

Host: GitHub
Owner: capjamesg
License: mit
Language: HTML
Default Branch: main
Homepage: https://capjamesg.github.io/cv-book-svg/
Size: 218 KB

Statistics

Stars: 130
Watchers: 6
Forks: 16
Open Issues: 1
Releases: 0

Topics

books computer-vision image-analysis library personal-library

Created about 2 years ago · Last pushed over 1 year ago

Metadata Files

Readme License

Make your bookshelf clickable

Use computer vision to generate an SVG that you can overlay onto a photo of your bookshelf that lets you click on each book to find out more information.

Demo

Try the demo

https://github.com/capjamesg/cv-book-svg/assets/37276661/ec57bf18-4182-4dce-870f-6bef81809e80

How it Works

This tool uses computer vision to identify and segment each book spine in an image of a bookshelf. Then, each book spine is sent to GPT-4 with Vision to read the book title and, if possible, the author.

This information is then sent to the Google Books API. The book ISBN, author name, and other meta information is retrieved from this API.

An SVG is then created using the segmented book spines. Each book is assigned a polygon which, when clicked, takes you to the Google Books page associated with a book.

This script uses the following vision tools:

Grounding DINO (zero-shot object detection model)
Segment Anything (image segmentation model)
GPT-4 with Vision API
OpenCV Python

It takes around 20 seconds to generate the polygons that map to the location of each book on an M1 Macbook Air. It then takes a few seconds to process each book with the OpenAI GPT-4 with Vision API.

For a bookshelf with 11 books, the script takes around one minute to run.

The script returns a HTML file with an SVG file that is overlaid onto the source image.

How to Use

First, clone this project and install the required dependencies:

git clone https://github.com/capjamesg/cv-book-svg cd cv-book-svg pip3 install -r requirements.txt

Then, run the main script:

python3 grounded.py --image=example.jpg --output=annotation.html

This script takes an image as input (PNG, JPEG) and outputs a HTML document.

Limitations

This system may:

Not identify all books on a bookshelf (thin books are more likely to not be identified).
Generate a link to the wrong Google Books URL (which will happen if a book is not available on Google Books, or if a book has a generic title like "Poems of Emily Dickinson", which could on its own refer to several publications).
Mis-identify some books.

Notes

video.py contains a work-in-progress system for identifying all unique books in a video.

License

This project is licensed under an MIT license.

Contributing

Found a bug? Have an idea that you'd like to see in the project? Open an Issue in this GitHub repository.

Owner

Name: James
Login: capjamesg
Kind: user
Location: Scotland
Company: @Roboflow

Website: jamesg.blog
Repositories: 320
Profile: https://github.com/capjamesg

from words, wonder.

GitHub Events

Total

Watch event: 9
Fork event: 1

Last Year

Watch event: 9
Fork event: 1

Committers

Last synced: over 1 year ago

All Time

Total Commits: 21
Total Committers: 4
Avg Commits per committer: 5.25
Development Distribution Score (DDS): 0.238

Past Year

Commits: 21
Committers: 4
Avg Commits per committer: 5.25
Development Distribution Score (DDS): 0.238

Top Committers

Name	Email	Commits
James	3****g	16
Eduardo Balsa	e**a@e**g	3
James Gallagher	j**g@j**g	1
James Zhao	3****r	1

Committer Domains (Top 20 + Academic)

jamesg.blog: 1 ebserver.org: 1

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 1
Total pull requests: 2
Average time to close issues: N/A
Average time to close pull requests: about 9 hours
Total issue authors: 1
Total pull request authors: 2
Average comments per issue: 1.0
Average comments per pull request: 0.5
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

metalbot (1)

Pull Request Authors

jzcruiser (2)
drcursor (2)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

requirements.txt pypi

autodistill *
autodistill_grounded_sam *
numpy *
openai *
opencv-python *
requests *
supervision *
tqdm *

https://github.com/capjamesg/cv-book-svg

Science Score: 13.0%

Keywords

Repository

Basic Info

Statistics

Topics

Metadata Files

README.md

Make your bookshelf clickable

Demo

How it Works

How to Use

Limitations

Notes

License

Contributing

Owner

GitHub Events

Total

Last Year

Committers

All Time

Past Year

Top Committers

Committer Domains (Top 20 + Academic)

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Dependencies