https://github.com/capjamesg/cv-book-svg

Turn an image of a bookshelf into an interactive SVG.

https://github.com/capjamesg/cv-book-svg

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.9%) to scientific vocabulary

Keywords

books computer-vision image-analysis library personal-library
Last synced: 5 months ago · JSON representation

Repository

Turn an image of a bookshelf into an interactive SVG.

Basic Info
Statistics
  • Stars: 130
  • Watchers: 6
  • Forks: 16
  • Open Issues: 1
  • Releases: 0
Topics
books computer-vision image-analysis library personal-library
Created about 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.md

Make your bookshelf clickable

Use computer vision to generate an SVG that you can overlay onto a photo of your bookshelf that lets you click on each book to find out more information.

Demo

Try the demo

https://github.com/capjamesg/cv-book-svg/assets/37276661/ec57bf18-4182-4dce-870f-6bef81809e80

How it Works

This tool uses computer vision to identify and segment each book spine in an image of a bookshelf. Then, each book spine is sent to GPT-4 with Vision to read the book title and, if possible, the author.

This information is then sent to the Google Books API. The book ISBN, author name, and other meta information is retrieved from this API.

An SVG is then created using the segmented book spines. Each book is assigned a polygon which, when clicked, takes you to the Google Books page associated with a book.

This script uses the following vision tools:

  • Grounding DINO (zero-shot object detection model)
  • Segment Anything (image segmentation model)
  • GPT-4 with Vision API
  • OpenCV Python

It takes around 20 seconds to generate the polygons that map to the location of each book on an M1 Macbook Air. It then takes a few seconds to process each book with the OpenAI GPT-4 with Vision API.

For a bookshelf with 11 books, the script takes around one minute to run.

The script returns a HTML file with an SVG file that is overlaid onto the source image.

How to Use

First, clone this project and install the required dependencies:

git clone https://github.com/capjamesg/cv-book-svg cd cv-book-svg pip3 install -r requirements.txt

Then, run the main script:

python3 grounded.py --image=example.jpg --output=annotation.html

This script takes an image as input (PNG, JPEG) and outputs a HTML document.

Limitations

This system may:

  • Not identify all books on a bookshelf (thin books are more likely to not be identified).
  • Generate a link to the wrong Google Books URL (which will happen if a book is not available on Google Books, or if a book has a generic title like "Poems of Emily Dickinson", which could on its own refer to several publications).
  • Mis-identify some books.

Notes

  • video.py contains a work-in-progress system for identifying all unique books in a video.

License

This project is licensed under an MIT license.

Contributing

Found a bug? Have an idea that you'd like to see in the project? Open an Issue in this GitHub repository.

Owner

  • Name: James
  • Login: capjamesg
  • Kind: user
  • Location: Scotland
  • Company: @Roboflow

from words, wonder.

GitHub Events

Total
  • Watch event: 9
  • Fork event: 1
Last Year
  • Watch event: 9
  • Fork event: 1

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 21
  • Total Committers: 4
  • Avg Commits per committer: 5.25
  • Development Distribution Score (DDS): 0.238
Past Year
  • Commits: 21
  • Committers: 4
  • Avg Commits per committer: 5.25
  • Development Distribution Score (DDS): 0.238
Top Committers
Name Email Commits
James 3****g 16
Eduardo Balsa e****a@e****g 3
James Gallagher j****g@j****g 1
James Zhao 3****r 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 11 months ago

All Time
  • Total issues: 1
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: about 9 hours
  • Total issue authors: 1
  • Total pull request authors: 2
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.5
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • metalbot (1)
Pull Request Authors
  • jzcruiser (2)
  • drcursor (2)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • autodistill *
  • autodistill_grounded_sam *
  • numpy *
  • openai *
  • opencv-python *
  • requests *
  • supervision *
  • tqdm *