the-last-book-bender
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (14.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: Book-Bender
- License: agpl-3.0
- Language: Jupyter Notebook
- Default Branch: main
- Size: 408 MB
Statistics
- Stars: 2
- Watchers: 2
- Forks: 2
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
The-Last-Book-Bender
About
The Last Book Bender is our book recommendation system that seeks to inspire a love for reading. Our goal is to create a welcoming environment where individuals of all ages and backgrounds can discover enriching literary experiences tailored to their interests and aspirations. With the proliferation of attention-grabbing sources of entertainment, our aim is to sustain peoples' passion for reading by facilitating the discovery of their next book and ensure that it is a page turner. Our objective is to develop and visualize a book recommendation system that suggests books to read by comparing content and using machine learning techniques.
Our team found ways to enhance current methods by employing larger datasets and algorithms customized for the Goodreads rating scale. Through implementing BERT embeddings and a collaborative filtering algorithm optimized for a 1 to 5 scale, we created a more accurate book recommendation system tailored to user queries. Our project promotes new discoveries in the literary world for readers and bolsters engagement in libraries. With a focus on accessibility and scalability, we seek to establish a cost-effective approach for wider adaptation of our application.
Setup
Conda and Requirements
conda create --name DVA_Final python=3.9 \
conda activate DVA_Final \
pip install -r App/requirements.txt
Flask Application
Run flask --app App/app run
Dataset
We will be using the good-books-10k-extended dataset, and the Gutenberg Project datasets.
The good-books-10k-extended has about 50k users, 10k books, 6M ratings. The Guteberg Project includes over 70,000 ebooks, and you can read more about their work here: https://www.gutenberg.org/
The good-books dataset is available for immediate download, while the Gutenberg dataset takes a bit more time and effort.
We developed a simple script, "Gutenberg Parser.py" to scrape the books, apply basic filters to remove audio books and non-Enlgish language books. The script was also used to transform the data, choosing a few fields, and condesing the content of th ebooks into a 512 word normalized sample, usable with BERT. You are welcome to remove block comments at line 60 to execute a short run for testing. There is one additional block comment at line 180/181 to remove, in that case.
We developed also a simple collaborative filtering model based on the good-books-10k-extended dataset. The model is given in Notebooks/cf_model.py, the initialization data (processed user-item matrix, with rows indexed by book_id, columns indexed by user_id, is given in Data/Raw/ratings_for_cf.npz. The metrics we chose to use is an approximate version of the adjusted cosine similarity, based on an 10-fold CV we ran testing with different metrics (other metrics are the basic cosine similarity paired with rating remapping methods or fixed baseline adjustments).
Visualizations and Interactivities
Owner
- Name: Book Bender
- Login: Book-Bender
- Kind: organization
- Repositories: 1
- Profile: https://github.com/Book-Bender