the-last-book-bender

https://github.com/book-bender/the-last-book-bender

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found)
✓ .zenodo.json file (found)
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (14.1%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: Book-Bender
License: agpl-3.0
Language: Jupyter Notebook
Default Branch: main
Size: 408 MB

Statistics

Stars: 2
Watchers: 2
Forks: 2
Open Issues: 0
Releases: 0

Created over 2 years ago · Last pushed about 2 years ago

Metadata Files

Readme License Citation

The-Last-Book-Bender

About

The Last Book Bender is a book recommendation system that seeks to inspire a love for reading. Our goal is to create a welcoming environment where individuals of all ages and backgrounds can discover enriching literary experiences tailored to their interests and aspirations. With the proliferation of attention-grabbing sources of entertainment, we aim to sustain people's passion for reading by helping them discover their next book and ensuring that it is a page-turner. Our objective is to develop and visualize a book recommendation system that suggests books by comparing content and using machine learning techniques.

Our team found ways to enhance current methods by employing larger datasets and algorithms customized for the Goodreads rating scale. By implementing BERT embeddings and a collaborative filtering algorithm optimized for a 1 to 5 scale, we created a more accurate book recommendation system tailored to user queries. Our project promotes new discoveries in the literary world for readers and bolsters engagement in libraries. With a focus on accessibility and scalability, we seek to establish a cost-effective approach for wider adoption of our application.

Setup

Conda and Requirements

conda create --name DVA_Final python=3.9
conda activate DVA_Final
pip install -r App/requirements.txt

Flask Application

To start the application, run: flask --app App/app run

Dataset

We use the good-books-10k-extended dataset and the Project Gutenberg dataset.

The good-books-10k-extended dataset has about 50k users, 10k books, and 6M ratings. Project Gutenberg includes over 70,000 ebooks, and you can read more about their work here: https://www.gutenberg.org/

The good-books dataset is available for immediate download, while the Gutenberg dataset takes a bit more time and effort.

We developed a simple script, "Gutenberg Parser.py", to scrape the books and apply basic filters that remove audiobooks and non-English-language books. The script also transforms the data, keeping a few fields and condensing the content of the ebooks into a 512-word normalized sample usable with BERT. To execute a short run for testing, remove the block comment at line 60; in that case, also remove the additional block comment at lines 180/181.
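The filtering and condensing steps described above can be sketched as follows. This is a minimal illustration, not the code in "Gutenberg Parser.py": the record fields `language` and `type`, the function names, and the choice of normalization (lowercasing and collapsing whitespace) are all assumptions made for the example.

```python
def keep_book(record: dict) -> bool:
    """Keep only English-language text records.

    The "language" and "type" keys are hypothetical field names.
    """
    return record.get("language") == "en" and record.get("type") != "Sound"


def normalize_sample(text: str, max_words: int = 512) -> str:
    """Return a lowercased, whitespace-collapsed sample of at most max_words words.

    Assumption: "normalized" means lowercased with runs of whitespace collapsed.
    """
    words = text.lower().split()  # split() also collapses newlines and tabs
    return " ".join(words[:max_words])


if __name__ == "__main__":
    raw = "The  Quick\nBrown fox " * 300  # 1200 words of toy content
    sample = normalize_sample(raw)
    print(len(sample.split()), sample[:19])
```

Truncating by word count keeps the example simple; BERT's own limit is counted in tokens, so a tokenizer may still shorten such a sample further.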

We also developed a simple collaborative filtering model based on the good-books-10k-extended dataset. The model is in Notebooks/cf_model.py, and its initialization data (a processed user-item matrix with rows indexed by book_id and columns indexed by user_id) is in Data/Raw/ratings_for_cf.npz. The metric we chose is an approximate version of the adjusted cosine similarity, selected based on a 10-fold cross-validation that compared it against other metrics (basic cosine similarity paired with rating remapping methods or fixed baseline adjustments).
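To make the similarity metric concrete, here is a small sketch of one common approximation of adjusted cosine similarity on a book-by-user matrix. It is not the code in Notebooks/cf_model.py: the function name, the convention that 0 means "unrated", and the specific approximation (centering each rating by its user's mean, then taking plain cosine over full rows rather than over co-rated users only) are assumptions for illustration.

```python
import numpy as np


def approx_adjusted_cosine(ratings: np.ndarray) -> np.ndarray:
    """Item-item similarity for a (books x users) matrix where 0 means unrated."""
    rated = ratings > 0
    # Mean rating per user, computed over the books that user actually rated.
    counts = np.maximum(rated.sum(axis=0), 1)
    user_mean = ratings.sum(axis=0) / counts
    # Subtract each user's mean from their ratings; unrated cells stay at 0.
    centered = np.where(rated, ratings - user_mean, 0.0)
    # Cosine similarity between book rows; guard against all-zero rows.
    norms = np.linalg.norm(centered, axis=1)
    norms[norms == 0] = 1.0
    unit = centered / norms[:, None]
    return unit @ unit.T


if __name__ == "__main__":
    # Toy data: 3 books (rows) rated by 3 users (columns) on a 1 to 5 scale.
    toy = np.array([[5.0, 1.0, 0.0],
                    [4.0, 2.0, 5.0],
                    [1.0, 5.0, 3.0]])
    print(np.round(approx_adjusted_cosine(toy), 3))
```

Centering by user mean removes the effect of users who rate everything high or low, which is the motivation for preferring adjusted cosine over basic cosine on a fixed 1 to 5 scale.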
