https://github.com/amanvirparhar/gempress
A script to fix basic typesetting & formatting issues in public domain eBooks.
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ✓ .zenodo.json file (found .zenodo.json file)
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.2%) to scientific vocabulary
Keywords
ebook
epub
project-gutenberg
Last synced: 9 months ago
Repository
Basic Info
- Host: GitHub
- Owner: amanvirparhar
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://amanvir.com/blog/gempress-formatting-ebooks-with-gemini
- Size: 182 KB
Statistics
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
ebook
epub
project-gutenberg
Created over 1 year ago
· Last pushed over 1 year ago
Metadata Files
Readme
License
README.md
GemPress
A Python script that uses Google’s Gemini-1.5-Flash model to fix basic typesetting and formatting issues in public domain eBooks from Project Gutenberg.
Learn more about GemPress by reading the accompanying blog post.
Setup
- Clone the repository and `cd` into it: `git clone https://github.com/amanvirparhar/gempress`, then `cd gempress`.
- Create a `.env` file in this directory with the following contents: `GEMINI_API_KEY=your_api_key_here`
- Install `uv`.
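Before running anything, it can help to confirm that the `.env` file actually holds a real key rather than the placeholder. The following is a minimal sketch, not part of GemPress itself: `parse_env_text` and `has_usable_key` are hypothetical helper names, and the parser only handles plain `KEY=value` lines. The script presumably reads the key through `python-dotenv`, which is listed in its requirements, so this is only a standalone sanity check.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines (hypothetical helper, not from GemPress)."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_usable_key(env_text: str) -> bool:
    """Return True if GEMINI_API_KEY is set to something other than the placeholder."""
    key = parse_env_text(env_text).get("GEMINI_API_KEY", "")
    return bool(key) and key != "your_api_key_here"


def check_env_file(path: str = ".env") -> bool:
    """Check a .env file on disk; a missing file counts as 'no usable key'."""
    env_path = Path(path)
    if not env_path.is_file():
        return False
    return has_usable_key(env_path.read_text(encoding="utf-8"))
```

Calling `check_env_file()` from the repository directory returns `False` if the file is missing, the key is absent, or the placeholder value was left in place.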
Usage
- Put the raw text file of the book you want to reformat in the same directory as `main.py`.
- Change the path to the text file in `main.py` to the name of the file you want to reformat.
- Run the script: `uv run --with-requirements requirements.txt --python 3.13 main.py`
- You should find an ePub file in the same directory as `main.py` with the same name as the input file (except with the `.epub` file extension).
- Feel free to play around with `prompt.txt`.
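The output naming rule described above (same name as the input, with the extension swapped for `.epub`) can be sketched with `pathlib`. This is an illustration of the rule only, not the code in `main.py`; `epub_path_for` is a hypothetical helper and `pride_and_prejudice.txt` is a made-up file name.

```python
from pathlib import Path


def epub_path_for(text_file: str) -> Path:
    """Return the expected ePub path for a given input text file.

    Hypothetical helper: it mirrors the naming rule in the usage notes,
    replacing only the final extension and keeping the directory.
    """
    return Path(text_file).with_suffix(".epub")


# Example with a made-up input file name.
print(epub_path_for("pride_and_prejudice.txt"))  # pride_and_prejudice.epub
```

Only the last suffix is replaced, so an input such as `my.book.txt` would map to `my.book.epub`.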
Owner
- Name: Amanvir Parhar
- Login: amanvirparhar
- Kind: user
- Location: California
- Website: amanvir.com
- Twitter: amanvirparhar
- Repositories: 10
- Profile: https://github.com/amanvirparhar
builder, cs @ umd
GitHub Events
Total
- Watch event: 4
- Push event: 3
- Create event: 1
Last Year
- Watch event: 4
- Push event: 3
- Create event: 1
Committers
Last synced: 12 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Amanvir Parhar | a****r@g****m | 4 |
Issues and Pull Requests
Last synced: 12 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Dependencies
requirements.txt
pypi
- google-generativeai *
- html2image *
- pydantic *
- pypub3 *
- python-dotenv *