ReadmeReady

ReadmeReady: Free and Customizable Code Documentation with LLMs - A Fine-Tuning Approach - Published in JOSS (2025)

https://github.com/souradipp76/readmeready

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

fine-tuning llm qlora retrieval-augmented-generation
Last synced: 6 months ago

Repository

Tool for auto-generating README documentation for code repositories

Basic Info
Statistics
  • Stars: 13
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 8
Topics
fine-tuning llm qlora retrieval-augmented-generation
Created almost 2 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing Funding License

README.md

ReadmeReady


Auto-generate code documentation in Markdown format in seconds.

What is ReadmeReady?

Automated documentation of programming source code is a challenging task with significant practical and scientific implications for the developer community. ReadmeReady is a large language model (LLM)-based application that developers can use as a support tool to generate basic documentation for any publicly available or custom repository. Over the last decade, considerable research has been devoted to generating documentation for source code using neural network architectures. With the recent advancements in LLM technology, some open-source applications have been developed to address this problem. However, these applications typically rely on the OpenAI APIs, which incur substantial financial costs, particularly for large repositories. Moreover, none of these open-source applications offers a fine-tuned model or features to enable users to fine-tune custom LLMs. Additionally, finding suitable data for fine-tuning is often challenging. Our application addresses these issues.

Installation

ReadmeReady is available only on Linux/Windows.

Dependencies

Please follow the installation guide here to install python-magic.

Install it from PyPI

The simplest way to install ReadmeReady and its dependencies is from PyPI with pip, Python's preferred package installer.

```bash
pip install readme_ready
```

In order to upgrade ReadmeReady to the latest version, use pip as follows.

```bash
$ pip install -U readme_ready
```

Install it from source

You can also install ReadmeReady from source as follows.

```bash
$ git clone https://github.com/souradipp76/ReadMeReady.git
$ cd ReadMeReady
$ make install
```

To create a virtual environment before installing ReadmeReady, use the following commands:

```bash
$ make virtualenv
$ source .venv/bin/activate
```

Usage

Initialize

```bash
$ export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
$ export HF_TOKEN=<YOUR_HUGGINGFACE_TOKEN>
```

Set OPENAI_API_KEY=dummy to use only open-source models.

Command-Line

```bash
$ python -m readme_ready
```

or

```bash
$ readme_ready
```

In Code

```py
from readme_ready.query import query
from readme_ready.index import index
from readme_ready.types import (
    AutodocReadmeConfig,
    AutodocRepoConfig,
    AutodocUserConfig,
    LLMModels,
)

model = LLMModels.LLAMA2_7B_CHAT_GPTQ # Choose model from supported models

repo_config = AutodocRepoConfig(
    name = "<REPOSITORY_NAME>", # Replace
    root = "", # Replace
    repository_url = "<REPOSITORY_URL>", # Replace
    output = "", # Replace
    llms = [model],
    peft_model_path = "", # Replace
    ignore = [
        ".*",
        "package-lock.json",
        "package.json",
        "node_modules",
        "dist",
        "build",
        "test",
        "*.svg",
        "*.md",
        "*.mdx",
        "*.toml",
    ],
    file_prompt = "",
    folder_prompt = "",
    chat_prompt = "",
    content_type = "docs",
    target_audience = "smart developer",
    link_hosted = True,
    priority = None,
    max_concurrent_calls = 50,
    add_questions = False,
    device = "auto", # Select device "cpu" or "auto"
)

user_config = AutodocUserConfig(
    llms = [model]
)

readme_config = AutodocReadmeConfig(
    # Set comma separated list of README headings
    headings = "Description,Requirements,Installation,Usage,Contributing,License"
)

index.index(repo_config)
query.generate_readme(repo_config, user_config, readme_config)
```

Run the sample script examples/example.py to see a typical code usage. See the example on Google Colab: Open in Colab

See detailed API references here.

Finetuning

For finetuning on custom datasets, follow the instructions below.

  • Run the notebook file scripts/data.ipynb and follow the instructions in the file to generate a custom dataset from open-source repositories.
  • Run the notebook file scripts/fine-tuning-with-llama2-qlora.ipynb and follow the instructions in the file to finetune custom LLMs.
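The fine-tuning notebook uses QLoRA, which freezes a quantized base model and trains only small low-rank adapter matrices. The core idea can be illustrated with a toy pure-Python sketch (this is a hypothetical illustration of the low-rank update W' = W + B·A, not the actual peft/bitsandbytes pipeline used by the notebook):

```python
# Toy illustration of low-rank adaptation (LoRA), the technique behind QLoRA.
# Hypothetical example -- the real fine-tuning uses peft + bitsandbytes.

def matmul(A, B):
    """Multiply two matrices represented as lists of rows."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

d, r = 8, 2  # hidden size d, adapter rank r (with r << d)

# Frozen base weight W (d x d): identity here, purely for illustration.
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

# Trainable low-rank factors: B (d x r) and A (r x d).
B = [[0.1] * r for _ in range(d)]
A = [[0.2] * d for _ in range(r)]

# Effective weight W' = W + B @ A; only B and A receive gradient updates.
delta = matmul(B, A)
W_eff = [[W[i][j] + delta[i][j] for j in range(d)] for i in range(d)]

full_params = d * d          # parameters if W were trained directly
lora_params = d * r + r * d  # parameters actually trained
print(full_params, lora_params)  # 64 vs 32; the savings grow as d >> r
```

In the 4-bit QLoRA variant the frozen W is also quantized, which is what makes fine-tuning 7B-scale models feasible on a single consumer GPU.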

The results are reported in Table 1 and Table 2, under the "With FT" or "With Finetuning" columns where the contents are compared with each repository's original README file. It is observed that BLEU scores range from 15 to 30, averaging 20, indicating that the generated text is understandable but requires substantial editing to be acceptable. Conversely, BERT scores reveal a high semantic similarity to the original README content, with an average F1 score of ~85%.

Table 1: BLEU Scores

| Repository | W/O FT | With FT |
|------------|--------|---------|
| allennlp   | 32.09  | 16.38   |
| autojump   | 25.29  | 18.73   |
| numpy-ml   | 16.61  | 19.02   |
| Spleeter   | 18.33  | 19.47   |
| TouchPose  | 17.04  | 8.05    |

Table 2: BERT Scores

| Repository | P (W/O FT) | R (W/O FT) | F1 (W/O FT) | P (With FT) | R (With FT) | F1 (With FT) |
|------------|------------|------------|-------------|-------------|-------------|--------------|
| allennlp   | 0.904      | 0.8861     | 0.895       | 0.862       | 0.869       | 0.865        |
| autojump   | 0.907      | 0.86       | 0.883       | 0.846       | 0.87        | 0.858        |
| numpy-ml   | 0.89       | 0.881      | 0.885       | 0.854       | 0.846       | 0.85         |
| Spleeter   | 0.86       | 0.845      | 0.852       | 0.865       | 0.866       | 0.865        |
| TouchPose  | 0.87       | 0.841      | 0.856       | 0.831       | 0.809       | 0.82         |
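The BLEU scores in Table 1 measure n-gram overlap between the generated and original READMEs. A minimal sketch of the underlying clipped n-gram precision is shown below (pure Python; the reported scores come from a full BLEU implementation, which additionally combines several n-gram orders and applies a brevity penalty):

```python
from collections import Counter

def ngram_precision(candidate, reference, n=2):
    """Clipped n-gram precision, the building block of BLEU."""
    cand, ref = candidate.split(), reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    # Clip each candidate n-gram count by its count in the reference.
    overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return overlap / total if total else 0.0

# Hypothetical one-line example texts for illustration.
ref = "install the package with pip and run the tool"
gen = "install the package with pip then run it"
print(round(ngram_precision(gen, ref, n=2), 3))  # 4 of 7 bigrams match: 0.571
```

BERTScore, by contrast, compares contextual token embeddings rather than surface n-grams, which is why the fine-tuned outputs can score ~0.85 F1 semantically while BLEU stays in the 15–30 range.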

Validation

Run the script scripts/run_validate.sh to generate BLEU and BERT scores for 5 sample repositories, comparing the actual README files with the generated ones. Note that reproducing the scores requires a GPU with 16 GB or more of memory.

```bash
$ chmod +x scripts/run_validate.sh
$ scripts/run_validate.sh
```

Alternatively, run the notebook scripts/validate.ipynb on Google Colab: Open in Colab

Supported models

  • TINYLLAMA_1p1B_CHAT_GGUF (TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF)
  • GOOGLE_GEMMA_2B_INSTRUCT_GGUF (bartowski/gemma-2-2b-it-GGUF)
  • LLAMA2_7B_CHAT_GPTQ (TheBloke/Llama-2-7B-Chat-GPTQ)
  • LLAMA2_13B_CHAT_GPTQ (TheBloke/Llama-2-13B-Chat-GPTQ)
  • CODELLAMA_7B_INSTRUCT_GPTQ (TheBloke/CodeLlama-7B-Instruct-GPTQ)
  • CODELLAMA_13B_INSTRUCT_GPTQ (TheBloke/CodeLlama-13B-Instruct-GPTQ)
  • LLAMA2_7B_CHAT_HF (meta-llama/Llama-2-7b-chat-hf)
  • LLAMA2_13B_CHAT_HF (meta-llama/Llama-2-13b-chat-hf)
  • CODELLAMA_7B_INSTRUCT_HF (meta-llama/CodeLlama-7b-Instruct-hf)
  • CODELLAMA_13B_INSTRUCT_HF (meta-llama/CodeLlama-13b-Instruct-hf)
  • GOOGLE_GEMMA_2B_INSTRUCT (google/gemma-2b-it)
  • GOOGLE_GEMMA_7B_INSTRUCT (google/gemma-7b-it)
  • GOOGLE_CODEGEMMA_2B (google/codegemma-2b)
  • GOOGLE_CODEGEMMA_7B_INSTRUCT (google/codegemma-7b-it)

Contributing

ReadmeReady is an open-source project supported by a community that will gratefully and humbly accept any contributions you might make to the project.

If you are interested in contributing, read the CONTRIBUTING.md file.

  • Submit a bug report or feature request on GitHub Issues.
  • Add to the documentation or help with our website.
  • Write unit or integration tests for our project under the tests directory.
  • Answer questions on our issues, mailing list, Stack Overflow, and elsewhere.
  • Write a blog post, tweet, or share our project with others.

As you can see, there are lots of ways to get involved, and we would be very happy for you to join us!

License

Read the LICENSE file.

Owner

  • Name: Souradip Pal
  • Login: souradipp76
  • Kind: user
  • Location: West Lafayette, Indiana
  • Company: @Purdue

MS in Computer Engg. @ Purdue University | Ex Software Engineer @ Adobe | B.Tech in Electronics & Communication Engg. @ IIT Guwahati

JOSS Publication

ReadmeReady: Free and Customizable Code Documentation with LLMs - A Fine-Tuning Approach
Published
April 12, 2025
Volume 10, Issue 108, Page 7489
Authors
Sayak Chakrabarty
Northwestern University
Souradip Pal
Purdue University
Editor
Chris Vernon
Tags
python machine learning large language models retrieval augmented generation fine-tuning

GitHub Events

Total
  • Create event: 14
  • Issues event: 18
  • Release event: 7
  • Watch event: 13
  • Delete event: 7
  • Issue comment event: 37
  • Public event: 1
  • Push event: 94
  • Pull request review event: 2
  • Pull request event: 9
  • Fork event: 1
Last Year
  • Create event: 14
  • Issues event: 18
  • Release event: 7
  • Watch event: 13
  • Delete event: 7
  • Issue comment event: 37
  • Public event: 1
  • Push event: 94
  • Pull request review event: 2
  • Pull request event: 9
  • Fork event: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 4
  • Total pull requests: 4
  • Average time to close issues: 3 months
  • Average time to close pull requests: about 12 hours
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 7.0
  • Average comments per pull request: 0.5
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 4
  • Pull requests: 4
  • Average time to close issues: 3 months
  • Average time to close pull requests: about 12 hours
  • Issue authors: 2
  • Pull request authors: 2
  • Average comments per issue: 7.0
  • Average comments per pull request: 0.5
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 2
Top Authors
Issue Authors
  • Manvi-Agrawal (7)
  • dclydew (1)
Pull Request Authors
  • dependabot[bot] (3)
  • hellokayas (2)
  • souradipp76 (1)
  • Manvi-Agrawal (1)
Top Labels
Issue Labels
bug (1) help wanted (1) enhancement (1) question (1)
Pull Request Labels
dependencies (3) github_actions (2)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 26 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 5
  • Total maintainers: 2
pypi.org: readme-ready

Auto-generate code documentation in Markdown format in seconds.

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 26 Last month
Rankings
Dependent packages count: 10.0%
Average: 33.3%
Dependent repos count: 56.5%
Maintainers (2)
Last synced: 6 months ago

Dependencies

.github/workflows/main.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
  • codecov/codecov-action v4 composite
.github/workflows/release.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
  • softprops/action-gh-release v2 composite
.github/workflows/rename_project.yml actions
  • actions/checkout v4 composite
  • stefanzweifel/git-auto-commit-action v5 composite
requirements-test.txt pypi
  • black * test
  • callouts * test
  • coverage * test
  • flake8 * test
  • gitchangelog * test
  • isort * test
  • mkdocs * test
  • mypy * test
  • pymdown-extensions * test
  • pytest * test
  • pytest-cov * test
requirements.txt pypi
  • accelerate *
  • auto-gptq *
  • bitsandbytes *
  • gguf *
  • hnswlib *
  • langchain *
  • langchain_experimental *
  • langchain_huggingface *
  • langchain_openai *
  • markdown2 *
  • optimum *
  • peft *
  • pymarkdownlnt *
  • python-magic *
  • questionary *
  • sentence_transformers *
  • sentencepiece *
  • torch *
setup.py pypi