ntu-ureca-code-readability
Repository
Basic Info
- Host: GitHub
- Owner: A-Alviento
- Language: R
- Default Branch: main
- Size: 1.29 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
URECA Code Readability Tool
This repository contains the files needed to train a logistic regression model for code readability assessment, along with a simple rule-based feedback tool built on the trained model.
The model is trained on features extracted from a dataset of code snippets. Once trained, it can be applied to any given code snippet to provide readability feedback.
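As a rough illustration of how a logistic regression model can drive rule-based feedback, the sketch below scores a snippet from a few features and flags the feature that lowers the score the most. The feature names, coefficients, and threshold are hypothetical placeholders, not the values produced by the R scripts in this repository.

```python
import math

# Hypothetical intercept and coefficients for illustration only;
# the real values would come from the model trained in r/.
INTERCEPT = 2.0
COEFFICIENTS = {
    "avg_line_length": -0.03,   # longer lines lower the score
    "comment_ratio": 1.5,       # more comment lines raise the score
    "max_nesting_depth": -0.4,  # deeper nesting lowers the score
}


def readability_probability(features):
    """Logistic model: sigmoid of the intercept plus the weighted feature sum."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * features[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


def feedback(features, threshold=0.5):
    """Return (probability, messages); a message is added only below the threshold."""
    probability = readability_probability(features)
    messages = []
    if probability < threshold:
        # Contribution of each feature to the linear score.
        contributions = {name: COEFFICIENTS[name] * features[name] for name in COEFFICIENTS}
        # The most negative contribution is the first thing to point out.
        worst = min(contributions, key=contributions.get)
        messages.append(f"Consider improving '{worst}': it lowers the readability score the most.")
    return probability, messages


if __name__ == "__main__":
    snippet_features = {"avg_line_length": 100, "comment_ratio": 0.0, "max_nesting_depth": 5}
    probability, messages = feedback(snippet_features)
    print(f"Readability probability: {probability:.3f}")
    for message in messages:
        print(message)
```

Ranking features by their contribution to the linear score is one simple way to turn model coefficients into actionable hints; the actual tool may use different rules.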
Getting Started
Navigate Project
- r/: Contains the R files used to train the logistic regression model for the readability tool.
- features-generator.ipynb: A Jupyter Notebook to generate features from the code snippets in the dataset folder.
- feedback_tool.ipynb: A Jupyter Notebook that provides readability feedback on given code snippets.
- feedback_tool.py: A Python file version of feedback_tool.ipynb for use with Streamlit deployment.
- streamlit_deploy.py: A Python file for deploying the feedback tool as a Streamlit web application.
Installations
Before getting started, you'll need to install the following Python packages. You can do this by running pip install <package-name> for each one:
- spacy
- pyspellchecker
- streamlit
- indexer
Next, download the en_core_web_sm model for spaCy, the natural language processing library, by running:
python -m spacy download en_core_web_sm
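To confirm the installation before launching the app, a small check like the following can report which dependencies are still missing. This is a minimal sketch: the helper name missing_packages is hypothetical, and the import name for the indexer package is assumed to match its pip name.

```python
from importlib.util import find_spec

# Maps each pip package name to the module name used when importing it.
REQUIRED = {
    "spacy": "spacy",
    "pyspellchecker": "spellchecker",    # installed as pyspellchecker, imported as spellchecker
    "streamlit": "streamlit",
    "indexer": "indexer",                # assumption: import name matches the pip name
    "en_core_web_sm": "en_core_web_sm",  # spaCy models are installed as Python packages
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [pip_name for pip_name, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```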
Run
- Clone the repository:
git clone <repository-url>
- Navigate to the project directory:
cd URECA-Code-Readability-New
- Run the Streamlit application:
streamlit run streamlit_deploy.py
- Open the provided local URL in your web browser to interact with the application.
Recommended Workflow
- The features-generator.ipynb notebook extracts features from the code snippets in the dataset folder, outputting them to a CSV file (feature_matrix_x in the r/ folder) for training the logistic regression model.
- The logistic regression model is trained using the R files in the r/ folder, taking the generated CSV file as input.
- The feedback_tool.ipynb notebook uses the trained model to provide readability feedback on given code snippets.
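The first step of this workflow, turning a folder of snippets into a feature CSV, could look roughly like the sketch below. The three features and the function names are illustrative assumptions; the notebook in this repository computes its own feature set.

```python
import csv
from pathlib import Path

# Hypothetical feature set for illustration; the notebook's real features differ.
FEATURE_NAMES = ["num_lines", "avg_line_length", "comment_ratio"]


def extract_features(code):
    """Compute simple surface features over the non-blank lines of a snippet."""
    lines = [line for line in code.splitlines() if line.strip()]
    if not lines:
        return {name: 0.0 for name in FEATURE_NAMES}
    # Treat lines starting with '#' or '//' as comments (a simplification).
    comment_lines = sum(1 for line in lines if line.strip().startswith(("#", "//")))
    return {
        "num_lines": float(len(lines)),
        "avg_line_length": sum(len(line) for line in lines) / len(lines),
        "comment_ratio": comment_lines / len(lines),
    }


def write_feature_matrix(dataset_dir, output_csv):
    """Write one CSV row of features per file in dataset_dir; return the row count."""
    rows = 0
    with open(output_csv, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["snippet"] + FEATURE_NAMES)
        writer.writeheader()
        for path in sorted(Path(dataset_dir).iterdir()):
            if path.is_file():
                row = extract_features(path.read_text(encoding="utf-8"))
                row["snippet"] = path.name
                writer.writerow(row)
                rows += 1
    return rows
```

A call such as write_feature_matrix("dataset", "features.csv") would produce a file that the R scripts could read as training input, assuming they expect one row per snippet; the output file name here is a placeholder.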
References
- S. Scalabrino, M. Linares-Vásquez, R. Oliveto, and D. Poshyvanyk, “A comprehensive model for code readability,” J. Softw. Evol. Process, vol. 30, no. 6, p. e1958, 2018, doi: 10.1002/smr.1958.
Owner
- Name: Adrian Alviento
- Login: A-Alviento
- Kind: user
- Repositories: 10
- Profile: https://github.com/A-Alviento
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it using these metadata."
authors:
  - family-names: "Alviento"
    given-names: "Adrian"
    affiliation: "Nanyang Technological University"
title: "Improving Code Readability for Novice Coders: A Tool for Actionable Feedback"
version: 1.0.0
date-released: "2023-04-07"
repository-code:
  type: "Software"
  url: "https://github.com/your_username/ureca-code-readability-new"