Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.7%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: A-Alviento
  • Language: R
  • Default Branch: main
  • Size: 1.29 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed about 2 years ago
Metadata Files
Readme Citation

README.md

URECA Code Readability Tool

This repository contains the files needed to train a logistic regression model for code readability assessment, along with a simple rule-based feedback tool built on top of the trained model.

This project assesses code readability using machine learning. A logistic regression model is trained on a dataset of code snippets; the trained model is then applied to any given code snippet to provide readability feedback.
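Conceptually, a trained logistic regression model maps a snippet's feature vector to a readability probability via the logistic (sigmoid) function. A minimal sketch of that idea follows; the feature names, coefficients, and intercept below are hypothetical placeholders, not the repo's R-trained weights:

```python
import math

# Hypothetical weights for illustration only; the real coefficients
# come from the logistic regression model trained in the r/ folder.
COEFS = {"avg_line_length": -0.05, "comment_ratio": 1.2}
INTERCEPT = 1.0

def readability_probability(features: dict) -> float:
    """Apply the logistic function to a linear combination of features."""
    z = INTERCEPT + sum(COEFS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))
```

A rule-based feedback layer can then threshold this probability (e.g. flag snippets scoring below 0.5) and point at the worst-scoring features.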

Getting Started

Navigate Project

  • r/ - Contains the R files used to train the logistic regression model for the readability tool.
  • features-generator.ipynb - A Jupyter Notebook to generate features from the code snippets in the dataset folder.
  • feedback_tool.ipynb - A Jupyter Notebook that provides readability feedback on given code snippets.
  • feedback_tool.py - A Python file version of feedback_tool.ipynb for use with streamlit deployment.
  • streamlit_deploy.py - A Python file for deploying the feedback tool as a Streamlit web application.

Installations

Before getting started, you'll need to install the following Python packages. You can do this by running pip install <package-name> for each one:

  • spacy
  • pyspellchecker
  • streamlit
  • indexer
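To confirm the installs succeeded, a quick stdlib-only check can be run. Note this sketch makes one assumption: pyspellchecker is imported under the module name spellchecker, and `indexer` is omitted because its import name may differ from the pip package name:

```python
import importlib.util

# Module names to check; "spellchecker" is the import name of pyspellchecker.
REQUIRED = ["spacy", "spellchecker", "streamlit"]

def missing_packages(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```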

Next, download the en_core_web_sm model for the natural language processing library spaCy by running:

python -m spacy download en_core_web_sm

Run

  1. Clone the repository:

git clone <repository-url>

  2. Navigate to the project directory:

cd URECA-Code-Readability-New

  3. Run the Streamlit application:

streamlit run streamlit_deploy.py

  4. Open the provided local URL in your web browser to interact with the application.

Recommended Workflow

  1. The features-generator.ipynb notebook extracts features from the code snippets in the dataset folder, outputting them to a CSV file (feature_matrix_x in the r/ folder) for training the logistic regression model.

  2. The logistic regression model is trained using the R files in the r/ folder, taking the generated CSV file as input.

  3. The feedback_tool.ipynb notebook uses the trained model to provide readability feedback on given code snippets.

References

  • S. Scalabrino, M. Linares-Vásquez, R. Oliveto, and D. Poshyvanyk, “A comprehensive model for code readability,” J. Softw. Evol. Process, vol. 30, no. 6, p. e1958, 2018, doi: 10.1002/smr.1958.

Owner

  • Name: Adrian Alviento
  • Login: A-Alviento
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it using these metadata."
authors:
  - family-names: "Alviento"
    given-names: "Adrian"
    affiliation: "Nanyang Technological University"
title: "Improving Code Readability for Novice Coders: A Tool for Actionable Feedback"
version: 1.0.0
date-released: "2023-04-07"

repository-code: "https://github.com/your_username/ureca-code-readability-new"
