https://github.com/alicerunsonfedora/gh-twilight

Predict a repository's size based on a contributor's commit history.

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.6%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 6 years ago · Last pushed about 6 years ago

README.md

Project Twilight

Project Twilight is a machine learning experiment for my DMC345 (Intro to Machine Learning) course. It tries to predict a Git repository's size from a list of numbers that represent the commits made in a week. The tool uses scikit-learn, Matplotlib, and NumPy for data manipulation and analysis, and PyGithub to fetch data from GitHub.

Installation

Install via PyPI (TBD)

Run pip install gh-twilight to install the tool to your Python environment.

Build from source

Requirements

  • Python 3.7 or higher
  • Poetry package manager

Clone the repository and run poetry install in the project root to set up the environment and install dependencies.

Creating a configuration file

To create the config file that Project Twilight uses, run gh-twilight --generate in your terminal. The configuration utility will help you set up some details about what models you want to use for training, your GitHub access token, and how you want to predict your data.

Running the utility

Configuration arguments

  • --config CONFIG: The path to the configuration file to use.
  • --generate: Runs the interactive configuration utility.

Analysis and prediction arguments

  • --plot: Creates plot graphs of predicted and testing data from training the network.
  • --predict: Run predictions on the data provided in the configuration file.

Extra arguments

  • --log-file LOG_FILE: The path where the logs should be stored. Omitting this argument disables logging.
  • --csv: Exports the raw dataset to a CSV file before analysis.
  • --json: Exports the raw dataset to a JSON file before analysis.
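
The flags listed above could be declared with Python's standard argparse module roughly as follows. This is a minimal sketch, not the project's actual code: the function name build_parser and the file name twilight.toml are hypothetical, and treating --csv and --json as simple on/off switches is an assumption based on the descriptions above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="gh-twilight")

    # Configuration arguments
    parser.add_argument("--config", metavar="CONFIG",
                        help="Path to the configuration file to use.")
    parser.add_argument("--generate", action="store_true",
                        help="Run the interactive configuration utility.")

    # Analysis and prediction arguments
    parser.add_argument("--plot", action="store_true",
                        help="Plot predicted and testing data.")
    parser.add_argument("--predict", action="store_true",
                        help="Run predictions from the configuration file.")

    # Extra arguments; leaving --log-file unset means no logging
    parser.add_argument("--log-file", metavar="LOG_FILE", default=None,
                        help="Path where logs are stored.")
    parser.add_argument("--csv", action="store_true",
                        help="Export the raw dataset to a CSV file.")
    parser.add_argument("--json", action="store_true",
                        help="Export the raw dataset to a JSON file.")
    return parser


# "twilight.toml" is a made-up file name used only for illustration.
args = build_parser().parse_args(["--config", "twilight.toml", "--predict", "--csv"])
print(args.config, args.predict, args.csv, args.log_file)
# twilight.toml True True None
```

Note that argparse exposes --log-file as args.log_file, so a None value can stand for "logging disabled".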

Sparkle configuration file

The configuration file (in TOML syntax) contains important information on how to collect data, what data to collect, and how to run analysis and predictions. There are three important keys in the configuration file:

  • config.account: Includes GitHub personal token and Git username.
  • config.activities: Includes what repository to use as training data and what models to use.
  • config.predictions: Includes what model to use to make predictions and inputs to predict.

Account information

The config.account section includes the following keys:

  • git_name: The Git username that made the commits to the repository
  • token: The GitHub personal token with the repo permission.

Activity configuration

The config.activities section includes the following keys:

  • models: A list of strings containing what models to use. Valid options are forest, neural, and linear.
  • repos: A list of strings containing the repositories on GitHub to use as training data.

Prediction configuration

The config.predictions section includes the following keys:

  • method: The model to use to make predictions. Valid options are forest, neural, linear, and best.
    • Using best automatically selects the model with the highest R² score.
  • inputs: A list of dictionaries that contain the input values to predict. The dictionary should have the following keys:
    • name: The name of the repository. This does not need to point to a real repository on GitHub.
    • commits: A list of seven integers that represent how many commits are made on each day of the week, summed across all weeks. For example, if a user makes two commits to a repository every day for two weeks, the commits list should be [4, 4, 4, 4, 4, 4, 4].
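
To illustrate how a commits list can be derived, the sketch below sums commit dates into seven per-day counts. The function name commits_per_weekday is hypothetical, and putting Monday at index 0 is an assumption, since the README does not say which day comes first.

```python
from datetime import date, timedelta
from typing import Iterable, List


def commits_per_weekday(commit_dates: Iterable[date]) -> List[int]:
    """Count commits per day of the week, summed across all weeks.

    Index 0 is Monday (the convention of date.weekday()); the ordering
    the tool expects is not documented, so this is an assumption.
    """
    counts = [0] * 7
    for day in commit_dates:
        counts[day.weekday()] += 1
    return counts


# Two commits every day for two weeks, starting on Monday 2020-01-06.
start = date(2020, 1, 6)
history = [start + timedelta(days=offset) for offset in range(14) for _ in range(2)]
print(commits_per_weekday(history))  # [4, 4, 4, 4, 4, 4, 4]
```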

An example configuration may look like the following:

```toml
[config.account]
git_name = "Twilight Sparkle"
token = "githash"

[config.activities]
models = [ "forest", "linear" ]
repos = [ "equestria/friendship.equ", "equestria/governance" ]

[config.predictions]
method = "best"
inputs = [{ name = "equestria/journal", commits = [1, 13, 9, 8, 7, 12, 8] }]
```

License

Project Twilight is free and open-source software licensed under the Mozilla Public License, v2.0.

Owner

  • Name: Marquis Kurt
  • Login: alicerunsonfedora
  • Kind: user
  • Location: Bear, DE

[mar.kɪs kɚrt] He/him. iOS app and game developer.

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 12
  • Total Committers: 1
  • Avg Commits per committer: 12.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Marquis Kurt s****e@m****t 12

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0