use-rse-23-astartes

Extrapolation and Interpolation in Machine Learning Modeling with Fast Food and astartes

https://github.com/jacksonburns/use-rse-23-astartes

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.1%) to scientific vocabulary

Keywords

extrapolation interpolation machine-learning reproducibility sampling
Last synced: 6 months ago

Repository

Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed over 2 years ago
Metadata Files
Readme License Citation

README.md

Extrapolation and Interpolation in Machine Learning Modeling with Fast Food and astartes

Abstract

Machine learning (ML) is a groundbreaking tool for tackling high-dimensional datasets with complex correlations that humans struggle to comprehend. An important nuance of ML is the difference between using a model for interpolation or extrapolation, meaning either inference or prediction. This work will demonstrate visually what interpolation and extrapolation mean in the context of machine learning using astartes, a Python package that makes both easy to tackle in ML modeling. astartes makes many different sampling approaches available, so using a very tangible dataset - a fast food menu - we can visualize how these approaches differ and then train and compare ML models.
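To make the distinction concrete, here is a minimal standard-library sketch of the two kinds of split. It does not use astartes or reproduce its samplers. The calorie values are made up for illustration and are not the notebook's dataset, and the helper names (`random_split`, `extrapolative_split`, `fraction_outside_training_range`) are hypothetical.

```python
import random


def random_split(values, train_fraction=0.75, seed=0):
    """Interpolative split: test items are drawn uniformly from the whole range."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def extrapolative_split(values, train_fraction=0.75):
    """Extrapolative split: hold out the largest values, so every test
    item lies beyond the range covered by the training set."""
    ordered = sorted(values)
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def fraction_outside_training_range(train, test):
    """Share of test items that fall below the training minimum or above its maximum."""
    lo, hi = min(train), max(train)
    return sum(1 for v in test if v < lo or v > hi) / len(test)


# Hypothetical calorie counts for 20 menu items (illustrative only).
calories = [150, 210, 250, 300, 320, 380, 410, 450, 480, 520,
            540, 590, 630, 670, 710, 760, 820, 900, 1050, 1200]

train_r, test_r = random_split(calories)
train_e, test_e = extrapolative_split(calories)

print("random split, test share outside training range:",
      fraction_outside_training_range(train_r, test_r))
print("extrapolative split, test share outside training range:",
      fraction_outside_training_range(train_e, test_e))
```

A model evaluated on the second split is asked about items unlike anything it was trained on, which is the extrapolation case the notebook explores with astartes' own samplers.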

Usage

This repository contains the split_comparisons.ipynb notebook and its associated environment files, submitted to the United States Research Software Engineer Association 2023 Conference. The notebook walks the user through the software tool astartes and its application to machine learning validation and testing. You may view the notebook in several ways:

  1. [Recommended] Visit the GitHub Pages site to view the notebook rendered as an interactive webpage with Quarto.
  2. Run the notebook live in your browser, without installation, using Binder.
  3. Execute it locally:
     1. Clone this repository.
     2. Build the environment with pip install -r requirements.txt using any version of Python from 3.7 to 3.11.
     3. Open split_comparisons.ipynb in your preferred notebook IDE, e.g. Jupyter or VS Code.

Reproducibility of USRSE23 Submission

The conda-environment.yml file provides the exact package versions and builds used to run the notebook for submission to the USRSE2023 conference. requirements.txt specifies a set of 'loose' requirements as well as more 'strict' requirements that match the conda file but are cross-platform. astartes has been designed to be strictly backwards compatible and reproducible, so this notebook should produce identical results with all minor releases of astartes v1.
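Before running the notebook, you can check that the installed astartes falls within the v1 series. The following is a minimal sketch that uses only the standard library's `importlib.metadata` (available from Python 3.8). The helper names are hypothetical, and it assumes the distribution is published under the name `astartes`.

```python
from importlib import metadata


def is_astartes_v1(version_string):
    """Return True when the major component of a version string is 1."""
    return version_string.split(".")[0] == "1"


def installed_astartes_version():
    """Return the installed astartes version string, or None if it is absent."""
    try:
        # Assumption: the package is distributed under the name "astartes".
        return metadata.version("astartes")
    except metadata.PackageNotFoundError:
        return None


version = installed_astartes_version()
if version is None:
    print("astartes is not installed in this environment")
elif is_astartes_v1(version):
    print(f"astartes {version} is a v1 release")
else:
    print(f"astartes {version} is outside the v1 series; results may differ")
```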

Note: this repository is based on astartes's main repository, with changes to conform to the submission criteria for RSE23. Visit the astartes repository for other examples and additional detail about astartes.

Owner

  • Name: Jackson Burns
  • Login: JacksonBurns
  • Kind: user
  • Location: MIT

Chemical Engineering and Computation Researcher @ MIT

Issues and Pull Requests

Last synced: 12 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 17 minutes
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • JacksonBurns (1)