https://github.com/animesh/ci-analysis-minimal-example

A minimal example of a data analysis using continuous integration via GitHub Actions

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: nature.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.6%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: animesh
  • License: mit
  • Default Branch: main
  • Size: 524 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of poldrack/ci-analysis-minimal-example
Created about 4 years ago · Last pushed over 4 years ago

# ci-analysis-minimal-example

![example workflow](https://github.com/poldrack/ci-analysis-minimal-example/actions/workflows/RunMinimalCIAnalysis.yml/badge.svg)

A minimal example of a data analysis using continuous integration via GitHub Actions

## Rationale

An important aspect of reproducibility is the ability to automatically run an analysis workflow and obtain the results. One useful way to accomplish this is to leverage tools that were developed for professional software development, namely the tools of continuous integration (CI). CI runs the code whenever a new commit is pushed to a GitHub repository, and allows the results to be stored for future use. There are several platforms available for CI; in this example we use GitHub Actions, since it is tightly integrated with GitHub.

## Workflow

In this example we use existing data (previously published by [Eisenberg et al., 2019](https://www.nature.com/articles/s41467-019-10301-1)) to perform a statistical analysis that tests whether trait impulsivity is associated with being arrested. The data have been slightly reorganized from their [source](https://github.com/IanEisenberg/Self_Regulation_Ontology) to roughly match the [Psych-DS](https://psych-ds.github.io/) file organization scheme. We implement the analysis in two ways, described below.

### Implementing analyses as tests

The most common usage of CI is to perform tests to ensure the proper operation of the code.  It is straightforward to implement analyses as software tests that can be run by the [pytest](https://docs.pytest.org/) framework. These are implemented in `results/tests_all.py`; each of the tests calls a specific set of functions that perform the analyses and store the results.  
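As a rough illustration of the idea, here is a minimal sketch of an analysis written as a pytest test. It is not the code in `results/tests_all.py`: the function names (`welch_t`, `run_analysis`), the output file name, and the toy records are all hypothetical, and a plain Welch t statistic from the standard library stands in for whatever statistical test the repository actually uses.

```python
import json
import math
import statistics
from pathlib import Path


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


def run_analysis(records, output_dir):
    """Compare impulsivity between the two arrest groups and save the result as JSON."""
    arrested = [r["impulsivity"] for r in records if r["arrested"]]
    not_arrested = [r["impulsivity"] for r in records if not r["arrested"]]
    result = {
        "n_arrested": len(arrested),
        "n_not_arrested": len(not_arrested),
        "welch_t": welch_t(arrested, not_arrested),
    }
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical output file name; a CI job could upload this directory as an artifact.
    (output_dir / "impulsivity_arrest.json").write_text(json.dumps(result, indent=2))
    return result


# Made-up toy records for illustration only, NOT the Eisenberg et al. data.
TOY_RECORDS = [
    {"impulsivity": 3.1, "arrested": True},
    {"impulsivity": 2.8, "arrested": True},
    {"impulsivity": 3.5, "arrested": True},
    {"impulsivity": 2.0, "arrested": False},
    {"impulsivity": 2.4, "arrested": False},
    {"impulsivity": 1.9, "arrested": False},
    {"impulsivity": 2.2, "arrested": False},
]


def test_impulsivity_analysis(tmp_path):
    """Collected by pytest; tmp_path is pytest's built-in temporary-directory fixture."""
    result = run_analysis(TOY_RECORDS, tmp_path)
    assert (tmp_path / "impulsivity_arrest.json").exists()
    assert math.isfinite(result["welch_t"])
```

Running `pytest` on a file like this both executes the analysis and checks that it produced a result, so a failed analysis shows up as a failed CI run.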

### Rendering a Jupyter notebook

Analyses are often implemented in computational notebooks, such as Jupyter or RMarkdown. These can be rendered directly to a file; in this example we render to HTML for simplicity, but one could also render to other formats such as PDF.
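One possible way to do the rendering from Python is to call the `jupyter nbconvert` command line tool, which can execute a notebook and export it in a single step. The sketch below assumes Jupyter and nbconvert are installed; the notebook name `analysis.ipynb`, the `results` output directory, and the helper names are hypothetical, and the repository's workflow may invoke the rendering differently.

```python
import subprocess


def nbconvert_command(notebook, output_dir, fmt="html"):
    """Build the `jupyter nbconvert` command that executes and renders a notebook."""
    return [
        "jupyter", "nbconvert",
        "--to", fmt,                      # output format, e.g. "html" or "pdf"
        "--execute",                      # re-run all cells before exporting
        "--output-dir", str(output_dir),  # where the rendered file is written
        str(notebook),
    ]


def render_notebook(notebook, output_dir, fmt="html"):
    """Run the command; raises CalledProcessError if execution or export fails."""
    subprocess.run(nbconvert_command(notebook, output_dir, fmt), check=True)


if __name__ == "__main__":
    # Hypothetical notebook and output directory, for illustration only.
    render_notebook("analysis.ipynb", "results")
```

Because `check=True` turns a failing notebook into a non-zero exit, a CI job that runs this script is marked as failed when any cell raises an error.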

## Outputs

Upon a new commit, all analyses are rerun by GitHub Actions; the status can be viewed from the Actions link above. Upon completion, the outputs are stored in an artifact file that can be downloaded from the Actions page. Artifacts are stored for only 90 days, but can be regenerated by re-running the Action.

Owner

  • Name: Ani
  • Login: animesh
  • Kind: user
  • Location: Norway
  • Company: Norwegian University of Science and Technology

A medical graduate from Delhi University with post-graduation in bioinformatics from Jawaharlal Nehru University, India.
