reproducibleRchunks

This package provides R code chunks that can be tested for reproducibility.

https://github.com/brandmaier/reproduciblerchunks

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.8%) to scientific vocabulary

Keywords

r reproducibility
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: brandmaier
  • License: other
  • Language: HTML
  • Default Branch: main
  • Homepage:
  • Size: 2.07 MB
Statistics
  • Stars: 38
  • Watchers: 1
  • Forks: 2
  • Open Issues: 7
  • Releases: 2
Created almost 2 years ago · Last pushed 7 months ago
Metadata Files
Readme Changelog License Citation

README.md

reproducibleRchunks

Lifecycle: stable

Also read our open-access publication about this package in Collabra: Psychology, "Automated Reproducibility Testing in R Markdown".

Why should I care?

This package lets you make computational results in R testable for reproducibility: does the same script with the same data produce the same results, for example on a different computer or at a later point in time? If you already use R Markdown, the required changes are minimal: load the package at the beginning of your R Markdown file (library(reproducibleRchunks)) and change the code chunk type from r to reproducibleR.
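As a rough illustration of the change described above, the following sketch writes a minimal R Markdown file whose analysis chunk uses the reproducibleR chunk type instead of r. The file name, chunk labels, and variable names are made up for this example, and the chunk delimiter is built with strrep() only to avoid nesting code fences here.

```r
# Build a minimal R Markdown document (hypothetical file name and chunk labels).
fence <- strrep("`", 3)  # three backticks, i.e. a chunk delimiter

rmd_lines <- c(
  "---",
  "title: \"Reproducibility demo\"",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r setup}"),
  "library(reproducibleRchunks)",
  fence,
  "",
  # The only change to an ordinary chunk: type 'reproducibleR' instead of 'r'.
  paste0(fence, "{reproducibleR descriptives}"),
  "scores <- c(12, 15, 9, 20)",
  "mean_score <- mean(scores)",
  fence
)

rmd_file <- file.path(tempdir(), "demo.Rmd")
writeLines(rmd_lines, rmd_file)

# With rmarkdown and reproducibleRchunks installed, the file could be rendered with:
# rmarkdown::render(rmd_file)
```

In a real project you would simply edit the chunk header in your existing .Rmd file by hand; generating the file from R is only a convenience for this sketch.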

Installation

To install the package from CRAN, enter:

install.packages("reproducibleRchunks")

Or install the latest development version from GitHub:

devtools::install_github("brandmaier/reproducibleRchunks")

Demo

You can try it out yourself; additional examples are available on our GitHub project page. Simply install the package and render the test.Rmd file to evaluate the reproducibility of its R code chunks. Each chunk will generate a reproducibility report. One chunk is intentionally designed to fail, showcasing how the package handles errors. Here's what to expect:

Step 1: Document is built for the first time:

First, all newly declared variables in a reproducibleR chunk are identified, their contents are fingerprinted, and the fingerprints are stored in a JSON file.

Step 2: Document is re-built and automatically checked for reproducibility:

All computational results are reproduced and fingerprinted, and their fingerprints are compared against the fingerprints in the JSON storage. If the results are identical, all is well; otherwise, you will get a failure message.
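The two steps can be sketched in a few lines of R. This is not the package's implementation: the helper names, the SHA-256 algorithm, and the flat JSON layout are assumptions chosen for illustration, using the digest and jsonlite packages that the package lists as dependencies.

```r
library(digest)    # object fingerprints (hashes)
library(jsonlite)  # reading and writing JSON

# Hypothetical helper: fingerprint every element of a named list of results.
fingerprint_all <- function(results) {
  lapply(results, function(x) digest(x, algo = "sha256"))
}

# Hypothetical helper: the first call stores fingerprints, later calls compare.
check_reproducibility <- function(results, store) {
  current <- fingerprint_all(results)
  if (!file.exists(store)) {
    write_json(current, store, auto_unbox = TRUE)
    return("stored")
  }
  previous <- read_json(store)
  ok <- vapply(names(current), function(nm) {
    identical(current[[nm]], previous[[nm]])
  }, logical(1))
  if (all(ok)) "reproduced" else paste("failed:", paste(names(ok)[!ok], collapse = ", "))
}

store <- tempfile(fileext = ".json")
first  <- check_reproducibility(list(total = sum(1:10)), store)  # first build
second <- check_reproducibility(list(total = sum(1:10)), store)  # same result
third  <- check_reproducibility(list(total = sum(1:11)), store)  # changed result
```

Here first is "stored", second is "reproduced", and third is "failed: total", mirroring the success and failure messages described above.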

Mechanics

The package executes reproducibleR code chunks as regular R code and gathers information about all variables that are newly declared in a given chunk. The contents of those variables are stored in a separate JSON data file, which is named after the original Markdown file and the chunk label, preceded by the prefix .repro and ending with the suffix .json. When the document is regenerated and a corresponding JSON data file exists, its content is checked against the newly computed chunk variables for identity.
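One way to picture the "newly declared variables" step is to compare the variable names in an environment before and after the chunk code runs. The helper below is a hypothetical base R sketch of that idea, not the package's internal code.

```r
# Hypothetical helper: evaluate chunk code and return only newly declared variables.
new_variables <- function(code, env) {
  before <- ls(env, all.names = TRUE)
  eval(parse(text = code), envir = env)
  after <- ls(env, all.names = TRUE)
  mget(setdiff(after, before), envir = env)
}

env <- new.env()
assign("n", 100, envir = env)  # stands in for a variable from an earlier chunk

chunk_code <- "n <- n + 1; avg <- mean(c(2, 4, 6))"
declared <- new_variables(chunk_code, env)
names(declared)  # "avg"
```

In this sketch, n already existed before the chunk ran, so only avg is returned and would be passed on to the storage or comparison step.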

It is possible to store the contents either as fingerprints (default) or as plain content. Here is an example of how chunk contents are stored in plain format. In this example, there is a single variable called numbers holding a vector of five numbers [0.874094, -1.6943659, -0.8961591, 1.00840087, 1.61713635] (rounded to a specified precision):

    {
      "type": "list",
      "attributes": {
        "names": {
          "type": "character",
          "attributes": {},
          "value": ["numbers"]
        }
      },
      "value": [
        {
          "type": "double",
          "attributes": {},
          "value": [0.874094, -1.6943659, -0.8961591, 1.00840087, 1.61713635]
        }
      ]
    }
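The layout of this JSON resembles what jsonlite::serializeJSON() produces for a named list. Whether the package calls this function internally is an assumption; the sketch below only shows that a comparable structure can be written and read back with jsonlite.

```r
library(jsonlite)

chunk_vars <- list(
  numbers = c(0.874094, -1.6943659, -0.8961591, 1.00840087, 1.61713635)
)

# digits limits the number of decimal digits written to the JSON text.
json_text <- serializeJSON(chunk_vars, digits = 8, pretty = TRUE)
cat(json_text)

# Reading the JSON back recovers the list of variables.
restored <- unserializeJSON(json_text)
all.equal(restored, chunk_vars)
```

Lowering digits would round the stored values more aggressively, which is one way a precision setting could affect what counts as an identical result.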

For privacy reasons (and to save disk space), the package does not store the raw data by default but only fingerprints of the data, from which the original data cannot be reconstructed.

What kind of variables can be tested for reproducibility?

Virtually any kind of variable can be subjected to a reproducibility test by defining it within a reproducibleR code chunk in an R Markdown document, no matter whether it is a numeric result, a character string, or a more complex object such as the result of a call to t.test() (or any other statistical model).
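To illustrate why complex objects pose no special problem, the sketch below fingerprints the result of t.test() on the built-in sleep dataset. The use of digest() with SHA-256 is an assumption for illustration and may differ from how the package computes its fingerprints.

```r
library(digest)

# Fit the same model twice on a built-in dataset.
fit_first  <- t.test(extra ~ group, data = sleep)
fit_second <- t.test(extra ~ group, data = sleep)

# A complex object is fingerprinted like any other R object.
fp_first  <- digest(fit_first, algo = "sha256")
fp_second <- digest(fit_second, algo = "sha256")
same_analysis <- identical(fp_first, fp_second)

# A different analysis of the same data yields a different fingerprint.
fit_other <- t.test(extra ~ group, data = sleep, var.equal = TRUE)
other_analysis <- identical(fp_first, digest(fit_other, algo = "sha256"))
```

Here same_analysis is TRUE and other_analysis is FALSE: the whole result object, including test statistic, confidence interval, and method label, enters the fingerprint.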

Chunk Options

The package uses the standard rendering facilities of the knitr package and thus supports all standard code chunk options known from R Markdown documents, such as:

  • echo: show or hide the R code in the output
  • eval: evaluate the R code
  • include: FALSE hides both the code and the output
  • message: show or hide messages
  • warning: show or hide warnings
  • error: show or hide errors

Further typical chunk options control output and formatting options (e.g., fig.width or fig.height).

Notes

Do not store critical or large data as plain content in reproducibleR chunks. In particular, do not store raw data (it may be too large, and storing it may breach data protection laws or raise privacy issues) or passwords (a security risk, as they would be stored in clear text). Do not subject results of current-date or current-time functions to reproducibility tests, as they are expected to change between runs. Make sure to set a random seed if your analysis is based on random numbers, and note that results from the default random number generator may vary between R versions.
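The advice on random seeds and time stamps can be checked directly in base R. The seed value below is arbitrary.

```r
# Without a fixed seed, two runs draw different numbers.
run_a <- rnorm(5)
run_b <- rnorm(5)
identical(run_a, run_b)

# With the same seed, the draws are repeated exactly.
set.seed(2024)
seeded_a <- rnorm(5)
set.seed(2024)
seeded_b <- rnorm(5)
identical(seeded_a, seeded_b)

# Time stamps change on every run and should stay outside tested chunks.
stamp <- Sys.time()
```

If results must match across R versions, base R's RNGversion() can request the generator defaults of a specific R version before the seed is set.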

Troubleshooting

  • Some variables that I use in my code chunk are not tested for reproducibility. Answer: Only variables that are newly declared within a code chunk are checked, not every variable the chunk uses. This is a deliberate design decision.

License

The figures (in the directories inst/img and man/figures of this repository) are all provided under the Creative Commons Attribution 4.0 (CC BY 4.0) license. All code is provided under the MIT license.

Owner

  • Name: Andreas Brandmaier
  • Login: brandmaier
  • Kind: user
  • Location: Berlin

Professor of Research Methods. Senior Research Scientist. Computer & Data Scientist in Lifespan Psychology.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite the following paper."
title: "reproducibleRchunks"
version: "1.2.0"
type: software
authors:
  - family-names: Brandmaier
    given-names: Andreas M.
  - family-names: Peikert
    given-names: Aaron
preferred-citation:
  type: article
  authors:
    - family-names: Brandmaier
      given-names: Andreas M.
    - family-names: Peikert
      given-names: Aaron
  title: "Automated Reproducibility Testing in R Markdown"
  journal: "Collabra: Psychology"
  volume: "11"
  issue: "1"
  pages: "138638"
  year: 2025
  doi: "10.1525/collabra.138638"

GitHub Events

Total
  • Create event: 12
  • Release event: 2
  • Issues event: 6
  • Watch event: 12
  • Issue comment event: 1
  • Push event: 53
  • Pull request event: 17
Last Year
  • Create event: 12
  • Release event: 2
  • Issues event: 6
  • Watch event: 12
  • Issue comment event: 1
  • Push event: 53
  • Pull request event: 17

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 10
  • Total pull requests: 17
  • Average time to close issues: 5 days
  • Average time to close pull requests: about 3 hours
  • Total issue authors: 3
  • Total pull request authors: 2
  • Average comments per issue: 0.3
  • Average comments per pull request: 0.06
  • Merged pull requests: 14
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 6
  • Pull requests: 16
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 minute
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 0.17
  • Average comments per pull request: 0.0
  • Merged pull requests: 13
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • brandmaier (7)
  • RodDalBen (2)
  • filippogambarota (1)
Pull Request Authors
  • brandmaier (16)
  • aaronpeikert (3)
Top Labels
Issue Labels
enhancement (2)
Pull Request Labels
codex (14)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 275 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 4
  • Total maintainers: 1
cran.r-project.org: reproducibleRchunks

Automated Reproducibility Checks for R Markdown Documents

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 275 Last month
Rankings
Dependent packages count: 27.3%
Dependent repos count: 33.6%
Average: 49.2%
Downloads: 86.8%
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • digest * depends
  • jsonlite * depends
  • knitr * depends
  • testthat >= 3.0.0 suggests