wizaRdry

A Magical Framework for Collaborative & Reproducible Data Analysis

https://github.com/belieflab/wizardry

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.1%) to scientific vocabulary
Last synced: 6 months ago

Repository

A Magical Framework for Collaborative & Reproducible Data Analysis

Basic Info
  • Host: GitHub
  • Owner: belieflab
  • License: other
  • Language: R
  • Default Branch: main
  • Homepage:
  • Size: 602 KB
Statistics
  • Stars: 1
  • Watchers: 0
  • Forks: 0
  • Open Issues: 1
  • Releases: 30
Created 12 months ago · Last pushed 6 months ago
Metadata Files
Readme Changelog License

README.md

wizaRdry


A Magical Framework for Collaborative & Reproducible Data Analysis

The wizaRdry package provides a comprehensive data analysis framework specifically designed for NIH-funded computational psychiatry, neuroscience, and psychology research with built-in NIH Data Archive (NDA) integration.

Installation

You can install the latest published version from CRAN with:

```r
install.packages("wizaRdry")
```

Alternatively, you can install the latest development version from GitHub:

```r
remove.packages("wizaRdry")
rstudioapi::restartSession()
install.packages("devtools")
devtools::install_github("belieflab/wizaRdry")
```

Getting Started

After installation, follow these steps to set up your project:

1. Initialize Project Structure

Use the scry() function to create the necessary directory structure:

```r
library(wizaRdry)
scry()
```

This will create a standard directory structure that looks like this:

```
.
├── clean
│   ├── csv
│   ├── mongo
│   ├── qualtrics
│   ├── redcap
│   └── sql
├── nda
│   ├── csv
│   ├── mongo
│   ├── qualtrics
│   ├── redcap
│   └── sql
├── tmp
├── .gitignore
├── config.yml
├── main.R
├── project.Rproj
└── secrets.R
```

Each directory has a specific purpose in the wizaRdry workflow:

  • clean/ - Scripts for cleaning and processing raw data
  • nda/ - Scripts for preparing NDA submission templates
  • tmp/ - Temporary output files
  • Configuration files at the root level
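To confirm that a project matches this layout, a small base R check can list any expected directories that are absent. This is a minimal sketch: `expected_dirs` and `missing_dirs()` are hypothetical names, not part of wizaRdry, and the expected paths are simply copied from the tree above.

```r
# Directories from the tree above: source subfolders under clean/ and nda/, plus tmp/
sources <- c("csv", "mongo", "qualtrics", "redcap", "sql")
expected_dirs <- c(file.path("clean", sources), file.path("nda", sources), "tmp")

# Hypothetical helper: return the expected directories that do not exist under `root`
missing_dirs <- function(root = ".") {
  expected_dirs[!dir.exists(file.path(root, expected_dirs))]
}

# Demo on a throwaway directory that only contains clean/csv and tmp
root <- file.path(tempdir(), "wizardry_layout_demo")
dir.create(file.path(root, "clean", "csv"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "tmp"), showWarnings = FALSE)
missing_dirs(root)  # the nine directories that were not created
```

Running `missing_dirs()` with no arguments from the project root would apply the same check to a real project.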

2. Configure Secrets

Edit the generated secrets.R file to add your API credentials:

```r
# REDCap
uri <- "https://your-redcap-instance.edu/api/"
token <- "YOUR_TOKEN"

# Qualtrics
apiKeys <- c("APIKEY1", "APIKEY2")
baseUrls <- c("BASEURL1", "BASEURL2")

# MongoDB
connectionString <- "mongodb://your-connection-string"

# SQL
conn <- "BASE_URL"
uid <- "USERNAME"
pwd <- "PASSWORD"
```
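Before pulling data, it can help to check which of these variables a secrets file actually defines. The sketch below uses only base R; `required_secrets` and `check_secrets()` are hypothetical names, and the variable names are the ones from the template above. It sources the file into its own environment so credentials stay out of the global workspace.

```r
# Variable names taken from the secrets.R template, grouped by data source
required_secrets <- list(
  redcap    = c("uri", "token"),
  qualtrics = c("apiKeys", "baseUrls"),
  mongo     = "connectionString",
  sql       = c("conn", "uid", "pwd")
)

# Hypothetical helper: report, per data source, which expected variables are absent
check_secrets <- function(path, required = required_secrets) {
  env <- new.env()
  sys.source(path, envir = env)  # evaluate the file inside `env` only
  lapply(required, function(vars) {
    defined <- vapply(vars, exists, logical(1), envir = env, inherits = FALSE)
    vars[!defined]
  })
}

# Demo: a secrets file that only configures REDCap
demo_file <- tempfile(fileext = ".R")
writeLines(c('uri <- "https://your-redcap-instance.edu/api/"',
             'token <- "YOUR_TOKEN"'), demo_file)
absent <- check_secrets(demo_file)
absent$redcap  # character(0): both REDCap variables are defined
absent$sql     # "conn" "uid" "pwd"
```

Only the sources you actually use need to be filled in, so an empty result for every source is not required.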

3. Configure Study Settings

Edit the generated config.yml file to specify your study settings:

```yaml
default:
  study_alias: yourstudy
  identifier: src_subject_id
  mongo:
    database: ${study_alias}
  qualtrics:
    survey_ids:
      Institution1:
        survey_alias: "SV_QUALTRICS_ID"
  redcap:
    primary_key: record_id
    superkey: ndar_subject01
  sql:
    primary_key: 'sub_id'
    superkey: 'phi'
    schemas:
      - ${study_alias}
    pii_fields:
      - 'name_first'
      - 'name_middle'
      - 'name_last'
```
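The `${study_alias}` entries are placeholders that stand for the value of `study_alias`. How wizaRdry resolves them is not described here, so the following is only an illustration of the substitution idea, using a nested list as a stand-in for the parsed file. `resolve_placeholders()` is a hypothetical helper, not a wizaRdry function.

```r
# Stand-in for part of the parsed config.yml (values copied from the example)
cfg <- list(
  study_alias = "yourstudy",
  mongo = list(database = "${study_alias}"),
  sql = list(schemas = list("${study_alias}"))
)

# Hypothetical helper: recursively replace "${key}" in every string with the
# single-string value stored under `key` at the top level of the config.
resolve_placeholders <- function(x, top = x) {
  if (is.list(x)) return(lapply(x, resolve_placeholders, top = top))
  if (!is.character(x)) return(x)
  for (key in names(top)) {
    value <- top[[key]]
    if (is.character(value) && length(value) == 1) {
      x <- gsub(paste0("${", key, "}"), value, x, fixed = TRUE)
    }
  }
  x
}

resolved <- resolve_placeholders(cfg)
resolved$mongo$database     # "yourstudy"
resolved$sql$schemas[[1]]   # "yourstudy"
```

With this convention, renaming the study means changing `study_alias` in one place.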

4. Configure Missing Data Codes

Additionally, edit the generated config.yml file to specify missing data codes if they are used in your data.

You may use multiple types of codes (skipped, refused, missing, undefined) and multiple codes for each.

When you run nda(), these values are automatically replaced by the missing data codes associated with each NDA data structure.

```yaml
default:
  missing_data_codes:
    skipped:
      - -888    # Skip pattern/branching logic
    refused:
      - -9999   # Explicitly declined to answer (radio buttons)
      - -1      # Explicitly declined to answer (text boxes)
    missing:
      - -777    # Missing for unknown reasons
    undefined:
      - -555    # Otherwise undefined value
```
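The replacement amounts to mapping each study code to the target code of its category. The base R sketch below shows that idea only; it is not how nda() is implemented. `recode_missing()` is a hypothetical helper, and the values in `target_codes` are invented for illustration (real values come from each structure's NDA data dictionary).

```r
# Study-side codes, copied from the config.yml example above
study_codes <- list(
  skipped   = c(-888),
  refused   = c(-9999, -1),
  missing   = c(-777),
  undefined = c(-555)
)

# ASSUMPTION: made-up target codes for one structure, for illustration only
target_codes <- c(skipped = -3, refused = -7, missing = -9, undefined = -5)

# Hypothetical helper: map each study code in `x` to its category's target code.
# Matching is done against the original `x`, so one replacement can never
# trigger another.
recode_missing <- function(x, study = study_codes, target = target_codes) {
  out <- x
  for (category in names(study)) {
    out[x %in% study[[category]]] <- target[[category]]
  }
  out
}

recode_missing(c(3, -888, -1, NA, -9999, -777))
# 3 -3 -7 NA -7 -9
```

Real responses (here `3`) and genuine `NA` values pass through unchanged.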

Features

  • Project scaffolding: Creates standard directory structures with scry()
  • Cross-modal data access: Unified interface to REDCap, MongoDB, Qualtrics and SQL (beta)
  • Memory-aware parallel processing: Automatically scales to available resources
  • Field harmonization: Standardizes data fields across platforms
  • NIH Data Archive integration: Prepares submissions for NDA compliance
  • Collaborative workflow: Enables multiple researchers to work from the same data source

Core Functions

wizaRdry provides a suite of functions organized by their purpose in the data workflow:

Project Setup

```r
# Initialize project structure
scry()
```

Data Access

```r
# Get data from REDCap
demoses01 <- redcap("demoses01")

# Get data from Qualtrics
lshrs01 <- qualtrics("lshrs01")

# Get data from MongoDB
prl01 <- mongo("prl01")
```

Data Cleaning

```r
# Data Cleaning Workflow - run cleaning scripts and validation
clean("demo", "rgpts", "overfitting", csv = TRUE)
```

Data Cleaning

Cleaning scripts live in the clean/ directory and are passed to clean() by script name (e.g., "demo" for demographics).

```r
# Filter data
filtered_data <- sift(df, rows = c("sub001", "sub002"), cols = c("src_subject_id", "phenotype"))

# Merge datasets
merged_data <- meld(demo_clean, rgpts_clean)

# Parse multi-survey datasets
rune("overfitting")
```
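For readers who want to see the shape of these two operations on concrete data, here is a rough base R analogue. It is a sketch under assumptions: it shows one plausible reading of filtering by subject and column and of merging on a shared identifier, not the behavior of sift() or meld() themselves. The data frames are toy data invented for the example.

```r
# Toy cleaned datasets (invented values)
demo_clean <- data.frame(
  src_subject_id = c("sub001", "sub002", "sub003"),
  phenotype = c("hc", "sz", "hc"),
  age = c(31, 27, 45),
  stringsAsFactors = FALSE
)
rgpts_clean <- data.frame(
  src_subject_id = c("sub001", "sub003"),
  rgpts_total = c(12, 4),
  stringsAsFactors = FALSE
)

# Keep two subjects and two columns
filtered <- demo_clean[demo_clean$src_subject_id %in% c("sub001", "sub002"),
                       c("src_subject_id", "phenotype")]

# Join on the shared identifier, keeping subjects that appear in only one dataset
merged <- merge(demo_clean, rgpts_clean, by = "src_subject_id", all = TRUE)
merged  # sub002 has NA for rgpts_total because it has no rgpts row
```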

NDA Submission

```r
# NDA Submission Workflow - prepare NDA templates
nda("demoses01", "lshrs01", "prl01")
```

Data Export

```r
# Create CSV output
to.csv(df, "data_export")

# Create R data file
to.rds(df, "data_export")

# Create SPSS file
to.sav(df, "data_export")
```

Workflows

The wizaRdry package supports two distinct but complementary workflows:

1. Data Cleaning Workflow

This workflow focuses on accessing and cleaning raw data for analysis:

  • Place cleaning scripts in the clean/ directory
  • Scripts should be organized by data source: clean/qualtrics/, clean/redcap/, clean/mongo/
  • Name your cleaned datasets with the _clean suffix (e.g., rgpts_clean)
  • Access and process data with:

```r
# Process data from multiple sources in one command
clean("rgpts", "wtar", "prl", csv = TRUE)
```

This runs your cleaning scripts, performs validation tests, and exports cleaned data.
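The specific validation tests are not listed here. As an illustration of the naming convention only, the sketch below checks that a script named `rgpts` left behind an object called `rgpts_clean` containing the identifier column from the config.yml example. `check_clean_output()` is a hypothetical helper and does not reproduce what clean() actually tests.

```r
# Hypothetical check: after a cleaning script named `name` has run in `env`,
# an object "<name>_clean" should exist, be a data frame, and contain the
# identifier column (src_subject_id in the config.yml example).
check_clean_output <- function(name, env = globalenv(),
                               identifier = "src_subject_id") {
  object_name <- paste0(name, "_clean")
  if (!exists(object_name, envir = env, inherits = FALSE)) {
    return(paste("missing object:", object_name))
  }
  df <- get(object_name, envir = env)
  if (!is.data.frame(df)) return(paste(object_name, "is not a data frame"))
  if (!identifier %in% names(df)) {
    return(paste(object_name, "lacks column", identifier))
  }
  "ok"
}

# Demo in a scratch environment standing in for the R session
scratch <- new.env()
assign("rgpts_clean",
       data.frame(src_subject_id = "sub001", rgpts_total = 12),
       envir = scratch)
check_clean_output("rgpts", env = scratch)  # "ok"
check_clean_output("wtar", env = scratch)   # "missing object: wtar_clean"
```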

2. NDA Submission Workflow

This workflow prepares data for NIH Data Archive submission:

  • Place NDA remediation scripts in the nda/ directory
  • Scripts should follow NDA structure naming: nda/qualtrics/rgpts01.R
  • NDA structure names typically end with a two-digit suffix (e.g., 01)
  • Process and validate NDA structures with:

```r
# Prepare NDA submission templates
nda("rgpts01", "wtar01", "prl01", csv = TRUE)
```

This creates properly formatted NDA submission templates in the .nda/tmp directory.
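The naming conventions for this workflow can be expressed as two small base R helpers. Both `has_nda_suffix()` and `nda_script_path()` are hypothetical names used for illustration. They encode only the conventions listed above: a two-digit suffix and a script path of the form nda/<source>/<structure>.R.

```r
# Hypothetical helper: TRUE when a structure name ends in two digits
has_nda_suffix <- function(x) grepl("[0-9]{2}$", x)

# Hypothetical helper: where the remediation script for a structure would live
nda_script_path <- function(structure, source) {
  file.path("nda", source, paste0(structure, ".R"))
}

has_nda_suffix(c("rgpts01", "wtar01", "rgpts"))  # TRUE TRUE FALSE
nda_script_path("rgpts01", "qualtrics")          # "nda/qualtrics/rgpts01.R"
```

Since the suffix is described as typical rather than required, a `FALSE` here is a prompt to double-check the name, not proof that it is wrong.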

Script Examples

Data Cleaning Script Example (clean/qualtrics/rgpts.R)

```r
# Get raw data from Qualtrics
rgpts <- qualtrics("rgpts")

# Cleaning process
rgpts$interview_date <- as.Date(rgpts$interview_date, "%m/%d/%Y")
rgpts$src_subject_id <- as.numeric(rgpts$src_subject_id)

# Calculate scores: sum every item column named rgpts_q<number>
rgpts$rgpts_total <- rowSums(rgpts[, grep("^rgpts_q\\d+$", names(rgpts))], na.rm = TRUE)

# Final cleaned dataset
rgpts_clean <- rgpts
```
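The scoring step can be tried on toy data to see which columns the pattern selects and how missing items are handled. The data frame below is invented for illustration. The column names assume the `rgpts_q<number>` item naming, which is inferred from the pattern in the script rather than stated in it.

```r
# Toy data: three item columns plus one column the pattern should not match
rgpts <- data.frame(
  src_subject_id = c(1, 2),
  rgpts_q1 = c(1, 0),
  rgpts_q2 = c(NA, 2),
  rgpts_q10 = c(3, 1),
  rgpts_qtext = c("a", "b")
)

# Positions of rgpts_q1, rgpts_q2 and rgpts_q10 (rgpts_qtext is excluded)
item_cols <- grep("^rgpts_q\\d+$", names(rgpts))

# Row sums are 4 and 3; the NA in rgpts_q2 is dropped from the first sum
rgpts$rgpts_total <- rowSums(rgpts[, item_cols], na.rm = TRUE)
rgpts$rgpts_total
```

With `na.rm = TRUE` a skipped item counts as zero, so a participant with many missing items gets a lower total rather than `NA`. Whether that is appropriate depends on the instrument's scoring rules.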

NDA Remediation Script Example (nda/qualtrics/rgpts01.R)

```r
# Get data for NDA submission
rgpts01 <- qualtrics("rgpts")

# Apply NDA standards
rgpts01$src_subject_id <- as.character(rgpts01$src_subject_id)
rgpts01$interview_date <- format(as.Date(rgpts01$interview_date, "%m/%d/%Y"), "%m/%d/%Y")

# Ensure NDA structure compliance
if (!"visit" %in% names(rgpts01)) {
  rgpts01$visit <- "baseline"
}

# Additional NDA-specific processing...
```
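The date line in this script parses and then re-formats with the same pattern, which normalizes the strings to zero-padded MM/DD/YYYY text. A short base R demonstration with made-up input values:

```r
raw_dates <- c("1/5/2024", "12/31/2023")

parsed <- as.Date(raw_dates, "%m/%d/%Y")   # Date objects
nda_dates <- format(parsed, "%m/%d/%Y")    # character strings, zero-padded
nda_dates
# "01/05/2024" "12/31/2023"
```

Input that does not follow the month/day/year pattern becomes `NA` at the `as.Date()` step, so checking for `NA` after this line is a cheap way to catch malformed dates.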

Citation

If you use wizaRdry in your research, please cite it:

Kenney, J., Williams, T., Pappu, M., Spilka, M., Pratt, D., Pokorny, V., Castiello de Obeso, S., Suthaharan, P., & Horgan, C. (2025). wizaRdry: A Framework For Collaborative & Reproducible Data Analysis. R package version 0.1.0. https://github.com/belieflab/wizaRdry

License

MIT © Joshua Kenney


Owner

  • Name: Belief Lab
  • Login: belieflab
  • Kind: user
  • Location: New Haven, CT
  • Company: @Yale

GitHub Events

Total
  • Release event: 27
  • Watch event: 4
  • Delete event: 8
  • Public event: 1
  • Push event: 95
  • Pull request review event: 2
  • Pull request event: 13
  • Create event: 34
Last Year
  • Release event: 27
  • Watch event: 4
  • Delete event: 8
  • Public event: 1
  • Push event: 95
  • Pull request review event: 2
  • Pull request event: 13
  • Create event: 34

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 9
  • Average time to close issues: N/A
  • Average time to close pull requests: about 4 hours
  • Total issue authors: 0
  • Total pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 9
  • Average time to close issues: N/A
  • Average time to close pull requests: about 4 hours
  • Issue authors: 0
  • Pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • joshkenney (5)
  • christian-horgan (3)
  • jmmonday237 (1)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 193 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 2
  • Total maintainers: 1
cran.r-project.org: wizaRdry

A Magical Framework for Collaborative & Reproducible Data Analysis

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 193 Last month
Rankings
Dependent packages count: 26.7%
Dependent repos count: 32.9%
Average: 48.8%
Downloads: 86.7%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/rhub.yaml actions
  • r-hub/actions/checkout v1 composite
  • r-hub/actions/platform-info v1 composite
  • r-hub/actions/run-check v1 composite
  • r-hub/actions/setup v1 composite
  • r-hub/actions/setup-deps v1 composite
  • r-hub/actions/setup-r v1 composite
DESCRIPTION cran
  • R6 * imports
  • REDCapR * imports
  • cli * imports
  • config * imports
  • dplyr * imports
  • future * imports
  • future.apply * imports
  • haven * imports
  • httr * imports
  • jsonlite * imports
  • knitr * imports
  • mongolite * imports
  • parallel * imports
  • qualtRics * imports
  • rlang * imports
  • stringdist * imports
  • testthat * imports
  • rmarkdown * suggests