wizaRdry

A Magical Framework for Collaborative & Reproducible Data Analysis

https://github.com/belieflab/wizardry

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (Found codemeta.json file)
✓ .zenodo.json file (Found .zenodo.json file)
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (Low similarity (14.1%) to scientific vocabulary)

Last synced: 6 months ago · JSON representation

Repository

A Magical Framework for Collaborative & Reproducible Data Analysis

Basic Info

Host: GitHub
Owner: belieflab
License: other
Language: R
Default Branch: main
Homepage:
Size: 602 KB

Statistics

Stars: 1
Watchers: 0
Forks: 0
Open Issues: 1
Releases: 30

Created 12 months ago · Last pushed 6 months ago

Metadata Files

Readme Changelog License

wizaRdry

A Magical Framework for Collaborative & Reproducible Data Analysis

The wizaRdry package provides a comprehensive data analysis framework specifically designed for NIH-funded computational psychiatry, neuroscience, and psychology research with built-in NIH Data Archive (NDA) integration.

Installation

You can install the latest published version from CRAN with:

```r
install.packages("wizaRdry")
```

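Alternatively, you can install the latest development version from GitHub:

```r
# Remove any installed copy, then restart the session (requires RStudio)
remove.packages("wizaRdry")
rstudioapi::restartSession()

# Install the development version
install.packages("devtools")
devtools::install_github("belieflab/wizaRdry")
```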

Getting Started

After installation, follow these steps to set up your project:

1. Initialize Project Structure

Use the scry() function to create the necessary directory structure:

```r
library(wizaRdry)
scry()
```

This will create a standard directory structure that looks like this:

```
.
├── clean
│   ├── csv
│   ├── mongo
│   ├── qualtrics
│   ├── redcap
│   └── sql
├── nda
│   ├── csv
│   ├── mongo
│   ├── qualtrics
│   ├── redcap
│   └── sql
├── tmp
├── .gitignore
├── config.yml
├── main.R
├── project.Rproj
└── secrets.R
```

Each directory has a specific purpose in the wizaRdry workflow:

- clean/ - Scripts for cleaning and processing raw data
- nda/ - Scripts for preparing NDA submission templates
- tmp/ - Temporary output files
- Configuration files at the root level
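To confirm that the scaffold is complete before adding scripts, a quick check along these lines may help. This is a minimal base R sketch and not part of wizaRdry: `expected_layout` and `missing_paths()` are hypothetical names, and the list of paths is simply copied from the tree above.

```r
# Paths copied from the directory tree above. project.Rproj is left out
# here because its file name may differ between projects (an assumption).
sources <- c("csv", "mongo", "qualtrics", "redcap", "sql")
expected_layout <- c(
  file.path("clean", sources),
  file.path("nda", sources),
  "tmp", ".gitignore", "config.yml", "main.R", "secrets.R"
)

# Hypothetical helper: return the expected paths that do not exist under `root`.
missing_paths <- function(root = ".") {
  expected_layout[!file.exists(file.path(root, expected_layout))]
}

# Demonstration in a throwaway directory that contains only part of the layout.
demo_root <- file.path(tempdir(), "layout_demo")
dir.create(file.path(demo_root, "clean", "csv"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demo_root, "config.yml"))

still_missing <- missing_paths(demo_root)
print(still_missing)
```

In a real project you would call `missing_paths()` from the project root after running scry(); an empty result means every listed path is present.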

2. Configure Secrets

Edit the generated secrets.R file to add your API credentials:

```r
# REDCap
uri <- "https://your-redcap-instance.edu/api/"
token <- "YOUR_TOKEN"

# Qualtrics
apiKeys <- c("APIKEY1", "APIKEY2")
baseUrls <- c("BASEURL1", "BASEURL2")

# MongoDB
connectionString <- "mongodb://your-connection-string"

# SQL
conn <- "BASE_URL"
uid <- "USERNAME"
pwd <- "PASSWORD"
```
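Before pulling data, it can be useful to check that secrets.R defines the variables a given source needs. The sketch below is an illustration in base R, not a wizaRdry function: `required_secrets` and `check_secrets()` are hypothetical, and the variable names are taken from the template above. The file is sourced into its own environment so the credentials do not end up in the global workspace.

```r
# Variable names taken from the secrets.R template above, grouped by source.
required_secrets <- list(
  redcap    = c("uri", "token"),
  qualtrics = c("apiKeys", "baseUrls"),
  mongo     = c("connectionString"),
  sql       = c("conn", "uid", "pwd")
)

# Hypothetical helper: source a secrets file into a private environment and
# return the variables for `source_name` that the file does not define.
check_secrets <- function(path, source_name) {
  env <- new.env()
  sys.source(path, envir = env)
  setdiff(required_secrets[[source_name]], ls(env))
}

# Demonstration with a temporary file that only configures REDCap.
demo_secrets <- tempfile(fileext = ".R")
writeLines(c(
  'uri <- "https://your-redcap-instance.edu/api/"',
  'token <- "YOUR_TOKEN"'
), demo_secrets)

redcap_missing <- check_secrets(demo_secrets, "redcap")  # nothing missing
sql_missing    <- check_secrets(demo_secrets, "sql")     # all three SQL variables
```

Because secrets.R holds credentials, it should stay out of version control; the generated project already includes a .gitignore for that kind of file.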

3. Configure Study Settings

Edit the generated config.yml file to specify your study settings:

```yaml
default:
  study_alias: yourstudy
  identifier: src_subject_id
  mongo:
    database: ${study_alias}
  qualtrics:
    survey_ids:
      Institution1:
        survey_alias: "SV_QUALTRICS_ID"
  redcap:
    primary_key: record_id
    superkey: ndar_subject01
  sql:
    primary_key: 'sub_id'
    superkey: 'phi'
    schemas:
      - ${study_alias}
    pii_fields:
      - 'name_first'
      - 'name_middle'
      - 'name_last'
```
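The `${study_alias}` entries in this config appear to be placeholders for the value of study_alias. How wizaRdry resolves them internally is not described here, so the following base R sketch only illustrates the idea under that assumption; `expand_alias()` is a hypothetical helper, not a package function.

```r
# A few values from the example config, written as a plain R list.
cfg <- list(
  study_alias = "yourstudy",
  mongo = list(database = "${study_alias}"),
  sql = list(schemas = list("${study_alias}"))
)

# Hypothetical helper: walk the list and substitute the placeholder in strings.
expand_alias <- function(x, alias) {
  if (is.list(x)) return(lapply(x, expand_alias, alias = alias))
  if (is.character(x)) return(gsub("${study_alias}", alias, x, fixed = TRUE))
  x
}

resolved <- expand_alias(cfg, cfg$study_alias)
resolved$mongo$database   # "yourstudy"
```

Under this reading, changing study_alias in one place would also change the MongoDB database name and the SQL schema.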

4. Configure Missing Data Codes

Additionally, edit the generated config.yml file to specify missing data codes if they are used in your data.

You may use multiple types of codes (skipped, refused, missing, undefined) and multiple codes for each.

When you run nda(), these values are automatically replaced with the missing data codes defined for each NDA data structure.

```yaml
default:
  missing_data_codes:
    skipped:
      - -888 # Skip pattern/branching logic
    refused:
      - -9999 # Explicitly declined to answer (radio buttons)
      - -1 # Explicitly declined to answer (text boxes)
    missing:
      - -777 # Missing for unknown reasons
    undefined:
      - -555 # Otherwise undefined value
```
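The replacement of missing data codes can be pictured as a simple recode. The sketch below is a base R illustration, not the nda() implementation: `recode_missing()` is a hypothetical helper, and the values in `nda_codes` are made-up placeholders, since each real NDA data structure defines its own codes.

```r
# User-defined codes, mirroring the missing_data_codes example above.
missing_data_codes <- list(
  skipped   = c(-888),
  refused   = c(-9999, -1),
  missing   = c(-777),
  undefined = c(-555)
)

# ASSUMPTION: placeholder target codes for one NDA data structure.
# These numbers are invented for illustration only.
nda_codes <- c(skipped = -3, refused = -2, missing = -1, undefined = -4)

# Hypothetical helper: map each user code to the target code of its type.
# Matching is always done against the original vector `x`, so a target code
# that happens to equal another user code (here -1) is not recoded twice.
recode_missing <- function(x, user_codes, target_codes) {
  out <- x
  for (type in names(user_codes)) {
    out[x %in% user_codes[[type]]] <- target_codes[[type]]
  }
  out
}

responses <- c(3, -888, 1, -9999, -1, -777, 2, NA)
recoded <- recode_missing(responses, missing_data_codes, nda_codes)
print(recoded)
```

Real responses and genuine `NA` values pass through unchanged; only values listed in the config are touched.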

Features

- Project scaffolding: Creates standard directory structures with scry()
- Cross-modal data access: Unified interface to REDCap, MongoDB, Qualtrics and SQL (beta)
- Memory-aware parallel processing: Automatically scales to available resources
- Field harmonization: Standardizes data fields across platforms
- NIH Data Archive integration: Prepares submissions for NDA compliance
- Collaborative workflow: Enables multiple researchers to work from the same data source

Core Functions

wizaRdry provides a suite of functions organized by their purpose in the data workflow:

Project Setup

```r
# Initialize project structure
scry()
```

Data Access

```r
# Get data from REDCap
demoses01 <- redcap("demoses01")

# Get data from Qualtrics
lshrs01 <- qualtrics("lshrs01")

# Get data from MongoDB
prl01 <- mongo("prl01")
```

Data Cleaning

```r
# Data Cleaning Workflow - run cleaning scripts and validation
clean("demo", "rgpts", "overfitting", csv = TRUE)
```

Data Cleaning

Cleaning scripts are written inside the clean/ directory and called by their script name (e.g., "demo" for demographics) in clean()

```r
# Filter data
filtered_data <- sift(df, rows = c("sub001", "sub002"), cols = c("src_subject_id", "phenotype"))

# Merge datasets
merged_data <- meld(demo_clean, rgpts_clean)

# Parse multi-survey datasets
rune("overfitting")
```
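For readers who want to see what a filter and a merge amount to, here is a rough base R approximation on mock data. It is not how sift() or meld() are implemented; in particular, joining on `src_subject_id` and keeping unmatched rows are assumptions, and all values are invented.

```r
# Mock data standing in for two cleaned datasets.
demo_clean <- data.frame(
  src_subject_id = c("sub001", "sub002", "sub003"),
  phenotype = c("hc", "sz", "hc"),
  age = c(31, 27, 45),
  stringsAsFactors = FALSE
)
rgpts_clean <- data.frame(
  src_subject_id = c("sub001", "sub003"),
  rgpts_total = c(12, 4),
  stringsAsFactors = FALSE
)

# Roughly what a row/column filter does: keep two subjects and two columns.
filtered_data <- demo_clean[
  demo_clean$src_subject_id %in% c("sub001", "sub002"),
  c("src_subject_id", "phenotype")
]

# Roughly what a merge does: join on the shared identifier.
# ASSUMPTION: all = TRUE keeps subjects that appear in only one dataset.
merged_data <- merge(demo_clean, rgpts_clean, by = "src_subject_id", all = TRUE)
print(merged_data)
```

Subjects without an rgpts record end up with `NA` in `rgpts_total`, which is worth checking for before any analysis.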

NDA Submission

```r
# NDA Submission Workflow - prepare NDA templates
nda("demoses01", "lshrs01", "prl01")
```

Data Export

```r
# Create CSV output
to.csv(df, "data_export")

# Create R data file
to.rds(df, "data_export")

# Create SPSS file
to.sav(df, "data_export")
```
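After exporting, a quick round-trip check confirms the file can be read back. The sketch below uses base R's `write.csv()` and `saveRDS()` as stand-ins, because where the `to.*()` functions write their files is not stated here; the paths and the mock data frame are assumptions for illustration.

```r
# Mock data frame to export.
df <- data.frame(
  src_subject_id = c("sub001", "sub002"),
  score = c(12.5, 4.25),
  stringsAsFactors = FALSE
)

# Write to a throwaway directory (the real export location may differ).
out_dir <- file.path(tempdir(), "export_demo")
dir.create(out_dir, showWarnings = FALSE)
csv_path <- file.path(out_dir, "data_export.csv")
rds_path <- file.path(out_dir, "data_export.rds")

write.csv(df, csv_path, row.names = FALSE)
saveRDS(df, rds_path)

# Read both files back and compare with the original.
csv_back <- read.csv(csv_path, stringsAsFactors = FALSE)
rds_back <- readRDS(rds_path)
```

The RDS file preserves R types exactly, while the CSV is re-parsed on import, so identifiers that look numeric may come back as numbers.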

Workflows

The wizaRdry package supports two distinct but complementary workflows:

1. Data Cleaning Workflow

This workflow focuses on accessing and cleaning raw data for analysis:

- Place cleaning scripts in the clean/ directory
- Scripts should be organized by data source: clean/qualtrics/, clean/redcap/, clean/mongo/
- Name your cleaned datasets with the _clean suffix (e.g., rgpts_clean)
- Access and process data with:

```r
# Process data from multiple sources in one command
clean("rgpts", "wtar", "prl", csv = TRUE)
```

This runs your cleaning scripts, performs validation tests, and exports cleaned data.
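The naming conventions above (a script under clean/&lt;source&gt;/ named after the measure, producing an object with the _clean suffix) can be sketched in base R as follows. This only illustrates the conventions; it is not how clean() works internally, and `run_cleaning_script()` is a hypothetical helper.

```r
# Build a throwaway project with one cleaning script under clean/qualtrics/.
project <- file.path(tempdir(), "clean_demo")
dir.create(file.path(project, "clean", "qualtrics"),
           recursive = TRUE, showWarnings = FALSE)
writeLines(c(
  "rgpts <- data.frame(src_subject_id = c('sub001', 'sub002'), rgpts_q1 = c(2, 5))",
  "rgpts_clean <- rgpts"
), file.path(project, "clean", "qualtrics", "rgpts.R"))

# Hypothetical helper: find clean/<source>/<name>.R, run it in a private
# environment, and return the object named <name>_clean.
run_cleaning_script <- function(name, root = ".") {
  hits <- list.files(file.path(root, "clean"),
                     pattern = paste0("^", name, "\\.R$"),
                     recursive = TRUE, full.names = TRUE)
  if (length(hits) != 1) stop("expected exactly one script named ", name, ".R")
  env <- new.env()
  sys.source(hits, envir = env)
  result_name <- paste0(name, "_clean")
  if (!exists(result_name, envir = env, inherits = FALSE)) {
    stop("script did not create ", result_name)
  }
  get(result_name, envir = env)
}

cleaned <- run_cleaning_script("rgpts", root = project)
print(cleaned)
```

The sketch fails loudly when a script forgets to create its `_clean` object, which is the most common way to break this convention.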

2. NDA Submission Workflow

This workflow prepares data for NIH Data Archive submission:

- Place NDA remediation scripts in the nda/ directory
- Scripts should follow NDA structure naming: nda/qualtrics/rgpts01.R
- NDA structure names typically end with a two-digit suffix (e.g., 01)
- Process and validate NDA structures with:

```r
# Prepare NDA submission templates
nda("rgpts01", "wtar01", "prl01", csv = TRUE)
```

This creates properly formatted NDA submission templates in the .nda/tmp directory.
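The two-digit suffix convention is easy to check before calling nda(). The following base R sketch is an illustration only: the variable names are hypothetical, and the Qualtrics path is just one example of the nda/&lt;source&gt;/&lt;structure&gt;.R pattern described above.

```r
structures <- c("rgpts01", "wtar01", "prl01", "rgpts")

# TRUE when the name ends in two digits, as NDA structure names typically do.
has_suffix <- grepl("[0-9]{2}$", structures)

# Drop the suffix to recover the short measure name (e.g. "rgpts01" -> "rgpts").
base_names <- sub("[0-9]{2}$", "", structures[has_suffix])

# Where a Qualtrics-based remediation script for each structure would live.
script_paths <- file.path("nda", "qualtrics", paste0(structures[has_suffix], ".R"))

print(data.frame(structure = structures[has_suffix],
                 measure = base_names,
                 script = script_paths))
```

Here "rgpts" is flagged as lacking a suffix, which usually means a cleaning-script name was passed where an NDA structure name was intended.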

Script Examples

Data Cleaning Script Example (clean/qualtrics/rgpts.R)

```r
# Get raw data from Qualtrics
rgpts <- qualtrics("rgpts")

# Cleaning process
rgpts$interview_date <- as.Date(rgpts$interview_date, "%m/%d/%Y")
rgpts$src_subject_id <- as.numeric(rgpts$src_subject_id)

# Calculate scores (sum every item column named rgpts_q<number>)
rgpts$rgpts_total <- rowSums(rgpts[, grep("^rgpts_q\\d+$", names(rgpts))], na.rm = TRUE)

# Final cleaned dataset
rgpts_clean <- rgpts
```
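To try the scoring step without API access, the same logic can be run on mock data. This is a sketch under assumptions: the data frame below is invented, and it assumes item columns are named `rgpts_q1`, `rgpts_q2`, and so on.

```r
# Mock stand-in for the data frame that qualtrics("rgpts") would return.
rgpts <- data.frame(
  src_subject_id = c("101", "102"),
  interview_date = c("01/15/2024", "02/03/2024"),
  rgpts_q1 = c(1, 4),
  rgpts_q2 = c(0, NA),
  rgpts_q10 = c(3, 2),
  rgpts_notes = c("a", "b"),
  stringsAsFactors = FALSE
)

# Same type conversions as the cleaning script above.
rgpts$interview_date <- as.Date(rgpts$interview_date, "%m/%d/%Y")
rgpts$src_subject_id <- as.numeric(rgpts$src_subject_id)

# Select item columns by name; rgpts_notes is excluded by the pattern.
item_cols <- grep("^rgpts_q\\d+$", names(rgpts), value = TRUE)

# drop = FALSE keeps a data frame even if only one item column matches.
rgpts$rgpts_total <- rowSums(rgpts[, item_cols, drop = FALSE], na.rm = TRUE)

rgpts_clean <- rgpts
print(rgpts_clean[, c("src_subject_id", "rgpts_total")])
```

Note that `na.rm = TRUE` treats a missing item as zero, so a participant with skipped items gets a lower total rather than a missing one; whether that is appropriate depends on the instrument's scoring rules.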

NDA Remediation Script Example (nda/qualtrics/rgpts01.R)

```r
# Get data for NDA submission
rgpts01 <- qualtrics("rgpts")

# Apply NDA standards
rgpts01$src_subject_id <- as.character(rgpts01$src_subject_id)
rgpts01$interview_date <- format(as.Date(rgpts01$interview_date, "%m/%d/%Y"), "%m/%d/%Y")

# Ensure NDA structure compliance
if (!"visit" %in% names(rgpts01)) {
  rgpts01$visit <- "baseline"
}

# Additional NDA-specific processing...
```
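The date round trip in the remediation script is worth a closer look: parsing and re-formatting with the same pattern normalizes loosely written dates to zero-padded MM/DD/YYYY. The sketch below runs those steps on invented mock data in place of the Qualtrics call.

```r
# Mock stand-in for the raw data; dates are deliberately not zero-padded.
rgpts01 <- data.frame(
  src_subject_id = c(101, 102),
  interview_date = c("1/5/2024", "12/31/2023"),
  stringsAsFactors = FALSE
)

# Same steps as the remediation script above.
rgpts01$src_subject_id <- as.character(rgpts01$src_subject_id)
rgpts01$interview_date <- format(
  as.Date(rgpts01$interview_date, "%m/%d/%Y"),
  "%m/%d/%Y"
)

# Add a default visit label only when the column is absent.
if (!"visit" %in% names(rgpts01)) {
  rgpts01$visit <- "baseline"
}

print(rgpts01)
```

A date that cannot be parsed with the given pattern becomes `NA` after this step, so it is worth counting `NA` values in `interview_date` before submission.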

Citation

If you use wizaRdry in your research, please cite it:

Kenney, J., Williams, T., Pappu, M., Spilka, M., Pratt, D., Pokorny, V., Castiello de Obeso, S., Suthaharan, P., & Horgan, C. (2025). wizaRdry: A Framework For Collaborative & Reproducible Data Analysis. R package version 0.1.0. https://github.com/belieflab/wizaRdry

License

Test

Owner

Name: Belief Lab
Login: belieflab
Kind: user
Location: New Haven, CT
Company: @Yale

Website: https://belieflab.yale.edu
Repositories: 9
Profile: https://github.com/belieflab

GitHub Events

Total

Release event: 27
Watch event: 4
Delete event: 8
Public event: 1
Push event: 95
Pull request review event: 2
Pull request event: 13
Create event: 34

Last Year

Release event: 27
Watch event: 4
Delete event: 8
Public event: 1
Push event: 95
Pull request review event: 2
Pull request event: 13
Create event: 34

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 0
Total pull requests: 9
Average time to close issues: N/A
Average time to close pull requests: about 4 hours
Total issue authors: 0
Total pull request authors: 3
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 5
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 9
Average time to close issues: N/A
Average time to close pull requests: about 4 hours
Issue authors: 0
Pull request authors: 3
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 5
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

Pull Request Authors

joshkenney (5)
christian-horgan (3)
jmmonday237 (1)

Top Labels

Issue Labels

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- cran 193 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 2
Total maintainers: 1

cran.r-project.org: wizaRdry

A Magical Framework for Collaborative & Reproducible Data Analysis

Homepage: https://github.com/belieflab/wizaRdry
Documentation: http://cran.r-project.org/web/packages/wizaRdry/wizaRdry.pdf
License: MIT + file LICENSE
Latest release: 0.2.6
published 9 months ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 193 Last month

Rankings

Dependent packages count: 26.7%

Dependent repos count: 32.9%

Average: 48.8%

Downloads: 86.7%

Maintainers (1)

joshua.kenney@yale.edu

Last synced: 6 months ago

Dependencies

.github/workflows/rhub.yaml actions

r-hub/actions/checkout v1 composite
r-hub/actions/platform-info v1 composite
r-hub/actions/run-check v1 composite
r-hub/actions/setup v1 composite
r-hub/actions/setup-deps v1 composite
r-hub/actions/setup-r v1 composite

DESCRIPTION cran

R6 * imports
REDCapR * imports
cli * imports
config * imports
dplyr * imports
future * imports
future.apply * imports
haven * imports
httr * imports
jsonlite * imports
knitr * imports
mongolite * imports
parallel * imports
qualtRics * imports
rlang * imports
stringdist * imports
testthat * imports
rmarkdown * suggests
