https://github.com/assignuser/simstudy

simstudy: Illuminating research methods through data generation

Science Score: 41.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: joss.theoj.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.0%) to scientific vocabulary

Repository

Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 6
  • Releases: 0
Fork of kgoldfeld/simstudy
Created almost 5 years ago · Last pushed over 3 years ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.Rmd

---
title: "simstudy"
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```



[![R build status](https://github.com/kgoldfeld/simstudy/workflows/R-CMD-check/badge.svg?branch=main)](https://github.com/kgoldfeld/simstudy/actions){target="_blank"}
[![CRAN status](https://www.r-pkg.org/badges/version/simstudy)](https://CRAN.R-project.org/package=simstudy){target="_blank"}
[![status](https://joss.theoj.org/papers/10.21105/joss.02763/status.svg)](https://joss.theoj.org/papers/10.21105/joss.02763){target="_blank"}
[![CRAN downloads](https://cranlogs.r-pkg.org/badges/grand-total/simstudy)](https://CRAN.R-project.org/package=simstudy){target="_blank"}
[![codecov](https://app.codecov.io/gh/kgoldfeld/simstudy/branch/main/graph/badge.svg)](https://app.codecov.io/gh/kgoldfeld/simstudy){target="_blank"}
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html){target="_blank"}


The `simstudy` package is a collection of functions that allow users to generate simulated data sets in order to explore modeling techniques or better understand data-generating processes. The user defines the distributions of individual variables, specifies relationships between covariates and outcomes, and generates data based on these specifications. The final data sets can represent randomized controlled trials, repeated-measures designs, cluster randomized trials, or naturally observed data processes. Other complexities that can be added include survival data, correlated data, factorial study designs, stepped-wedge designs, and missing data processes.

Simulation using `simstudy` has two fundamental steps. The user (1) **defines** the data elements of a data set and (2) **generates** the data based on these definitions. Additional functionality exists to simulate observed or randomized **treatment assignment/exposures**, to create **longitudinal/panel** data, to create **multi-level/hierarchical** data, to create data sets with **correlated variables** based on a specified covariance structure, to **merge** data sets, to create data sets with **missing** data, and to create non-linear relationships with underlying **spline** curves.

The overarching philosophy of `simstudy` is to create data-generating processes that mimic the typical models used to fit those types of data. As a result, the parameterization of some data-generating processes may not follow the standard parameterization of the specific distribution. For example, in `simstudy` *gamma*-distributed data are generated based on the specification of a mean μ (or log(μ)) and a dispersion $d$, rather than the shape α and rate β parameters that more typically characterize the *gamma* distribution. When we estimate the parameters, we are modeling μ (or some function of μ), so we should explicitly recover the `simstudy` parameters used to generate the data, thus illuminating the relationship between the underlying data-generating processes and the models. For more details on the package, use cases, examples, and function reference, see the [documentation page](https://kgoldfeld.github.io/simstudy/articles/simstudy.html).
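
For the *gamma* example above, the link between the two parameterizations can be written out explicitly. The sketch below assumes that the dispersion $d$ is defined so that the variance equals $d\mu^2$ (the usual generalized linear model convention). Under that assumption, the shape α and rate β follow from matching the first two moments:

$$
\begin{aligned}
E[Y] &= \frac{\alpha}{\beta} = \mu, &
\operatorname{Var}(Y) &= \frac{\alpha}{\beta^{2}} = d\,\mu^{2} \\
\Rightarrow \quad \alpha &= \frac{1}{d}, &
\beta &= \frac{1}{d\,\mu}
\end{aligned}
$$

For instance, a mean of μ = 2 with dispersion $d$ = 0.5 would correspond to shape α = 2 and rate β = 1 under this assumption.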


## Installation

You can install the released version of simstudy from [CRAN](https://CRAN.R-project.org){target="_blank"} with:

``` r
install.packages("simstudy")
```

And the development version from [GitHub](https://github.com/){target="_blank"} with:

``` r
# install.packages("devtools")
devtools::install_github("kgoldfeld/simstudy")
```

## Example

Here is a simple example; the vignettes contain many more:

```{r, echo = TRUE}
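library(simstudy)
set.seed(1965)

# Define x as normal with mean 10 and variance 2,
# and y as normal with mean 3 + 0.5 * x and variance 1
def <- defData(varname = "x", formula = 10, variance = 2, dist = "normal")
def <- defData(def, varname = "y", formula = "3 + 0.5 * x", variance = 1, dist = "normal")

# Generate 250 observations from the definitions
dd <- genData(250, def)

# Randomly assign each observation to one of 4 balanced treatment groups
dd <- trtAssign(dd, nTrt = 4, grpName = "grp", balanced = TRUE)

dd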
```
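
As a follow-up sketch (not part of the original example), fitting the model that matches the definitions above should approximately recover the parameters used to generate the data. The code repeats the definitions so that it runs on its own. The object name `fit` is illustrative, and `lm()` comes from base R's `stats` package rather than from `simstudy`.

``` r
library(simstudy)
set.seed(1965)

# Same definitions as in the example above
def <- defData(varname = "x", formula = 10, variance = 2, dist = "normal")
def <- defData(def, varname = "y", formula = "3 + 0.5 * x", variance = 1, dist = "normal")
dd <- genData(250, def)

# Fit the linear model that mirrors the data-generating process
fit <- lm(y ~ x, data = dd)

# Estimates should be close to the intercept (3) and slope (0.5) in the definition
coef(fit)

# The residual variance should be close to the specified variance of 1
summary(fit)$sigma^2
```

With 250 observations the estimates will not match the definitions exactly; increasing the sample size in `genData()` should bring them closer.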

## Contributing & Support

If you find a bug or need help, please file an issue with a [reprex](https://www.tidyverse.org/help/){target="_blank"} on [GitHub](https://github.com/kgoldfeld/simstudy/issues){target="_blank"}. We are happy to accept contributions to simstudy. More information on how to propose changes or fix bugs can be found [here](https://kgoldfeld.github.io/simstudy/CONTRIBUTING.html){target="_blank"}.

## Code of Conduct

Please note that the simstudy project is released with a [Contributor Code of Conduct](https://kgoldfeld.github.io/simstudy/CODE_OF_CONDUCT.html){target="_blank"}. By contributing to this project, you agree to abide by its terms.

Owner

  • Name: Jacob Wujciak-Jens
  • Login: assignUser
  • Kind: user
  • Location: Germany
  • Company: @voltrondata

Citation (CITATION.cff)

# -----------------------------------------------------------
# CITATION file created with {cffr} R package, v0.3.0
# See also: https://docs.ropensci.org/cffr/
# -----------------------------------------------------------
 
cff-version: 1.2.0
message: 'To cite package "simstudy" in publications use:'
type: software
license: GPL-3.0-only
title: 'simstudy: Simulation of Study Data'
version: 0.5.0.9000
doi: 10.21105/joss.02763
abstract: Simulates data sets in order to explore modeling techniques or better understand
  data generating processes. The user specifies a set of relationships between covariates,
  and generates data based on these specifications. The final data sets can represent
  data from randomized control trials, repeated measure (longitudinal) designs, and
  cluster randomized trials. Missingness can be generated using various mechanisms
  (MCAR, MAR, NMAR).
authors:
- family-names: Goldfeld
  given-names: Keith
  email: keith.goldfeld@nyulangone.org
  orcid: https://orcid.org/0000-0002-0292-8780
- family-names: Wujciak-Jens
  given-names: Jacob
  email: jacob@wujciak.de
  orcid: https://orcid.org/0000-0002-7281-3989
preferred-citation:
  type: article
  title: 'simstudy: Illuminating research methods through data generation'
  authors:
  - family-names: Goldfeld
    given-names: Keith
    email: keith.goldfeld@nyulangone.org
    orcid: https://orcid.org/0000-0002-0292-8780
  - family-names: Wujciak-Jens
    given-names: Jacob
    email: jacob@wujciak.de
    orcid: https://orcid.org/0000-0002-7281-3989
  publisher:
    name: The Open Journal
  journal: Journal of Open Source Software
  year: '2020'
  volume: '5'
  issue: '54'
  url: https://doi.org/10.21105/joss.02763
  doi: 10.21105/joss.02763
  start: '2763'
repository: https://CRAN.R-project.org/package=simstudy
repository-code: https://github.com/kgoldfeld/simstudy
url: https://kgoldfeld.github.io/simstudy/
date-released: '2022-07-08'
contact:
- family-names: Goldfeld
  given-names: Keith
  email: keith.goldfeld@nyulangone.org
  orcid: https://orcid.org/0000-0002-0292-8780
keywords:
- data-generation
- data-simulation
- r
- simulation
- statistical-models
references:
- type: software
  title: 'R: A Language and Environment for Statistical Computing'
  notes: Depends
  authors:
  - name: R Core Team
  location:
    name: Vienna, Austria
  year: '2022'
  url: https://www.R-project.org/
  institution:
    name: R Foundation for Statistical Computing
  version: '>= 3.3.0'
- type: software
  title: data.table
  abstract: 'data.table: Extension of `data.frame`'
  notes: Imports
  authors:
  - family-names: Dowle
    given-names: Matt
    email: mattjdowle@gmail.com
  - family-names: Srinivasan
    given-names: Arun
    email: asrini@pm.me
  year: '2022'
  url: https://CRAN.R-project.org/package=data.table
- type: software
  title: glue
  abstract: 'glue: Interpreted String Literals'
  notes: Imports
  authors:
  - family-names: Hester
    given-names: Jim
    orcid: https://orcid.org/0000-0002-2739-7082
  - family-names: Bryan
    given-names: Jennifer
    email: jenny@rstudio.com
    orcid: https://orcid.org/0000-0002-6983-2759
  year: '2022'
  url: https://CRAN.R-project.org/package=glue
- type: software
  title: methods
  abstract: 'R: A Language and Environment for Statistical Computing'
  notes: Imports
  authors:
  - name: R Core Team
  location:
    name: Vienna, Austria
  year: '2022'
  url: https://www.R-project.org/
  institution:
    name: R Foundation for Statistical Computing
- type: software
  title: mvnfast
  abstract: 'mvnfast: Fast Multivariate Normal and Student''s t Methods'
  notes: Imports
  authors:
  - family-names: Fasiolo
    given-names: Matteo
    email: matteo.fasiolo@gmail.com
  year: '2022'
  url: https://CRAN.R-project.org/package=mvnfast
- type: software
  title: mvtnorm
  abstract: 'mvtnorm: Multivariate Normal and t Distributions'
  notes: Imports
  authors:
  - family-names: Genz
    given-names: Alan
  - family-names: Bretz
    given-names: Frank
  - family-names: Miwa
    given-names: Tetsuhisa
  - family-names: Mi
    given-names: Xuefei
  - family-names: Hothorn
    given-names: Torsten
    email: Torsten.Hothorn@R-project.org
    orcid: https://orcid.org/0000-0001-8301-0471
  year: '2022'
  url: https://CRAN.R-project.org/package=mvtnorm
- type: software
  title: Rcpp
  abstract: 'Rcpp: Seamless R and C++ Integration'
  notes: Imports
  authors:
  - family-names: Eddelbuettel
    given-names: Dirk
  - family-names: Francois
    given-names: Romain
  - family-names: Allaire
    given-names: JJ
  - family-names: Ushey
    given-names: Kevin
  - family-names: Kou
    given-names: Qiang
  - family-names: Russell
    given-names: Nathan
  - family-names: Ucar
    given-names: Inaki
  - family-names: Bates
    given-names: Douglas
  - family-names: Chambers
    given-names: John
  year: '2022'
  url: https://CRAN.R-project.org/package=Rcpp
- type: software
  title: backports
  abstract: 'backports: Reimplementations of Functions Introduced Since R-3.0.0'
  notes: Imports
  authors:
  - family-names: Lang
    given-names: Michel
    email: michellang@gmail.com
    orcid: https://orcid.org/0000-0001-9754-0393
  - name: R Core Team
  year: '2022'
  url: https://CRAN.R-project.org/package=backports
- type: software
  title: covr
  abstract: 'covr: Test Coverage for Packages'
  notes: Suggests
  authors:
  - family-names: Hester
    given-names: Jim
    email: james.f.hester@gmail.com
  year: '2022'
  url: https://CRAN.R-project.org/package=covr
- type: software
  title: dplyr
  abstract: 'dplyr: A Grammar of Data Manipulation'
  notes: Suggests
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
    orcid: https://orcid.org/0000-0003-4757-117X
  - family-names: François
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Henry
    given-names: Lionel
  - family-names: Müller
    given-names: Kirill
    orcid: https://orcid.org/0000-0002-1416-3412
  year: '2022'
  url: https://CRAN.R-project.org/package=dplyr
- type: software
  title: formatR
  abstract: 'formatR: Format R Code Automatically'
  notes: Suggests
  authors:
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  year: '2022'
  url: https://CRAN.R-project.org/package=formatR
- type: software
  title: gee
  abstract: 'gee: Generalized Estimation Equation Solver'
  notes: Suggests
  authors:
  - family-names: Carey
    given-names: Vincent J
  year: '2022'
  url: https://CRAN.R-project.org/package=gee
- type: software
  title: ggplot2
  abstract: 'ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics'
  notes: Suggests
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
    orcid: https://orcid.org/0000-0003-4757-117X
  - family-names: Chang
    given-names: Winston
    orcid: https://orcid.org/0000-0002-1576-2126
  - family-names: Henry
    given-names: Lionel
  - family-names: Pedersen
    given-names: Thomas Lin
    email: thomas.pedersen@rstudio.com
    orcid: https://orcid.org/0000-0002-5147-4711
  - family-names: Takahashi
    given-names: Kohske
  - family-names: Wilke
    given-names: Claus
    orcid: https://orcid.org/0000-0002-7470-9261
  - family-names: Woo
    given-names: Kara
    orcid: https://orcid.org/0000-0002-5125-4188
  - family-names: Yutani
    given-names: Hiroaki
    orcid: https://orcid.org/0000-0002-3385-7233
  - family-names: Dunnington
    given-names: Dewey
    orcid: https://orcid.org/0000-0002-9415-4582
  year: '2022'
  url: https://CRAN.R-project.org/package=ggplot2
- type: software
  title: grid
  abstract: 'R: A Language and Environment for Statistical Computing'
  notes: Suggests
  authors:
  - name: R Core Team
  location:
    name: Vienna, Austria
  year: '2022'
  url: https://www.R-project.org/
  institution:
    name: R Foundation for Statistical Computing
- type: software
  title: gridExtra
  abstract: 'gridExtra: Miscellaneous Functions for "Grid" Graphics'
  notes: Suggests
  authors:
  - family-names: Auguie
    given-names: Baptiste
    email: baptiste.auguie@gmail.com
  year: '2022'
  url: https://CRAN.R-project.org/package=gridExtra
- type: software
  title: hedgehog
  abstract: 'hedgehog: Property-Based Testing'
  notes: Suggests
  authors:
  - family-names: Campbell
    given-names: Huw
    email: huw.campbell@gmail.com
  year: '2022'
  url: https://CRAN.R-project.org/package=hedgehog
- type: software
  title: knitr
  abstract: 'knitr: A General-Purpose Package for Dynamic Report Generation in R'
  notes: Suggests
  authors:
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  year: '2022'
  url: https://CRAN.R-project.org/package=knitr
- type: software
  title: magrittr
  abstract: 'magrittr: A Forward-Pipe Operator for R'
  notes: Suggests
  authors:
  - family-names: Bache
    given-names: Stefan Milton
    email: stefan@stefanbache.dk
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2022'
  url: https://CRAN.R-project.org/package=magrittr
- type: software
  title: Matrix
  abstract: 'Matrix: Sparse and Dense Matrix Classes and Methods'
  notes: Suggests
  authors:
  - family-names: Bates
    given-names: Douglas
  - family-names: Maechler
    given-names: Martin
    email: mmaechler+Matrix@gmail.com
    orcid: https://orcid.org/0000-0002-8685-9910
  - family-names: Jagan
    given-names: Mikael
    orcid: https://orcid.org/0000-0002-3542-2938
  year: '2022'
  url: https://CRAN.R-project.org/package=Matrix
- type: software
  title: mgcv
  abstract: 'mgcv: Mixed GAM Computation Vehicle with Automatic Smoothness Estimation'
  notes: Suggests
  authors:
  - family-names: Wood
    given-names: Simon
    email: simon.wood@r-project.org
  year: '2022'
  url: https://CRAN.R-project.org/package=mgcv
- type: software
  title: ordinal
  abstract: 'ordinal: Regression Models for Ordinal Data'
  notes: Suggests
  authors:
  - family-names: Christensen
    given-names: Rune Haubo Bojesen
    email: rune.haubo@gmail.com
  year: '2022'
  url: https://CRAN.R-project.org/package=ordinal
- type: software
  title: pracma
  abstract: 'pracma: Practical Numerical Math Functions'
  notes: Suggests
  authors:
  - family-names: Borchers
    given-names: Hans W.
    email: hwborchers@googlemail.com
  year: '2022'
  url: https://CRAN.R-project.org/package=pracma
- type: software
  title: rmarkdown
  abstract: 'rmarkdown: Dynamic Documents for R'
  notes: Suggests
  authors:
  - family-names: Allaire
    given-names: JJ
    email: jj@rstudio.com
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  - family-names: McPherson
    given-names: Jonathan
    email: jonathan@rstudio.com
  - family-names: Luraschi
    given-names: Javier
    email: javier@rstudio.com
  - family-names: Ushey
    given-names: Kevin
    email: kevin@rstudio.com
  - family-names: Atkins
    given-names: Aron
    email: aron@rstudio.com
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  - family-names: Cheng
    given-names: Joe
    email: joe@rstudio.com
  - family-names: Chang
    given-names: Winston
    email: winston@rstudio.com
  - family-names: Iannone
    given-names: Richard
    email: rich@rstudio.com
    orcid: https://orcid.org/0000-0003-3925-190X
  year: '2022'
  url: https://CRAN.R-project.org/package=rmarkdown
- type: software
  title: scales
  abstract: 'scales: Scale Functions for Visualization'
  notes: Suggests
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  - family-names: Seidel
    given-names: Dana
  year: '2022'
  url: https://CRAN.R-project.org/package=scales
- type: software
  title: splines
  abstract: 'R: A Language and Environment for Statistical Computing'
  notes: Suggests
  authors:
  - name: R Core Team
  location:
    name: Vienna, Austria
  year: '2022'
  url: https://www.R-project.org/
  institution:
    name: R Foundation for Statistical Computing
- type: software
  title: survival
  abstract: 'survival: Survival Analysis'
  notes: Suggests
  authors:
  - family-names: Therneau
    given-names: Terry M
    email: therneau.terry@mayo.edu
  year: '2022'
  url: https://CRAN.R-project.org/package=survival
- type: software
  title: testthat
  abstract: 'testthat: Unit Testing for R'
  notes: Suggests
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2022'
  url: https://CRAN.R-project.org/package=testthat
- type: software
  title: gtsummary
  abstract: 'gtsummary: Presentation-Ready Data Summary and Analytic Result Tables'
  notes: Suggests
  authors:
  - family-names: Sjoberg
    given-names: Daniel D.
    email: danield.sjoberg@gmail.com
    orcid: https://orcid.org/0000-0003-0862-2018
  - family-names: Curry
    given-names: Michael
    orcid: https://orcid.org/0000-0002-0261-4044
  - family-names: Larmarange
    given-names: Joseph
    orcid: https://orcid.org/0000-0001-7097-700X
  - family-names: Lavery
    given-names: Jessica
    orcid: https://orcid.org/0000-0002-2746-5647
  - family-names: Whiting
    given-names: Karissa
    orcid: https://orcid.org/0000-0002-4683-1868
  - family-names: Zabor
    given-names: Emily C.
    orcid: https://orcid.org/0000-0002-1402-4498
  year: '2022'
  url: https://CRAN.R-project.org/package=gtsummary
- type: software
  title: survminer
  abstract: 'survminer: Drawing Survival Curves using ''ggplot2'''
  notes: Suggests
  authors:
  - family-names: Kassambara
    given-names: Alboukadel
    email: alboukadel.kassambara@gmail.com
  - family-names: Kosinski
    given-names: Marcin
  - family-names: Biecek
    given-names: Przemyslaw
    email: przemyslaw.biecek@gmail.com
  year: '2022'
  url: https://CRAN.R-project.org/package=survminer
identifiers:
- type: url
  value: https://kgoldfeld.github.io/simstudy/dev/

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/cancel.yaml actions
  • styfle/cancel-workflow-action 0.9.1 composite
.github/workflows/covr.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pr-commands.yml actions
  • actions/checkout v2 composite
  • r-lib/actions/pr-fetch master composite
  • r-lib/actions/pr-fetch v2 composite
  • r-lib/actions/pr-push master composite
  • r-lib/actions/pr-push v2 composite
  • r-lib/actions/setup-r master composite
  • r-lib/actions/setup-r v2 composite
.github/workflows/readme.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/release-checks-manual.yml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/touchstone-comment.yaml actions
  • actions/github-script v3.1.0 composite
  • actions/github-script v3 composite
.github/workflows/touchstone-receive.yaml actions
  • actions/checkout v2 composite
  • actions/download-artifact v1 composite
  • actions/upload-artifact v1 composite
  • actions/upload-artifact v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/update-citation-cff.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-r v1 composite
  • r-lib/actions/setup-r-dependencies v1 composite
DESCRIPTION cran
  • R >= 3.3.0 depends
  • Rcpp * imports
  • backports * imports
  • data.table * imports
  • glue * imports
  • methods * imports
  • mvnfast * imports
  • mvtnorm * imports
  • Matrix * suggests
  • covr * suggests
  • dplyr * suggests
  • formatR * suggests
  • gee * suggests
  • ggplot2 * suggests
  • grid * suggests
  • gridExtra * suggests
  • gtsummary * suggests
  • hedgehog * suggests
  • knitr * suggests
  • magrittr * suggests
  • mgcv * suggests
  • ordinal * suggests
  • pracma * suggests
  • rmarkdown * suggests
  • scales * suggests
  • splines * suggests
  • survival * suggests
  • survminer * suggests
  • testthat * suggests