https://github.com/assignuser/simstudy
simstudy: Illuminating research methods through data generation
Science Score: 41.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 2 DOI reference(s) in README
- ✓ Academic publication links: joss.theoj.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (18.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: assignUser
- License: gpl-3.0
- Language: R
- Default Branch: main
- Homepage: https://kgoldfeld.github.io/simstudy/
- Size: 21.4 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 6
- Releases: 0
- Fork of: kgoldfeld/simstudy
- Created: almost 5 years ago
- Last pushed: over 3 years ago
README.Rmd
---
title: "simstudy"
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
[GitHub Actions](https://github.com/kgoldfeld/simstudy/actions){target="_blank"}
[CRAN](https://CRAN.R-project.org/package=simstudy){target="_blank"}
[JOSS paper](https://joss.theoj.org/papers/10.21105/joss.02763){target="_blank"}
[Codecov](https://app.codecov.io/gh/kgoldfeld/simstudy){target="_blank"}
[Lifecycle stages](https://lifecycle.r-lib.org/articles/stages.html){target="_blank"}
The `simstudy` package is a collection of functions that allow users to generate simulated data sets in order to explore modeling techniques or better understand data generating processes. The user defines the distributions of individual variables, specifies relationships between covariates and outcomes, and generates data based on these specifications. The final data sets can represent randomized controlled trials, repeated-measures designs, cluster randomized trials, or naturally observed data processes. Other complexities that can be added include survival data, correlated data, factorial study designs, stepped-wedge designs, and missing data processes.
Simulation using `simstudy` has two fundamental steps. The user (1) **defines** the data elements of a data set and (2) **generates** the data based on these definitions. Additional functionality exists to simulate observed or randomized **treatment assignment/exposures**, to create **longitudinal/panel** data, to create **multi-level/hierarchical** data, to create datasets with **correlated variables** based on a specified covariance structure, to **merge** datasets, to create data sets with **missing** data, and to create non-linear relationships with underlying **spline** curves.
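As one illustration of the **multi-level/hierarchical** functionality, the following is a minimal sketch of two-level data (individuals nested in clusters) built with `defData`, `genData`, `genCluster`, `defDataAdd`, and `addColumns`. The variable names (`ceffect`, `n`, `cid`), the number of clusters, and all parameter values are assumptions chosen for illustration, not values taken from the package documentation.

```r
library(simstudy)
set.seed(2024)

# Cluster level: a normally distributed cluster effect and a fixed cluster size
defC <- defData(varname = "ceffect", formula = 0, variance = 0.5,
                dist = "normal", id = "cid")
defC <- defData(defC, varname = "n", formula = 30, dist = "nonrandom")

dc <- genData(20, defC)   # 20 clusters (illustrative)

# Individual level: expand each cluster into n individual records
di <- genCluster(dc, cLevelVar = "cid", numIndsVar = "n", level1ID = "id")

# The individual outcome depends on the effect of the cluster it belongs to
defI <- defDataAdd(varname = "y", formula = "1 + ceffect", variance = 1,
                   dist = "normal")
di <- addColumns(defI, di)

head(di)
```

The same two-step pattern applies at each level: definitions first, then generation from those definitions.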
The overarching philosophy of `simstudy` is to create data generating processes that mimic the typical models used to fit those types of data. As a result, the parameterization of some data generating processes may not follow the standard parameterization of the underlying distribution. For example, in `simstudy`, *gamma*-distributed data are generated from a mean μ (or log(μ)) and a dispersion $d$, rather than the shape α and rate β parameters that more typically characterize the *gamma* distribution. When we estimate the parameters, we are modeling μ (or some function of μ), so we should explicitly recover the `simstudy` parameters used to generate the data, thus illuminating the relationship between the underlying data generating processes and the models. For more details on the package, use cases, examples, and function reference, see the [documentation page](https://kgoldfeld.github.io/simstudy/articles/simstudy.html).
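The mean/dispersion idea can be sketched in base R, without `simstudy`. The example assumes the dispersion $d$ is defined so that the variance equals $d\mu^2$; under that assumption the gamma moments give shape $\alpha = 1/d$ and rate $\beta = 1/(\mu d)$. The coefficients `b0` and `b1`, the sample size, and the seed are illustrative choices, and this is not a description of the package's internal implementation.

```r
set.seed(1965)

n  <- 5000   # sample size (illustrative)
d  <- 0.5    # dispersion, assumed to satisfy Var(y) = d * mu^2
b0 <- 1      # intercept on the log(mu) scale (illustrative)
b1 <- 0.5    # slope on the log(mu) scale (illustrative)

x  <- rnorm(n)
mu <- exp(b0 + b1 * x)

# Convert mean/dispersion to the shape/rate parameters that rgamma() expects
shape <- 1 / d
rate  <- 1 / (mu * d)
y <- rgamma(n, shape = shape, rate = rate)

# A gamma GLM with a log link models log(mu), so it estimates b0, b1,
# and the dispersion d rather than shape and rate
fit   <- glm(y ~ x, family = Gamma(link = "log"))
est   <- coef(fit)
d_hat <- summary(fit)$dispersion

est
d_hat
```

The fitted coefficients and dispersion should land close to `b0`, `b1`, and `d`, which is the sense in which the generating parameters and the model parameters line up.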
## Installation
You can install the released version of simstudy from [CRAN](https://CRAN.R-project.org){target="_blank"} with:
``` r
install.packages("simstudy")
```
And the development version from [GitHub](https://github.com/){target="_blank"} with:
``` r
# install.packages("devtools")
devtools::install_github("kgoldfeld/simstudy")
```
## Example
Here is a simple example; the vignettes contain many more:
```{r, echo = TRUE}
library(simstudy)
set.seed(1965)

# Define x ~ N(10, 2) and y ~ N(3 + 0.5 * x, 1)
def <- defData(varname = "x", formula = 10, variance = 2, dist = "normal")
def <- defData(def, varname = "y", formula = "3 + 0.5 * x", variance = 1, dist = "normal")

# Generate 250 records and assign them evenly to 4 treatment groups
dd <- genData(250, def)
dd <- trtAssign(dd, nTrt = 4, grpName = "grp", balanced = TRUE)

dd
```
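To connect the example to the package's philosophy of recovering the generating parameters, the following sketch fits a linear model to data generated from the same definitions. The larger sample size (5000 rather than 250) is an assumption made here so that the estimates are visibly close to the specified values; the model-fitting step uses base R's `lm` and is not part of `simstudy`.

```r
library(simstudy)
set.seed(1965)

# Same definitions as in the example above
def <- defData(varname = "x", formula = 10, variance = 2, dist = "normal")
def <- defData(def, varname = "y", formula = "3 + 0.5 * x", variance = 1, dist = "normal")

# A larger sample (illustrative) tightens the estimates
dd <- genData(5000, def)

# The fitted model has the same form as the data definition of y
fit <- lm(y ~ x, data = dd)

coef(fit)      # should be near the specified intercept 3 and slope 0.5
sigma(fit)^2   # should be near the specified residual variance 1
```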
## Contributing & Support
If you find a bug or need help, please file an issue with a [reprex](https://www.tidyverse.org/help/){target="_blank"} on [GitHub](https://github.com/kgoldfeld/simstudy/issues){target="_blank"}. We are happy to accept contributions to simstudy. More information on how to propose changes or fix bugs can be found in the [contributing guide](https://kgoldfeld.github.io/simstudy/CONTRIBUTING.html){target="_blank"}.
## Code of Conduct
Please note that the simstudy project is released with a [Contributor Code of Conduct](https://kgoldfeld.github.io/simstudy/CODE_OF_CONDUCT.html){target="_blank"}. By contributing to this project, you agree to abide by its terms.
Owner
- Name: Jacob Wujciak-Jens
- Login: assignUser
- Kind: user
- Location: Germany
- Company: @voltrondata
- Repositories: 11
- Profile: https://github.com/assignUser
Citation (CITATION.cff)
# -----------------------------------------------------------
# CITATION file created with {cffr} R package, v0.3.0
# See also: https://docs.ropensci.org/cffr/
# -----------------------------------------------------------
cff-version: 1.2.0
message: 'To cite package "simstudy" in publications use:'
type: software
license: GPL-3.0-only
title: 'simstudy: Simulation of Study Data'
version: 0.5.0.9000
doi: 10.21105/joss.02763
abstract: Simulates data sets in order to explore modeling techniques or better understand
  data generating processes. The user specifies a set of relationships between covariates,
  and generates data based on these specifications. The final data sets can represent
  data from randomized control trials, repeated measure (longitudinal) designs, and
  cluster randomized trials. Missingness can be generated using various mechanisms
  (MCAR, MAR, NMAR).
authors:
- family-names: Goldfeld
  given-names: Keith
  email: keith.goldfeld@nyulangone.org
  orcid: https://orcid.org/0000-0002-0292-8780
- family-names: Wujciak-Jens
  given-names: Jacob
  email: jacob@wujciak.de
  orcid: https://orcid.org/0000-0002-7281-3989
preferred-citation:
  type: article
  title: 'simstudy: Illuminating research methods through data generation'
  authors:
  - family-names: Goldfeld
    given-names: Keith
    email: keith.goldfeld@nyulangone.org
    orcid: https://orcid.org/0000-0002-0292-8780
  - family-names: Wujciak-Jens
    given-names: Jacob
    email: jacob@wujciak.de
    orcid: https://orcid.org/0000-0002-7281-3989
  publisher:
    name: The Open Journal
  journal: Journal of Open Source Software
  year: '2020'
  volume: '5'
  issue: '54'
  url: https://doi.org/10.21105/joss.02763
  doi: 10.21105/joss.02763
  start: '2763'
repository: https://CRAN.R-project.org/package=simstudy
repository-code: https://github.com/kgoldfeld/simstudy
url: https://kgoldfeld.github.io/simstudy/
date-released: '2022-07-08'
contact:
- family-names: Goldfeld
  given-names: Keith
  email: keith.goldfeld@nyulangone.org
  orcid: https://orcid.org/0000-0002-0292-8780
keywords:
- data-generation
- data-simulation
- r
- simulation
- statistical-models
identifiers:
- type: url
  value: https://kgoldfeld.github.io/simstudy/dev/
Dependencies
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v2 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/cancel.yaml
actions
- styfle/cancel-workflow-action 0.9.1 composite
.github/workflows/covr.yaml
actions
- actions/checkout v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml
actions
- actions/checkout v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pr-commands.yml
actions
- actions/checkout v2 composite
- r-lib/actions/pr-fetch master composite
- r-lib/actions/pr-fetch v2 composite
- r-lib/actions/pr-push master composite
- r-lib/actions/pr-push v2 composite
- r-lib/actions/setup-r master composite
- r-lib/actions/setup-r v2 composite
.github/workflows/readme.yaml
actions
- actions/checkout v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/release-checks-manual.yml
actions
- actions/checkout v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/touchstone-comment.yaml
actions
- actions/github-script v3.1.0 composite
- actions/github-script v3 composite
.github/workflows/touchstone-receive.yaml
actions
- actions/checkout v2 composite
- actions/download-artifact v1 composite
- actions/upload-artifact v1 composite
- actions/upload-artifact v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/update-citation-cff.yaml
actions
- actions/checkout v2 composite
- r-lib/actions/setup-r v1 composite
- r-lib/actions/setup-r-dependencies v1 composite
DESCRIPTION
cran
- R >= 3.3.0 depends
- Rcpp * imports
- backports * imports
- data.table * imports
- glue * imports
- methods * imports
- mvnfast * imports
- mvtnorm * imports
- Matrix * suggests
- covr * suggests
- dplyr * suggests
- formatR * suggests
- gee * suggests
- ggplot2 * suggests
- grid * suggests
- gridExtra * suggests
- gtsummary * suggests
- hedgehog * suggests
- knitr * suggests
- magrittr * suggests
- mgcv * suggests
- ordinal * suggests
- pracma * suggests
- rmarkdown * suggests
- scales * suggests
- splines * suggests
- survival * suggests
- survminer * suggests
- testthat * suggests