simDAG

An R-Package to Simulate Simple and Complex (longitudinal) Data from a DAG and Associated Node Information

https://github.com/robindenz1/simdag

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (21.5%) to scientific vocabulary

Keywords

causal-inference directed-acyclic-graph simulation
Last synced: 4 months ago

Repository

Basic Info
Statistics
  • Stars: 11
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 10
Created over 3 years ago · Last pushed 4 months ago
Metadata Files
Readme Changelog License Codemeta

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```


[![Project Status: Active - The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![](https://www.r-pkg.org/badges/version/simDAG?color=green)](https://cran.r-project.org/package=simDAG)
[![](http://cranlogs.r-pkg.org/badges/grand-total/simDAG?color=blue)](https://cran.r-project.org/package=simDAG)
[![R-CMD-check](https://github.com/RobinDenz1/simDAG/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/RobinDenz1/simDAG/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/RobinDenz1/simDAG/graph/badge.svg)](https://app.codecov.io/gh/RobinDenz1/simDAG)
[![](https://img.shields.io/badge/doi-10.48550/arXiv.2506.01498-green.svg)](https://doi.org/10.48550/arXiv.2506.01498)


# simDAG 

Author: Robin Denz

## Description

`simDAG` is an R-Package which can be used to generate data from a known directed acyclic graph (DAG) with associated information on distributions and causal coefficients. The root nodes are sampled first and each subsequent child node is generated according to a regression model (linear, logistic, multinomial, Cox, ...) or other function. The result is a dataset that has the same causal structure as the specified DAG and, in expectation, the same distributions and coefficients as initially specified. It also implements a comprehensive framework for conducting discrete-time simulations in a similar fashion.
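
To make the sampling order concrete, here is a minimal base-R sketch of the same idea done by hand. It does not use `simDAG` and is not how the package is implemented; the toy DAG (`x -> y <- z`), the variable names, the coefficients and the sample size are all assumptions chosen for illustration.

```R
# Hand-rolled illustration of DAG-based simulation (NOT simDAG's implementation).
# Assumed toy DAG: x -> y <- z, where x and z are root nodes.
set.seed(1)
n <- 5000

# 1. Root nodes have no parents, so they are sampled directly.
x <- rnorm(n, mean = 0, sd = 1)
z <- rbinom(n, size = 1, prob = 0.5)

# 2. The child node is generated from a linear model of its parents
#    (intercept 1, coefficients 2 and -1) plus Gaussian noise.
y <- 1 + 2 * x - 1 * z + rnorm(n, mean = 0, sd = 0.5)

toy_dat <- data.frame(x = x, z = z, y = y)

# 3. Refitting the generating model should recover values close to
#    the coefficients used above.
est <- coef(lm(y ~ x + z, data = toy_dat))
print(round(est, 2))
```

Writing this out by hand becomes tedious and error-prone once the graph has more than a few nodes, which is the situation a DAG-based specification is meant to simplify.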

## Installation

A stable version of this package can be installed from CRAN:

```R
install.packages("simDAG")
```

and the development version may be installed from GitHub using the `remotes` R-Package:

```R
library(remotes)

remotes::install_github("RobinDenz1/simDAG")
```

## Bug Reports and Feature Requests

If you encounter any bugs or have any specific feature requests, please file an [Issue](https://github.com/RobinDenz1/simDAG/issues).

## Examples

Suppose we want to generate data with the following causal structure:

where `age` is normally distributed with a mean of 50 and a standard deviation of 4 and `sex` is Bernoulli distributed with `p = 0.5` (men and women are equally likely). Both of these "root nodes" (meaning they have no parents, i.e. no arrows pointing into them) have a direct causal effect on `bmi`. The causal coefficients are 1.1 and 0.4 respectively, with an intercept of 12 and an error standard deviation of 2. `death` is modeled as a Bernoulli variable, which is caused by both `age` and `bmi` with causal coefficients of 0.1 and 0.3 respectively. As intercept we use -15.

The following code can be used to generate 100000 samples from these specifications:

```{r}
library(simDAG)

dag <- empty_dag() +
  node("age", type="rnorm", mean=50, sd=4) +
  node("sex", type="rbernoulli", p=0.5) +
  node("bmi", type="gaussian", formula= ~ 12 + age*1.1 + sex*0.4, error=2) +
  node("death", type="binomial", formula= ~ -15 + age*0.1 + bmi*0.3)

set.seed(42)

sim_dat <- sim_from_dag(dag, n_sim=100000)
```

By fitting appropriate regression models, we can check whether the data really does approximately conform to our specifications. First, let's look at `bmi`:

```{r}
mod_bmi <- glm(bmi ~ age + sex, data=sim_dat, family="gaussian")
summary(mod_bmi)
```

This seems about right. Now we look at `death`:

```{r}
mod_death <- glm(death ~ age + bmi, data=sim_dat, family="binomial")
summary(mod_death)
```

The estimated coefficients are also very close to the ones we specified. More examples can be found in the documentation and the multiple vignettes.

## Citation

If you use this package, please cite the associated article:

Denz, Robin and Nina Timmesfeld (2025). Simulating Complex Crossectional and Longitudinal Data using the simDAG R Package. arXiv preprint, doi: 10.48550/arXiv.2506.01498.

## License

2024 Robin Denz

The contents of this repository are distributed under the GNU General Public License. You can find the full text of this License in this GitHub repository. Alternatively, see .

Owner

  • Name: Robin Denz
  • Login: RobinDenz1
  • Kind: user

I am a researcher at the Ruhr-University of Bochum in Germany and am currently enrolled as a PhD Student in "Epidemiology & Clinical Research".

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "identifier": "simDAG",
  "description": "Simulate complex data from a given directed acyclic graph and information about each individual node. Root nodes are simply sampled from the specified distribution. Child Nodes are simulated according to one of many implemented regressions, such as logistic regression, linear regression, poisson regression or any other function. Also includes a comprehensive framework for discrete-time simulation, and networks-based simulation which can generate even more complex longitudinal and dependent data. For more details, see Robin Denz, Nina Timmesfeld (2025) <doi:10.48550/arXiv.2506.01498>.",
  "name": "simDAG: Simulate Data from a DAG and Associated Node Information",
  "relatedLink": "https://robindenz1.github.io/simDAG/",
  "codeRepository": "https://github.com/RobinDenz1/simDAG",
  "issueTracker": "https://github.com/RobinDenz1/simDAG/issues",
  "license": "https://spdx.org/licenses/GPL-3.0",
  "version": "0.4.1",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 4.5.0 (2025-04-11 ucrt)",
  "provider": {
    "@id": "https://cran.r-project.org",
    "@type": "Organization",
    "name": "Comprehensive R Archive Network (CRAN)",
    "url": "https://cran.r-project.org"
  },
  "author": [
    {
      "@type": "Person",
      "givenName": "Robin",
      "familyName": "Denz",
      "email": "robin.denz@rub.de"
    },
    {
      "@type": "Person",
      "givenName": "Katharina",
      "familyName": "Meiszl",
      "email": "meiszl@amib.rub-uni-bochum.de"
    }
  ],
  "maintainer": [
    {
      "@type": "Person",
      "givenName": "Robin",
      "familyName": "Denz",
      "email": "robin.denz@rub.de"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "knitr",
      "name": "knitr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=knitr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rmarkdown",
      "name": "rmarkdown",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rmarkdown"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "version": ">= 3.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "vdiffr",
      "name": "vdiffr",
      "version": ">= 1.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=vdiffr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "ggplot2",
      "name": "ggplot2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=ggplot2"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "ggforce",
      "name": "ggforce",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=ggforce"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "MASS",
      "name": "MASS",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=MASS"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "covr",
      "name": "covr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=covr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "foreach",
      "name": "foreach",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=foreach"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "doSNOW",
      "name": "doSNOW",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=doSNOW"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "doRNG",
      "name": "doRNG",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=doRNG"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "parallel",
      "name": "parallel"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "utils",
      "name": "utils"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "simr",
      "name": "simr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=simr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rsurv",
      "name": "rsurv",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rsurv"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "survival",
      "name": "survival",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=survival"
    }
  ],
  "softwareRequirements": {
    "1": {
      "@type": "SoftwareApplication",
      "identifier": "data.table",
      "name": "data.table",
      "version": ">= 1.15.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=data.table"
    },
    "2": {
      "@type": "SoftwareApplication",
      "identifier": "Rfast",
      "name": "Rfast",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=Rfast"
    },
    "3": {
      "@type": "SoftwareApplication",
      "identifier": "rlang",
      "name": "rlang",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rlang"
    },
    "4": {
      "@type": "SoftwareApplication",
      "identifier": "igraph",
      "name": "igraph",
      "version": ">= 2.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=igraph"
    },
    "5": {
      "@type": "SoftwareApplication",
      "identifier": "dagitty",
      "name": "dagitty",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dagitty"
    },
    "SystemRequirements": null
  },
  "fileSize": "1776.349KB",
  "citation": [
    {
      "@type": "CreativeWork",
      "datePublished": "2025",
      "author": [
        {
          "@type": "Person",
          "givenName": "Robin",
          "familyName": "Denz"
        },
        {
          "@type": "Person",
          "givenName": "Nina",
          "familyName": "Timmesfeld"
        }
      ],
      "name": "Simulating Complex Crossectional and Longitudinal Data using the simDAG R Package",
      "identifier": "10.48550/arXiv.2506.01498",
      "@id": "https://doi.org/10.48550/arXiv.2506.01498",
      "sameAs": "https://doi.org/10.48550/arXiv.2506.01498"
    }
  ],
  "releaseNotes": "https://github.com/RobinDenz1/simDAG/blob/master/NEWS.md",
  "readme": "https://github.com/RobinDenz1/simDAG/blob/main/README.md",
  "contIntegration": [
    "https://github.com/RobinDenz1/simDAG/actions/workflows/R-CMD-check.yaml",
    "https://app.codecov.io/gh/RobinDenz1/simDAG"
  ],
  "developmentStatus": "https://www.repostatus.org/#active",
  "keywords": [
    "causal-inference",
    "directed-acyclic-graph",
    "simulation"
  ]
}

GitHub Events

Total
  • Create event: 6
  • Issues event: 1
  • Release event: 6
  • Watch event: 4
  • Issue comment event: 5
  • Push event: 109
Last Year
  • Create event: 6
  • Issues event: 1
  • Release event: 6
  • Watch event: 4
  • Issue comment event: 5
  • Push event: 109

Packages

  • Total packages: 1
  • Total downloads:
    • cran 341 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 10
  • Total maintainers: 1
cran.r-project.org: simDAG

Simulate Data from a DAG and Associated Node Information

  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 341 Last month
Rankings
Forks count: 28.3%
Dependent packages count: 28.4%
Stargazers count: 35.0%
Dependent repos count: 37.0%
Average: 43.3%
Downloads: 88.0%

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action v4.4.1 composite
  • actions/checkout v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • data.table * depends
  • Rfast * imports
  • dplyr * imports
  • rlang * imports
  • MASS * suggests
  • covr * suggests
  • ggforce * suggests
  • ggplot2 * suggests
  • igraph * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests
  • vdiffr >= 1.0.0 suggests