https://github.com/barbarabodinier/fake

R package fake (Flexible Data Simulation Using The Multivariate Normal Distribution).

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.2%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: barbarabodinier
  • Language: R
  • Default Branch: main
  • Size: 349 KB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created almost 4 years ago · Last pushed about 3 years ago
Metadata Files
Readme

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# fake: Flexible Data Simulation Using The Multivariate Normal Distribution 


[![CRAN version](https://www.r-pkg.org/badges/version/fake)](https://cran.r-project.org/package=fake)
[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/last-month/fake?color=blue)](https://r-pkg.org/pkg/fake)
![GitHub last commit](https://img.shields.io/github/last-commit/barbarabodinier/fake?logo=GitHub&style=flat-square)
  

## Description

> This R package can be used to generate artificial data conditionally on pre-specified (simulated or user-defined) relationships between the variables and/or observations. Each observation is drawn from a multivariate Normal distribution where the mean vector and covariance matrix reflect the desired relationships. Outputs can be used to evaluate the performance of variable selection, graphical modelling, or clustering approaches by comparing the true and estimated structures.


## Installation

The released version of the package can be installed from [CRAN](https://CRAN.R-project.org) with:

```{r installation cran, eval=FALSE}
install.packages("fake")
```

The development version can be installed from [GitHub](https://github.com/):

```{r installation github, eval=FALSE}
remotes::install_github("barbarabodinier/fake")
```

## Main functions

### Linear model

```{r linear regression, eval=FALSE}
library(fake)

# Simulate 100 observations of 20 predictors and one continuous outcome
set.seed(1)
simul <- SimulateRegression(n = 100, pk = 20)
head(simul$xdata) # predictors
head(simul$ydata) # outcome
```
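The description above states that each observation is drawn from a multivariate Normal distribution whose covariance reflects the desired relationships. The following base R sketch illustrates that idea for a linear model. It is not the package's internal implementation: the covariance `sigma`, the coefficients `beta`, and the noise level are hypothetical values chosen for illustration.

```r
# Illustrative sketch in base R; not the code used inside fake.
set.seed(1)
n <- 100 # number of observations
p <- 5   # number of predictors

# Hypothetical covariance: unit variances, correlation 0.5 between neighbours
sigma <- diag(p)
sigma[abs(row(sigma) - col(sigma)) == 1] <- 0.5

# Draw n observations from N(0, sigma) using the Cholesky factor:
# if Z has independent N(0, 1) entries, Z %*% chol(sigma) has covariance sigma
z <- matrix(rnorm(n * p), nrow = n, ncol = p)
xdata <- z %*% chol(sigma)

# Hypothetical sparse coefficients: only the first two predictors contribute
beta <- c(1, -0.5, 0, 0, 0)
ydata <- xdata %*% beta + rnorm(n, sd = 0.5)

# The true structure is the set of predictors with non-zero coefficients
theta <- as.integer(beta != 0)
```

Because `theta` is known by construction, the output of a variable selection method applied to `xdata` and `ydata` can be checked against it.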

### Logistic model

```{r logistic regression, eval=FALSE}
set.seed(1)
simul <- SimulateRegression(n = 100, pk = 20, family = "binomial")
head(simul$ydata)
```
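For intuition, a binary outcome can be generated from a linear predictor through the logistic link. The base R sketch below shows one such construction. The coefficients in `beta` are hypothetical, and the way **fake** generates binary outcomes internally may differ.

```r
# Illustrative sketch in base R; not the code used inside fake.
set.seed(1)
n <- 100
p <- 5

# Independent standard Normal predictors (an assumption for simplicity)
xdata <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Hypothetical coefficients: two predictors affect the outcome
beta <- c(1, -1, 0, 0, 0)

# Linear predictor mapped to probabilities with the logistic function
eta <- drop(xdata %*% beta)
prob <- plogis(eta)

# One Bernoulli draw per observation
ydata <- rbinom(n, size = 1, prob = prob)
table(ydata)
```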

### Structural causal model

```{r structural causal model, eval=FALSE}
set.seed(1)
simul <- SimulateStructural(n = 100, pk = c(3, 2, 3))
head(simul$data)
```
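As a rough illustration of what a linear structural causal model means, the base R sketch below simulates variables in causal order, so that each variable is a linear function of its parents plus independent noise. The graph, the effect size of 0.8, and the variable names are hypothetical, and this is not a description of how `SimulateStructural()` works internally.

```r
# Illustrative sketch in base R; not the code used inside fake.
set.seed(1)
n <- 100
p <- 4

# Hypothetical directed acyclic graph:
# adjacency[i, j] = 1 means an arrow from variable i to variable j
adjacency <- matrix(0, nrow = p, ncol = p)
adjacency[1, 2] <- 1
adjacency[1, 3] <- 1
adjacency[2, 4] <- 1
adjacency[3, 4] <- 1

# Same effect size on every arrow (an arbitrary choice)
weights <- 0.8 * adjacency

# Variables are indexed in causal order (arrows only go from i to j > i),
# so each column can be filled from the columns that come before it
data <- matrix(0, nrow = n, ncol = p)
for (j in seq_len(p)) {
  data[, j] <- data %*% weights[, j] + rnorm(n)
}
head(data)
```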

### Gaussian graphical model

```{r graphical modelling, eval=FALSE}
set.seed(1)
simul <- SimulateGraphical(n = 100, pk = 20)
head(simul$data)
```
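In a Gaussian graphical model, two variables are conditionally independent given the others exactly when the corresponding off-diagonal entry of the precision (inverse covariance) matrix is zero. The base R sketch below builds a precision matrix from a small hypothetical graph and simulates data from it. The chain graph and the value -0.4 are illustrative assumptions, not defaults of `SimulateGraphical()`.

```r
# Illustrative sketch in base R; not the code used inside fake.
set.seed(1)
n <- 100
p <- 4

# Hypothetical undirected chain graph: 1 - 2 - 3 - 4
adjacency <- matrix(0, nrow = p, ncol = p)
adjacency[cbind(1:3, 2:4)] <- 1
adjacency <- adjacency + t(adjacency)

# Precision matrix with non-zero off-diagonal entries only on the edges.
# Each row has at most 0.8 in absolute off-diagonal mass against a diagonal
# of 1, so the matrix is diagonally dominant and hence positive definite.
omega <- -0.4 * adjacency
diag(omega) <- 1

# Covariance is the inverse of the precision matrix
sigma <- solve(omega)

# Draw n observations from N(0, sigma) using the Cholesky factor
data <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% chol(sigma)
head(data)
```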

### Gaussian mixture model

```{r clustering, eval=FALSE}
# Simulate three clusters of 10 observations each, with 20 variables
set.seed(1)
simul <- SimulateClustering(n = c(10, 10, 10), pk = 20)
head(simul$data)
```
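A Gaussian mixture can be pictured as groups of observations that share a covariance but differ in their mean vectors. The base R sketch below generates three such groups. The cluster means and the identity covariance are hypothetical choices for illustration and do not reflect the defaults of `SimulateClustering()`.

```r
# Illustrative sketch in base R; not the code used inside fake.
set.seed(1)
n <- c(10, 10, 10) # number of observations per cluster
p <- 2             # number of variables

# Hypothetical cluster-specific mean vectors (one row per cluster)
means <- rbind(c(0, 0), c(4, 0), c(0, 4))

# True cluster membership of each observation
membership <- rep(seq_along(n), times = n)

# Each observation is its cluster mean plus independent N(0, 1) noise
data <- means[membership, ] + matrix(rnorm(sum(n) * p), ncol = p)

table(membership)
head(data)
```

The vector `membership` plays the role of the true structure against which a clustering result can be compared.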

## Extraction and visualisation of the results

Each main function returns the true model structure in the `theta` element of its output:

```{r theta, eval=FALSE}
simul$theta
```

The functions `print()`, `summary()` and `plot()` can be used on the outputs from the main functions.
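
The description notes that simulated data can be used to evaluate methods by comparing the true and estimated structures. A minimal base R sketch of such a comparison is given below, assuming both structures are coded as binary vectors. Here `theta_true` and `theta_hat` are hypothetical inputs written by hand, not outputs of **fake** or of any particular selection method.

```r
# Illustrative sketch in base R with hand-written, hypothetical inputs.
theta_true <- c(1, 1, 0, 0, 1, 0) # true structure (1 = relevant)
theta_hat <- c(1, 0, 0, 1, 1, 0)  # structure estimated by some method

# Counts of correct and incorrect selections
tp <- sum(theta_true == 1 & theta_hat == 1) # true positives
fp <- sum(theta_true == 0 & theta_hat == 1) # false positives
fn <- sum(theta_true == 1 & theta_hat == 0) # false negatives

# Standard selection performance metrics
precision <- tp / (tp + fp)
recall <- tp / (tp + fn)
f1 <- 2 * precision * recall / (precision + recall)

c(precision = precision, recall = recall, f1 = f1)
```

For a graphical model, the same calculation could be applied to the upper triangles of the true and estimated adjacency matrices.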

## Reference

- Barbara Bodinier, Sarah Filippi, Therese Haugdahl Nost, Julien Chiquet and Marc Chadeau-Hyam. Automated calibration for stability selection in penalised regression and graphical models: a multi-OMICs network application exploring the molecular response to tobacco smoking. (2021) arXiv. [link](https://doi.org/10.48550/arXiv.2106.02521)

## Other resources

- R scripts to reproduce the simulation study (Bodinier et al. 2021) conducted using the functions in **fake** [link](https://github.com/barbarabodinier/stability_selection)

- R package **sharp** for stability selection and consensus clustering [link](https://github.com/barbarabodinier/sharp)

Owner

  • Name: Barbara Bodinier
  • Login: barbarabodinier
  • Kind: user
  • Company: Imperial College London

Postdoctoral researcher in Biostatistics

GitHub Events

Total
  • Pull request event: 1
Last Year
  • Pull request event: 1

Dependencies

DESCRIPTION cran
  • MASS * imports
  • Rdpack * imports
  • huge * imports
  • igraph * imports
  • withr >= 2.4.0 imports
  • testthat >= 3.0.0 suggests
.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite