mrgsim.parallel
Parallel simulation with mrgsolve and futures
Keywords
future
mrgsolve
parallelization
Repository
Basic Info
- Host: GitHub
- Owner: kylebaron
- Language: R
- Default Branch: main
- Homepage: https://kylebaron.github.io/mrgsim.parallel
- Size: 650 KB
Statistics
- Stars: 5
- Watchers: 4
- Forks: 0
- Open Issues: 4
- Releases: 1
Created over 6 years ago
· Last pushed 7 months ago
Metadata Files
Readme
Changelog
README.Rmd
---
title: ""
output: github_document
---
```{r,setup,include=FALSE}
knitr::opts_chunk$set(comment = '.', message=FALSE, warning = FALSE,
fig.path="man/figures/README-")
```
# mrgsim.parallel
## Overview
mrgsim.parallel facilitates parallel simulation with
mrgsolve in R. The future and parallel packages provide the parallelization.
There are 2 main workflows:
1. Split a `data_set` into chunks by ID, simulate the chunks in parallel, then
   assemble the results back to a single data frame.
2. Split an `idata_set` (individual-level parameters) into chunks by row,
   simulate the chunks in parallel, then assemble the results back to a single
   data frame.
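Both workflows follow the same split, simulate, and combine pattern. The sketch below illustrates that pattern for the first workflow using only base R and the parallel package. `toy_data` and `simulate_chunk()` are hypothetical stand-ins (no mrgsolve model is involved), and this is not meant to show how mrgsim.parallel implements chunking internally.

```r
library(parallel)

# Hypothetical stand-in for a simulation: adds one derived column per record.
simulate_chunk <- function(chunk) {
  transform(chunk, DV = amt / CL)
}

# Toy data set: 3 records for each of 8 IDs.
toy_data <- data.frame(
  ID  = rep(1:8, each = 3),
  amt = 100,
  CL  = rep(seq(0.8, 1.5, length.out = 8), each = 3)
)

# Assign whole IDs to chunks so one ID's records never straddle two chunks.
nchunk <- 4
ids    <- unique(toy_data$ID)
id_grp <- cut(seq_along(ids), breaks = nchunk, labels = FALSE)
chunks <- split(toy_data, id_grp[match(toy_data$ID, ids)])

# Forking is unavailable on Windows, so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
sims  <- mclapply(chunks, simulate_chunk, mc.cores = cores)

# Assemble the chunk results back into a single data frame.
toy_ans <- do.call(rbind, sims)
rownames(toy_ans) <- NULL
```

Because the chunks are processed independently and recombined in order, `toy_ans` should match the result of calling `simulate_chunk(toy_data)` directly.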
The parallel backend adds some overhead, so a job must be reasonably large
before parallelization pays off. Small jobs will likely take *longer* when
run in parallel, while jobs that take more than a handful of seconds could
benefit from this type of parallelization.
```{r,include = FALSE}
options(mrgsolve.soloc = "build")
```
## Backend
```{r}
library(dplyr)
library(future)
library(mrgsim.parallel)
options(future.fork.enable = TRUE, parallelly.fork.enable = TRUE, mc.cores = 4L)
```
## First workflow: split and simulate a data set
```{r}
mod <- modlib("pk2cmt", end = 168*8, delta = 1)
data <- expand.ev(amt = 100*seq(1,2000), ii = 24, addl = 27*2+2)
data <- mutate(data, CL = runif(n(), 0.7, 1.3))
head(data)
dim(data)
```
We can simulate in parallel with the future package (`future_mrgsim_d`) or the parallel package (`mc_mrgsim_d`):
```{r}
plan(multisession, workers = 4L)
system.time(ans1 <- future_mrgsim_d(mod, data, nchunk = 4L))
plan(multicore, workers = 4L)
system.time(ans1b <- future_mrgsim_d(mod, data, nchunk = 4L))
system.time(ans2 <- mc_mrgsim_d(mod, data, nchunk = 4L))
```
For comparison, run the identical simulation without parallelization:
```{r}
system.time(ans3 <- mrgsim_d(mod,data))
```
```{r}
identical(ans2,as.data.frame(ans3))
```
## Second workflow: split and simulate a batch of parameters
Set up the backend and the model:
```{r}
plan(multisession, workers = 6)
mod <- modlib("pk1cmt", end = 168*4, delta = 1)
```
For this workflow, we have a set of parameters (`idata`) along with an
event object that gets applied to every row of parameters:
```{r}
idata <- tibble(CL = runif(4000, 0.5, 1.5), ID = seq_along(CL))
head(idata)
```
```{r}
dose <- ev(amt = 100, ii = 24, addl = 27)
dose
```
Run it in parallel
```{r}
system.time(ans1 <- mc_mrgsim_ei(mod, dose, idata, nchunk = 6))
```
And without parallelization
```{r}
system.time(ans2 <- mrgsim_ei(mod, dose, idata, output = "df"))
identical(ans1,ans2)
```
## Utility functions
You can access the chunking functions for your own parallel workflows:
```{r}
dose <- ev_seq(ev(amt = 100), ev(amt = 50, ii = 12, addl = 2))
dose <- ev_rep(dose, 1:5)
dose
chunk_by_id(dose, nchunk = 2)
```
See also: `chunk_by_row`
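To make the idea of chunking by row concrete, here is a minimal base R sketch. `split_rows()` and `toy_idata` are hypothetical names introduced for illustration; this is not the package's implementation of `chunk_by_row`, only an approximation of the behavior described above (a data frame goes in, a list of row-wise chunks comes out).

```r
# Toy individual-level parameter set: one row per ID.
toy_idata <- data.frame(
  ID = 1:10,
  CL = seq(0.5, 1.4, length.out = 10)
)

# Hypothetical helper: split the rows of x into nchunk contiguous groups.
# Assumes nchunk is at least 2 and no larger than nrow(x).
split_rows <- function(x, nchunk) {
  grp <- cut(seq_len(nrow(x)), breaks = nchunk, labels = FALSE)
  unname(split(x, grp))
}

row_chunks <- split_rows(toy_idata, nchunk = 3)

# Number of rows that landed in each chunk.
vapply(row_chunks, nrow, integer(1))
```

Each element of `row_chunks` can then be handed to a worker, for example through `lapply()` or a parallel equivalent.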
## Do a dry run to check the overhead of parallelization
```{r}
plan(transparent)
system.time(x <- fu_mrgsim_d(mod, data, nchunk = 8, .dry = TRUE))
plan(multisession, workers = 8L)
system.time(x <- fu_mrgsim_d(mod, data, nchunk = 8, .dry = TRUE))
```
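The same kind of overhead check can be approximated outside the package. The sketch below assumes that a dry run skips the simulation itself, so that the elapsed time reflects only chunk dispatch and collection. It mimics that with a worker that does nothing; `payload` and `no_op()` are hypothetical names, and the timings depend entirely on the machine.

```r
library(parallel)

# Hypothetical payload: 50 chunks of 2,000 rows each.
payload <- replicate(
  50,
  data.frame(ID = seq_len(2000), CL = 1),
  simplify = FALSE
)

# A worker that does no work, so any time spent is dispatch overhead.
no_op <- function(chunk) NULL

# Forking is unavailable on Windows, so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

t_seq <- system.time(res_seq <- lapply(payload, no_op))
t_par <- system.time(res_par <- mclapply(payload, no_op, mc.cores = cores))

# Compare elapsed time for the sequential and forked runs.
c(sequential = t_seq[["elapsed"]], forked = t_par[["elapsed"]])
```

If the forked run is already slower than the sequential one with no real work to do, the actual simulation needs to be large enough to recover that difference.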
## Pass a function to post process on the worker
First, check the range of times in `ans1` from the second workflow:
```{r}
summary(ans1$time)
```
The post-processing function takes two arguments: the simulated data and the
model object.
```{r}
post <- function(sims, mod) {
  filter(sims, time > 600)
}
dose <- ev(amt = 100, ii = 24, addl = 27)
ans3 <- mc_mrgsim_ei(mod, dose, idata, nchunk = 6, .p = post)
```
```{r}
summary(ans3$time)
```
The main use case here is to summarize or otherwise reduce the volume of data
on each worker before the combined simulations are returned. If memory can
handle the full simulation volume, this post-processing could be done on the
combined data as well.
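The effect on data volume can be illustrated with a small base R sketch. Everything here is a hypothetical stand-in: `toy_sim()` fakes a simulation with hourly output through 672 hours, and `toy_post()` plays the role of the post-processing function. It only shows why filtering per chunk shrinks what has to be combined, not how the package applies `.p`.

```r
# Hypothetical stand-in for simulating one chunk of individuals.
toy_sim <- function(chunk) {
  # Cross join: every individual gets one row per output time.
  out <- merge(chunk, data.frame(time = 0:672))
  out$DV <- out$amt / out$CL * exp(-0.01 * out$time)
  out
}

# Hypothetical post-processing step: keep only the late time points.
toy_post <- function(sims) {
  sims[sims$time > 600, , drop = FALSE]
}

toy_pars   <- data.frame(ID = 1:6, amt = 100, CL = seq(0.5, 1.5, length.out = 6))
par_chunks <- split(toy_pars, rep(1:3, each = 2))

# Combine everything, then compare with post-processing each chunk first.
full  <- do.call(rbind, lapply(par_chunks, toy_sim))
small <- do.call(rbind, lapply(par_chunks, function(x) toy_post(toy_sim(x))))

c(full = nrow(full), small = nrow(small))
```

In this toy case the post-processed result keeps 72 of the 673 time points per individual, so far less data is combined at the end.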
## More info
See [inst/docs/stories.md (on GitHub only)](inst/docs/stories.md) for more details.
Owner
- Name: Kyle Baron
- Login: kylebaron
- Kind: user
- Location: Savage, MN
- Repositories: 9
- Profile: https://github.com/kylebaron
GitHub Events
Total
- Push event: 4
- Pull request event: 4
- Create event: 2
Last Year
- Push event: 4
- Pull request event: 4
- Create event: 2
Committers
Top Committers
| Name | Email | Commits |
|---|---|---|
| Kyle Baron | k****b@m****m | 139 |
Committer Domains (Top 20 + Academic)
metrumrg.com: 1
Issues and Pull Requests
All Time
- Total issues: 14
- Total pull requests: 10
- Average time to close issues: 4 months
- Average time to close pull requests: 26 days
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 0.21
- Average comments per pull request: 0.0
- Merged pull requests: 9
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: 2 days
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- kylebaron (14)
Pull Request Authors
- kylebaron (14)
- kyleam (1)
Top Labels
Issue Labels
enhancement (7)
low-risk (4)
medium-risk (3)
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 219 last month (CRAN)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 4
- Total maintainers: 1
cran.r-project.org: mrgsim.parallel
Simulate with 'mrgsolve' in Parallel
- Homepage: https://github.com/kylebaron/mrgsim.parallel
- Documentation: http://cran.r-project.org/web/packages/mrgsim.parallel/mrgsim.parallel.pdf
- License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
- Latest release: 0.3.0 (published 7 months ago)
Rankings
Stargazers count: 22.5%
Forks count: 28.8%
Dependent packages count: 29.8%
Dependent repos count: 35.5%
Average: 36.2%
Downloads: 64.4%
Maintainers (1)
Dependencies
DESCRIPTION
cran
- R >= 3.5.0 depends
- mrgsolve * depends
- callr * imports
- dplyr * imports
- fst * imports
- future * imports
- future.apply * imports
- parallel * imports
- arrow * suggests
- knitr * suggests
- qs * suggests
- rmarkdown * suggests
- testthat * suggests