mrgsim.parallel

Parallel simulation with mrgsolve and futures

https://github.com/kylebaron/mrgsim.parallel

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.0%) to scientific vocabulary

Keywords

future mrgsolve parallelization

Last synced: 6 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: kylebaron
Language: R
Default Branch: main
Homepage: https://kylebaron.github.io/mrgsim.parallel
Size: 650 KB

Statistics

Stars: 5
Watchers: 4
Forks: 0
Open Issues: 4
Releases: 1

Topics

future mrgsolve parallelization

Created over 6 years ago · Last pushed 7 months ago

Metadata Files

Readme Changelog

README.Rmd

---
title: ""
output: github_document
---

```{r,setup,include=FALSE}
knitr::opts_chunk$set(comment = '.', message=FALSE, warning = FALSE, 
                      fig.path="man/figures/README-")
```

# mrgsim.parallel



## Overview
mrgsim.parallel facilitates parallel simulation with 
mrgsolve in R. The future and parallel packages provide the parallelization.

There are 2 main workflows:

1. Split a `data_set` into chunks by ID, simulate the chunks in parallel, then 
   assemble the results back to a single data frame.
1. Split an `idata_set` (individual-level parameters) into chunks by row, 
   simulate the chunks in parallel, then assemble the results back to a single
   data frame.
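
Both workflows follow the same split, simulate, combine pattern. The base R sketch below illustrates that pattern only; it is not the package's implementation. `simulate_chunk()` is a hypothetical stand-in for a real simulation, and the chunking rule (contiguous groups of IDs) is an assumption made for the example.

```r
# Hypothetical stand-in for a simulation: one output row per input row.
simulate_chunk <- function(chunk) {
  data.frame(ID = chunk$ID, value = chunk$amt * 2)
}

# Toy data set: 6 subjects with 2 records each.
data <- data.frame(ID = rep(1:6, each = 2), amt = rep(c(100, 50), times = 6))

# Assign each ID to one of `nchunk` groups so all rows for an ID stay together.
nchunk <- 3
ids <- unique(data$ID)
chunk_of_id <- cut(seq_along(ids), breaks = nchunk, labels = FALSE)
chunks <- split(data, chunk_of_id[match(data$ID, ids)])

# lapply() runs the chunks one after another; a parallel backend would
# replace this call, e.g. parallel::mclapply() or future.apply::future_lapply().
results <- lapply(chunks, simulate_chunk)

# Assemble the per-chunk results back into a single data frame.
combined <- do.call(rbind, results)
rownames(combined) <- NULL
```

Because each ID lives in exactly one chunk, the combined result has the same rows, in the same order, as simulating the whole data set at once.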

The parallel backend adds overhead to every job. A job therefore has to be 
reasonably large before parallelization speeds it up, and small jobs will 
likely take *longer* with parallelization. Jobs that take more than a handful 
of seconds could benefit from this type of parallelization.
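
To make the overhead trade-off concrete, here is a rough cost model in base R. It is a simplifying assumption for illustration, not a measurement of this package: parallel time is taken to be a fixed overhead plus the serial time divided evenly across workers. The overhead value and job sizes are hypothetical.

```r
# Assumed cost model: fixed overhead plus perfectly divided work.
parallel_time <- function(serial_time, workers, overhead) {
  overhead + serial_time / workers
}

serial_times <- c(0.5, 2, 10, 60)  # seconds; hypothetical job sizes
est <- parallel_time(serial_times, workers = 4, overhead = 1.5)

# Compare the estimated parallel time against the serial time for each job.
comparison <- data.frame(
  serial   = serial_times,
  parallel = est,
  faster   = est < serial_times
)
comparison
```

Under these assumptions, the two smallest jobs gain nothing from four workers, while the two largest do. Real overhead depends on the backend, the model, and the amount of data moved between processes.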


```{r,include = FALSE}
options(mrgsolve.soloc = "build")
```

## Backend 

```{r}
library(dplyr)

library(future)

library(mrgsim.parallel)

options(future.fork.enable = TRUE, parallelly.fork.enable = TRUE, mc.cores = 4L)
```
## First workflow: split and simulate a data set

```{r}
mod <- modlib("pk2cmt", end = 168*8, delta = 1)

data <- expand.ev(amt = 100*seq(1,2000), ii = 24, addl = 27*2+2) 

data <- mutate(data, CL = runif(n(), 0.7, 1.3))

head(data)

dim(data)
```

We can simulate in parallel with the future package or the parallel package like this:
```{r}
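# future backend: separate background R sessions
plan(multisession, workers = 4L)
system.time(ans1 <- future_mrgsim_d(mod, data, nchunk = 4L))

# future backend: forked processes (not supported on Windows)
plan(multicore, workers = 4L)
system.time(ans1b <- future_mrgsim_d(mod, data, nchunk = 4L))

# parallel package backend
system.time(ans2 <- mc_mrgsim_d(mod, data, nchunk = 4L))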
```

For comparison, here is the identical simulation done without parallelization:
```{r}
system.time(ans3 <- mrgsim_d(mod,data))
```

```{r}
identical(ans2,as.data.frame(ans3))
```


## Second workflow: split and simulate a batch of parameters

Set up the backend and the model:
```{r}
plan(multisession, workers = 6)

mod <- modlib("pk1cmt", end = 168*4, delta = 1)
```

For this workflow, we have a set of parameters (`idata`) along with an 
event object that gets applied to every row of parameters:
```{r}
idata <- tibble(CL = runif(4000, 0.5, 1.5), ID = seq_along(CL))

head(idata)
```

```{r}
dose <- ev(amt = 100, ii = 24, addl = 27)

dose
```

Run it in parallel:
```{r}
system.time(ans1 <- mc_mrgsim_ei(mod, dose, idata, nchunk = 6))
```

And without parallelization:

```{r}
system.time(ans2 <- mrgsim_ei(mod, dose, idata, output = "df"))

identical(ans1,ans2)
```

## Utility functions 

You can access the chunking functions for your own parallel workflows:

```{r}
dose <- ev_seq(ev(amt = 100), ev(amt = 50, ii = 12, addl = 2))
dose <- ev_rep(dose, 1:5)

dose

chunk_by_id(dose, nchunk = 2)
```

See also: `chunk_by_row`
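
As a conceptual illustration of chunking by row, the base R sketch below splits a data frame into contiguous row groups of near-equal size. It is an assumption about the general idea, not the implementation or exact output of `chunk_by_row`; the `idata` values are made up for the example.

```r
# Toy individual-level parameter set: one row per ID.
idata <- data.frame(ID = 1:10, CL = seq(0.5, 1.4, length.out = 10))

nchunk <- 3

# Group labels for contiguous rows; sizes differ by at most one row.
grp <- sort(rep(seq_len(nchunk), length.out = nrow(idata)))

# A plain list of data frames, one per chunk.
chunks <- unname(split(idata, grp))

# Number of rows in each chunk.
chunk_sizes <- vapply(chunks, nrow, integer(1))
chunk_sizes
```

Binding the chunks back together with `do.call(rbind, chunks)` recovers the original rows in their original order.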

## Do a dry run to check the overhead of parallelization

```{r}
plan(transparent)
system.time(x <- fu_mrgsim_d(mod, data, nchunk = 8, .dry = TRUE))

plan(multisession, workers = 8L)
system.time(x <- fu_mrgsim_d(mod, data, nchunk = 8, .dry = TRUE))

```

## Pass a function to post process on the worker

First, check the range of times from the previous example:

```{r}
summary(ans1$time)
```

The post-processing function takes two arguments: the simulated data and the 
model object.
```{r}
post <- function(sims, mod) {
  filter(sims, time > 600)  
}

dose <- ev(amt = 100, ii = 24, addl = 27)

ans3 <- mc_mrgsim_ei(mod, dose, idata, nchunk = 6, .p = post)
```

```{r}
summary(ans3$time)

```

The main use case here is to summarize or otherwise reduce the volume of data
before returning the combined simulations. If memory can handle the full
simulation volume, this post-processing could be done on the combined
data instead.
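
The base R sketch below illustrates why post-processing on the worker helps. It uses made-up chunk results rather than real simulations, and a `post()` function with the same two-argument shape as above (the `mod` argument is unused here). For a row-wise filter like this one, filtering each chunk before combining gives the same rows as filtering after combining, but far less data is returned from each worker.

```r
# Post-processing function: keep only late time points.
post <- function(sims, mod = NULL) {
  sims[sims$time > 600, , drop = FALSE]
}

# Hypothetical per-chunk simulation output: 3 chunks, 29 time points each.
chunks <- lapply(1:3, function(i) {
  data.frame(ID = i, time = seq(0, 672, by = 24))
})

# Option 1: post-process each chunk, then combine the small results.
on_worker <- do.call(rbind, lapply(chunks, post))

# Option 2: combine everything first, then post-process.
after_combine <- post(do.call(rbind, chunks))

# Rows held after combining under each option.
c(on_worker = nrow(on_worker), combined_first = nrow(do.call(rbind, chunks)))
```

The equivalence holds for filters and other per-row operations; a summary that spans chunks (for example, a mean across all IDs) would need an extra combining step.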




## More info

See [inst/docs/stories.md (on GitHub only)](inst/docs/stories.md) for more details.

Owner

Name: Kyle Baron
Login: kylebaron
Kind: user
Location: Savage, MN

Repositories: 9
Profile: https://github.com/kylebaron

GitHub Events

Total

Push event: 4
Pull request event: 4
Create event: 2

Last Year

Push event: 4
Pull request event: 4
Create event: 2

Committers

Last synced: over 2 years ago

All Time

Total Commits: 139
Total Committers: 1
Avg Commits per committer: 139.0
Development Distribution Score (DDS): 0.0

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Kyle Baron	k**b@m**m	139

Committer Domains (Top 20 + Academic)

metrumrg.com: 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time

Total issues: 14
Total pull requests: 10
Average time to close issues: 4 months
Average time to close pull requests: 26 days
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 0.21
Average comments per pull request: 0.0
Merged pull requests: 9
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 2
Average time to close issues: N/A
Average time to close pull requests: 2 days
Issue authors: 1
Pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

kylebaron (14)

Pull Request Authors

kylebaron (14)
kyleam (1)

Top Labels

Issue Labels

enhancement (7) low-risk (4) medium-risk (3)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- cran 219 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 4
Total maintainers: 1

cran.r-project.org: mrgsim.parallel

Simulate with 'mrgsolve' in Parallel

Homepage: https://github.com/kylebaron/mrgsim.parallel
Documentation: http://cran.r-project.org/web/packages/mrgsim.parallel/mrgsim.parallel.pdf
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Latest release: 0.3.0
published 7 months ago

Versions: 4
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 219 Last month

Rankings

Stargazers count: 22.5%

Forks count: 28.8%

Dependent packages count: 29.8%

Dependent repos count: 35.5%

Average: 36.2%

Downloads: 64.4%

Maintainers (1)

kylebtwin@imap.cc

Last synced: 6 months ago

Dependencies

DESCRIPTION cran

R >= 3.5.0 depends
mrgsolve * depends
callr * imports
dplyr * imports
fst * imports
future * imports
future.apply * imports
parallel * imports
arrow * suggests
knitr * suggests
qs * suggests
rmarkdown * suggests
testthat * suggests
