jetty

Execute R functions within the context of a Docker container.

https://github.com/dmolitor/jetty

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.7%) to scientific vocabulary

Keywords

docker reproducible-research rstats
Last synced: 6 months ago

Repository

Statistics
  • Stars: 8
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 3
Created over 2 years ago · Last pushed 10 months ago
Metadata Files
Readme Changelog License

README.Rmd

---
output:
  github_document:
    toc: false
always_allow_html: yes
editor_options:
  markdown:
    wrap: sentence
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "80%",
  dpi = 300
)

# Explicitly ignore any existing .Rprofile or .Renviron files
options("jetty.ignore.rprofile" = TRUE)
options("jetty.ignore.renviron" = TRUE)
```

# jetty 


[![pkgdown](https://github.com/dmolitor/jetty/actions/workflows/pkgdown.yaml/badge.svg)](https://github.com/dmolitor/jetty/actions/workflows/pkgdown.yaml)
[![R-CMD-check](https://github.com/dmolitor/jetty/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/dmolitor/jetty/actions/workflows/R-CMD-check.yaml)
[![CRAN status](https://www.r-pkg.org/badges/version/jetty)](https://CRAN.R-project.org/package=jetty)


> Execute R functions or code blocks within a Docker container.

It may be useful, in certain circumstances, to perform a computation in a 
separate R process that is running within a Docker container. This package
attempts to achieve this!

## Features

- Calls an R function with arguments or a code block in a subprocess 
within a Docker container.

- Copies function arguments (as necessary) to the subprocess and copies the 
return value of the function/code block back to the local R session.

- Discovers and installs required packages in the Docker container at run-time.

- Copies error objects back from the subprocess. These error objects generally
do not include the stack trace from the Docker R process, although some
(rlang errors, for example) include the full stack trace.

- Shows and/or collects the standard output and standard error of the Docker
subprocess.

- Executes an R script in a subprocess within a Docker container.
The user specifies a directory to mount, enabling the script to interact with its contents.

## Installation

Install jetty from CRAN:
```{r, eval = FALSE}
install.packages("jetty")
```

Or install the development version of jetty from GitHub:
```{r, eval = FALSE}
# install.packages("pak")
pak::pkg_install("dmolitor/jetty")
```

## Synchronous, one-off R processes in a Docker container

Use `run()` to execute an R function or code block in a new R process within a
Docker container. The results are passed back directly to the local R session.

```{r}
jetty::run(function() var(iris[, 1:4]))
```

### Specifying Docker container

The Docker image used to create the container can be set via the `image`
argument, specified as a string in standard Docker format. These formats include
`username/image:tag`, `username/image`, `image:tag`, and `image`. The default
choice is `r-base:{jetty:::r_version()}` which is a bare-bones R image that mirrors
the R version running locally. For example, the following command would be
executed in the official [`r-base`](https://hub.docker.com/_/r-base)
image with the latest version of R, which comes with no packages beyond the
base set installed:

```{r, eval = FALSE}
jetty::run(function() var(iris[, 1:4]), image = "r-base:latest")
```
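
As a further illustration of the `image:tag` format, the sketch below pins the container to a specific R release rather than the latest one or the local one. The `4.3.1` tag is an assumption chosen for illustration, and the guard is only there so the snippet skips the call when jetty or the Docker CLI is unavailable:

```{r, eval = FALSE}
# Assumption: "4.3.1" is an illustrative tag of the official r-base image
image <- paste0("r-base:", "4.3.1")

# Only attempt the call when jetty and the Docker CLI are both available
if (requireNamespace("jetty", quietly = TRUE) && nzchar(Sys.which("docker"))) {
  # Report the R version found inside the container
  jetty::run(function() R.version.string, image = image)
}
```

Pinning the tag this way may help when results need to be reproduced on machines running different local R versions.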

### Passing arguments

You can pass arguments to the function by setting `args` to the list of
arguments, similar to the base `do.call` function. This is often necessary, as
the function being evaluated in the Docker R process does not have access to
variables in the parent process. For example, the following does not work:
```{r, error = TRUE}
mycars <- cars
jetty::run(function() summary(mycars))
```

But this does:
```{r}
mycars <- cars
jetty::run(function(x) summary(x), args = list(mycars))
```

### Using packages

You can use any package in the child R process, with the caveat that the
package must be installed in the Docker container. While it's recommended to
refer to it explicitly with the `::` operator, the code snippet can also call
`library()` or `require()` and will work fine. For example, the following code
snippets both work equally well:

```{r}
jetty::run(
  {
    library(Matrix);
    function(nrow, ncol) rsparsematrix(nrow, ncol, density = 1)
  },
  args = list(nrow = 10, ncol = 2)
)
```

and
```{r}
jetty::run(
  function(nrow, ncol) Matrix::rsparsematrix(nrow, ncol, density = 1),
  args = list(nrow = 10, ncol = 2)
)
```

#### Installing required packages

jetty also supports installing required packages at runtime. For example,
the following code will fail because the required packages are not installed
in the Docker image:
```{r, error = TRUE}
jetty::run(
  {
    ggplot2::ggplot(mtcars, ggplot2::aes(x = hp, y = mpg)) +
      ggplot2::geom_point()
  }
)
```

However, by setting `install_dependencies = TRUE` we can tell jetty to discover
the required packages and install them before executing the code:
```{r}
jetty::run(
  {
    ggplot2::ggplot(mtcars, ggplot2::aes(x = hp, y = mpg)) +
      ggplot2::geom_point()
  },
  install_dependencies = TRUE,
  stdout = FALSE
)
```

**Note**: this feature uses [`renv::dependencies`](https://rstudio.github.io/renv/reference/dependencies.html)
to discover the required packages, and won't handle all possible scenarios.
In particular, it won't install specific package versions (just the latest version)
and it will only install packages that are on CRAN. Use this with care!

### Error handling

jetty copies errors from the child R process to the main R session:

```{r, error = TRUE}
jetty::run(function() 1 + "A")
```

Although the errors themselves are propagated to the main R session, the stack
trace is (currently) not propagated. This means that calling functions such as
`traceback()` and `rlang::last_trace()` won't be of any help.
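
Assuming the propagated error is signalled in the main session like any other R error, it can be handled with base R's `tryCatch()`. The following is a minimal sketch; the message text and the `NULL` fallback are illustrative choices, not something jetty prescribes:

```{r, eval = FALSE}
result <- tryCatch(
  jetty::run(function() 1 + "A"),
  error = function(e) {
    # conditionMessage() extracts the text of the error that reached this session
    message("Docker R process failed: ", conditionMessage(e))
    NULL  # illustrative fallback value
  }
)
```

With this pattern a failing container call leaves `result` as `NULL` instead of stopping the surrounding script.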

### Standard output and error

By default, the standard output and error of the Docker subprocess are printed
to the R console. However, since jetty uses `system2()` to execute all Docker
commands, you can specify the `stdout` and `stderr` arguments which will
be passed directly to `system2()`. For example, the following code will print
a series of messages to the console:
```{r, eval = FALSE, message = TRUE}
jetty::run({for (i in 1:5) message(paste0("iter", i)); TRUE})
#> iter1
#> iter2
#> iter3
#> iter4
#> iter5
#> [1] TRUE
```

But you can discard this output by setting `stdout` to `FALSE`, `TRUE`, or `NULL`:
```{r, eval = FALSE, message = TRUE}
jetty::run({for (i in 1:5) message(paste0("iter", i)); TRUE}, stdout = FALSE)
#> [1] TRUE
```

For more details on controlling `stdout` and `stderr`, see the
[`system2()` documentation](https://stat.ethz.ch/R-manual/R-devel/library/base/html/system2.html).
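
Since `system2()` also accepts a file name for `stdout`, the output could in principle be redirected to a log file. This sketch assumes jetty forwards a file path to `system2()` unchanged, which the description above suggests but this example does not verify; `log_file` is a name introduced only for illustration:

```{r, eval = FALSE}
# Illustrative log location; any writable path would do
log_file <- tempfile(fileext = ".log")

# Only attempt the call when jetty and the Docker CLI are both available
if (requireNamespace("jetty", quietly = TRUE) && nzchar(Sys.which("docker"))) {
  jetty::run(
    {for (i in 1:5) message(paste0("iter", i)); TRUE},
    stdout = log_file
  )
  # Inspect whatever was written to the log file
  readLines(log_file)
}
```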

### .Rprofile and .Renviron

jetty also provides some support for `.Rprofile` and `.Renviron` files. By
default, jetty searches for files named `.Rprofile` and `.Renviron` in
the current working directory. If these files exist, jetty copies them
to the Docker execution environment, then executes the code in `.Rprofile`
and loads the environment variables in `.Renviron` before executing the provided
R code. If the `.Rprofile` file uses external packages, it is
essential to tell jetty to install required packages (as described above);
otherwise the code will fail.

The user can explicitly provide `.Rprofile` and `.Renviron` file paths via the
`r_profile` and `r_environ` arguments. For example, the following code will
attach the `.Rprofile` found in the `/man/scaffolding/` sub-directory of the
current working directory. This file simply uses the
[praise](https://github.com/rladies/praise) package to provide some
encouragement at the start of a new R session.
```{r, eval = FALSE, message = TRUE}
four <- jetty::run(
  \() 2 + 2,
  r_profile = here::here("man/scaffolding/.Rprofile"),
  install_dependencies = TRUE
)
#> Installing package into ‘/usr/local/lib/R/site-library’
#> (as ‘lib’ is unspecified)
#> trying URL 'https://r-lib.github.io/p/pak/stable/source/linux-gnu/aarch64/src/contrib/../../../../../linux/aarch64/pak_0.8.0_R-4-4_aarch64-linux.tar.gz'
#> Content type 'application/gzip' length 8847947 bytes (8.4 MB)
#> ==================================================
#> downloaded 8.4 MB
#> 
#> * installing *binary* package ‘pak’ ...
#> * DONE (pak)
#> 
#> The downloaded source packages are in
#> 	‘/tmp/RtmpNxnaJk/downloaded_packages’
#> ✔ Updated metadata database: 3.07 MB in 8 files.
#> ✔ Updating metadata database ... done
#>  
#> → Will install 1 package.
#> → Will download 1 CRAN package (6.10 kB).
#> + praise   1.0.0 [bld][dl] (6.10 kB)
#>   
#> ℹ Getting 1 pkg (6.10 kB)
#> ✔ Got praise 1.0.0 (source) (6.10 kB)
#> ℹ Building praise 1.0.0
#> ✔ Built praise 1.0.0 (403ms)
#> ✔ Installed praise 1.0.0  (7ms)
#> ✔ 1 pkg: added 1, dld 1 (6.10 kB) [3.8s]
#> You are exquisite!
```

However, as noted above, this fails if `install_dependencies = FALSE`.
```{r, error = TRUE}
four <- jetty::run(
  \() 2 + 2,
  r_profile = here::here("man/scaffolding/.Rprofile")
)
```

#### Ignoring .Rprofile and .Renviron files

If the user wants to explicitly ignore an existing `.Rprofile` or
`.Renviron` file in the current working directory, it's as simple as setting
`r_profile = NULL` and `r_environ = NULL`:
```{r}
four <- jetty::run(\() 2 + 2, r_profile = NULL, r_environ = NULL)
```

If the user wants to ignore these files for all jetty function calls, they can
do so by setting the following system environment variables:
```{r, eval=FALSE}
Sys.setenv("JETTY_IGNORE_RPROFILE" = TRUE)
Sys.setenv("JETTY_IGNORE_RENVIRON" = TRUE)
```

or, equivalently, the following R options:
```{r, eval=FALSE}
options("jetty.ignore.rprofile" = TRUE)
options("jetty.ignore.renviron" = TRUE)
```

#### Multiple .Rprofile or .Renviron files

Currently jetty supports only a single `.Rprofile` file and a single
`.Renviron` file. So, for example, if a user has a project-specific `.Rprofile`
in the current working directory at `./.Rprofile` and a user-specific
`.Rprofile` at `~/.Rprofile`, jetty will only source `./.Rprofile` and will
ignore `~/.Rprofile`. Support for multiple files is a feature I plan to add
before long.
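
Until then, one possible workaround is to concatenate the files yourself and hand the result to `r_profile`. The sketch below is not part of jetty: `combine_rprofiles()` is a hypothetical helper, and it assumes jetty accepts any readable file path for `r_profile`, regardless of the file's name:

```{r, eval = FALSE}
# Hypothetical helper, not part of jetty: concatenate the profiles that exist
# (in the order given) into one temporary file and return its path.
combine_rprofiles <- function(paths = c("~/.Rprofile", "./.Rprofile")) {
  paths <- path.expand(paths)
  paths <- paths[file.exists(paths)]
  lines <- unlist(lapply(paths, readLines, warn = FALSE))
  combined <- tempfile(fileext = ".Rprofile")
  writeLines(as.character(lines), combined)
  combined
}

# The user-level profile comes first so that project-level settings win
profile <- combine_rprofiles()

# Only attempt the call when jetty and the Docker CLI are both available
if (requireNamespace("jetty", quietly = TRUE) && nzchar(Sys.which("docker"))) {
  four <- jetty::run(\() 2 + 2, r_profile = profile)
}
```

If either profile uses external packages, `install_dependencies = TRUE` would still be needed, as described earlier.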

## Executing R scripts in a Docker container

While the primary goal of jetty is to execute a function or
code chunk in an R subprocess running within a Docker container, it also
supports the execution of entire scripts via the `run_script()` function. This
feature is useful when you want to execute a script in an isolated
environment, for example to run reproducible scientific code. It is particularly
helpful for scripts that require specific R packages, a different version of R,
or a clean environment that avoids conflicts with your system's setup.

In order to allow seamless interactions between the Docker subprocess
and the local file system, the user must specify an execution context—a
local directory that will be mounted into the Docker container. This
context directory ensures that the script can access files within it, enabling
the script to read data from or write results back to that directory.
The context directory is important because it limits the script's file access to this
directory, preventing it from interacting with files outside of the
specified scope.

For example, suppose we are working within an R project and the script we want
to execute needs access to all files within the project. We
can achieve this by setting the context directory as the full project directory:
```{r, eval = FALSE}
jetty::run_script(
  file = here::here("code/awesome_script.R"),
  context = here::here()
)
```
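
To make the role of the context directory more concrete, the sketch below builds a throwaway context containing a data file and a script, then runs the script. All file names are made up for illustration, and the script relies on an assumption that is not confirmed here: that relative paths inside the script resolve against the mounted context directory.

```{r, eval = FALSE}
# Build a throwaway context directory holding the data and the script
context <- file.path(tempdir(), "jetty-context")
dir.create(context, showWarnings = FALSE)
write.csv(cars, file.path(context, "cars.csv"), row.names = FALSE)

# Assumption: relative paths in the script resolve against the context directory
script <- file.path(context, "summarize.R")
writeLines(
  c(
    'cars <- read.csv("cars.csv")',
    'saveRDS(summary(cars), "summary.rds")'
  ),
  script
)

# Only attempt the call when jetty and the Docker CLI are both available
if (requireNamespace("jetty", quietly = TRUE) && nzchar(Sys.which("docker"))) {
  jetty::run_script(file = script, context = context)
  # If the assumption holds, the script's output appears in the local directory
  file.exists(file.path(context, "summary.rds"))
}
```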

`run_script()` and `run()` share a lot of functionality. For example,
if the script above relies on packages that aren't installed in the Docker
container, you can instruct jetty to install these packages
at runtime:
```{r, eval = FALSE}
jetty::run_script(
  file = here::here("code/awesome_script.R"),
  context = here::here(),
  install_dependencies = TRUE
)
```

All the features discussed above for synchronous, one-off R processes also apply to `run_script()`.

Owner

  • Name: Daniel Molitor
  • Login: dmolitor
  • Kind: user

GitHub Events

Total
  • Create event: 5
  • Release event: 3
  • Issues event: 4
  • Watch event: 7
  • Delete event: 1
  • Issue comment event: 2
  • Push event: 42
Last Year
  • Create event: 5
  • Release event: 3
  • Issues event: 4
  • Watch event: 7
  • Delete event: 1
  • Issue comment event: 2
  • Push event: 42

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 53
  • Total Committers: 1
  • Avg Commits per committer: 53.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 27
  • Committers: 1
  • Avg Commits per committer: 27.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Daniel Molitor m****7@g****m 53

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 2
  • Total pull requests: 0
  • Average time to close issues: 1 day
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: 1 day
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • dmolitor (2)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 157 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
cran.r-project.org: jetty

Execute R in a 'Docker' Context

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 157 Last month
Rankings
Dependent packages count: 27.5%
Dependent repos count: 33.8%
Average: 49.4%
Downloads: 87.0%
Maintainers (1)
Last synced: 7 months ago

Dependencies

DESCRIPTION cran
  • jsonlite * imports
  • rlang * imports