pbapply

Adding progress bar to '*apply' functions in R

https://github.com/psolymos/pbapply

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.4%) to scientific vocabulary

Keywords

cran progress-bar r r-package rstats rstats-package

Keywords from Contributors

ecology estimation lele rsf rspf solymos weighted-distributions
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 160
  • Watchers: 8
  • Forks: 6
  • Open Issues: 4
  • Releases: 9
Created over 11 years ago · Last pushed 7 months ago

README.md

pbapply: adding progress bar to '*apply' functions in R

A lightweight package that adds a progress bar to vectorized R functions (*apply). The implementation can easily be added to functions where showing the progress is useful (e.g. bootstrap). The type and style of the progress bar (with percentages or remaining time) can be set through options. The package supports several parallel processing backends, such as snow-type and mirai clusters, multicore-type forking, and future.

Versions

Install CRAN release version (recommended):

```R
install.packages("pbapply")
```

Development version:

```R
install.packages("pbapply", repos = "https://psolymos.r-universe.dev")
```

See user-visible changes in the NEWS file.

Use the issue tracker to report a problem, or to suggest a new feature.

How to get started?

1. You are not yet an R user

In this case, start with understanding basic programming concepts, such as data structures (matrices, data frames, indexing these), for loops and functions in R. The online version of Garrett Grolemund's Hands-On Programming with R walks you through these concepts nicely.

2. You are an R user but haven't used vectorized functions yet

Learn about vectorized functions designed to replace for loops: lapply, sapply, and apply. Here is a repository called The Road to Progress that I created to show you how to go from a for loop to lapply/sapply.

3. You are an R user familiar with vectorized functions

In this case, you can simply add pbapply::pb before your *apply functions, e.g. apply() will become pbapply::pbapply(), etc. You can guess what happens. Now if you want to speed things up a little (or a lot), try pbapply::pbapply(..., cl = 4) to use 4 cores instead of 1.

If you are a Windows user, things get a bit more complicated, but not much. Check how to work with parallel::parLapply to set up a snow-type cluster, or use a suitable future backend (see some examples below). Have a look at The Road to Progress repository to see more worked examples.
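As a minimal sketch of the switch described in this step: the function `slow_square` and its input are made up for illustration, and the fallback to `sapply()` is only there so that the snippet also runs where pbapply is not installed.

```R
# Hypothetical slow function, used only for illustration
slow_square <- function(i) {
    Sys.sleep(0.01)
    i^2
}

# Plain vectorized call
res_base <- sapply(1:20, slow_square)

# Same call with a progress bar: prefix the function name with pbapply::pb
res_pb <- if (requireNamespace("pbapply", quietly = TRUE)) {
    pbapply::pbsapply(1:20, slow_square)
} else {
    sapply(1:20, slow_square)
}

# Both calls compute the same values; only the progress reporting differs
all.equal(res_base, res_pb)
```

Adding `cl = 4` to the `pbsapply()` call would be the next step once the sequential version works.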

4. You are a seasoned R developer writing your own packages

Read on, the next section is for you.

How to add pbapply to a package

There are two ways of adding the pbapply package to another package.

1. Suggests: pbapply

Add pbapply to the Suggests field in the DESCRIPTION.

Use a conditional statement in your code to fall back on a base function in case pbapply is not installed:

```R
out <- if (requireNamespace("pbapply", quietly = TRUE)) {
    pbapply::pblapply(X, FUN, ...)
} else {
    lapply(X, FUN, ...)
}
```

See a small example package here.

2. Depends/Imports: pbapply

Add pbapply to the Depends or Imports field in the DESCRIPTION.

Use the pbapply functions either as pbapply::pblapply(), or import them in the NAMESPACE (importFrom(pbapply, pblapply)) and call them as pblapply() (without the ::). If you are using roxygen2, add the tag #' @importFrom pbapply pblapply to your documentation comments.
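With roxygen2, the import could look like the sketch below. The function `boot_means` is hypothetical and stands in for any function of your package; the snippet assumes pbapply is listed under Imports in the DESCRIPTION.

```R
#' Bootstrap means with a progress bar
#'
#' Hypothetical package function, shown only to illustrate the import tag.
#'
#' @param x numeric vector.
#' @param B number of bootstrap replicates.
#' @importFrom pbapply pblapply
#' @export
boot_means <- function(x, B = 100) {
    # pblapply() can be called without `::` because the NAMESPACE imports it
    unlist(pblapply(seq_len(B), function(i) mean(sample(x, replace = TRUE))))
}
```

Running roxygen2 on this file writes the matching importFrom(pbapply, pblapply) directive into the NAMESPACE.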

Customizing the progress bar in your package

Specify the progress bar options in the zzz.R file of the package:

```R
.onAttach <- function(libname, pkgname){
    options("pboptions" = list(
        type = if (interactive()) "timer" else "none",
        char = "-",
        txt.width = 50,
        gui.width = 300,
        style = 3,
        initial = 0,
        title = "R progress bar",
        label = "",
        nout = 100L,
        min_time = 2))
    invisible(NULL)
}
```

This will set the options and pbapply will not override these when loaded.

See a small example package here.

Suppressing the progress bar in your functions

Suppressing the progress bar is sometimes handy. By default, the progress bar is suppressed when !interactive(). In other instances, put this inside a function:

```R
pbo <- pboptions(type = "none")
on.exit(pboptions(pbo), add = TRUE)
```
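Placed in a complete function, the two lines could be used as in the sketch below. The function `quiet_means` is hypothetical, and the early return through `sapply()` only keeps the snippet runnable without pbapply.

```R
# Hypothetical function that never shows a progress bar
quiet_means <- function(x, B = 50) {
    resample_mean <- function(i) mean(sample(x, replace = TRUE))
    if (!requireNamespace("pbapply", quietly = TRUE))
        return(sapply(seq_len(B), resample_mean))
    # Switch the bar off and restore the caller's settings on exit
    pbo <- pbapply::pboptions(type = "none")
    on.exit(pbapply::pboptions(pbo), add = TRUE)
    pbapply::pbsapply(seq_len(B), resample_mean)
}

set.seed(1)
m <- quiet_means(rnorm(100))
length(m)
```

Because the previous options are restored through on.exit(), the user's own progress bar settings are untouched after the call, even if the function stops with an error.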

Working with a future backend

The future backend might require additional arguments to be set by package developers to avoid warnings for end users. Most notably, you will have to determine how to handle random number generation as part of parallel evaluation. You can pass the future.seed argument directly through .... In general, pass any additional arguments to FUN immediately after the FUN argument, and any additional arguments to the future backend after the cl = "future" argument:

```R
pblapply(1:2, FUN = my_fcn, {additional my_fcn args}, cl = "future", {additional future args})
```

See this issue for a discussion.
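A concrete version of the template might look like the sketch below. The function `my_fcn` and its `sd` argument are made up for illustration; `future.seed = TRUE` is the argument mentioned above. The snippet assumes the future and future.apply packages are installed and otherwise falls back to `lapply()`.

```R
# Hypothetical function that draws random numbers; `sd` is its extra argument
my_fcn <- function(i, sd = 1) rnorm(1, mean = i, sd = sd)

# Check that every package needed for the future backend is available
have_pkgs <- all(vapply(c("pbapply", "future", "future.apply"),
    requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
    future::plan(future::multisession, workers = 2)
    res <- pbapply::pblapply(1:2, FUN = my_fcn,
        sd = 0.1,             # additional argument for my_fcn
        cl = "future",
        future.seed = TRUE)   # additional argument for the future backend
    future::plan(future::sequential)
} else {
    # Fallback so the sketch runs without the optional packages
    res <- lapply(1:2, my_fcn, sd = 0.1)
}
length(res)
```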

Examples

The following pb* functions are available in the pbapply package:

| base         | pbapply         | works in parallel |
|--------------|-----------------|-------------------|
| apply        | pbapply         | ✅                |
| by           | pbby            | ✅                |
| eapply       | pbeapply        | ✅                |
| lapply       | pblapply        | ✅                |
| .mapply      | pb.mapply       | ❌                |
| mapply       | pbmapply        | ❌                |
| Map          | pbMap           | ❌                |
| replicate    | pbreplicate     | ✅                |
| sapply       | pbsapply        | ✅                |
| tapply       | pbtapply        | ✅                |
| vapply       | pbvapply        | ✅                |
| ❌           | pbwalk          | ✅                |

Command line usage

```R
library(pbapply)
set.seed(1234)
n <- 2000
x <- rnorm(n)
y <- rnorm(n, model.matrix(~x) %*% c(0,1), sd=0.5)
d <- data.frame(y, x)

## model fitting and bootstrap
mod <- lm(y~x, d)
ndat <- model.frame(mod)
B <- 500
bid <- sapply(1:B, function(i) sample(nrow(ndat), nrow(ndat), TRUE))
fun <- function(z) {
    if (missing(z))
        z <- sample(nrow(ndat), nrow(ndat), TRUE)
    coef(lm(mod$call$formula, data=ndat[z,]))
}

## standard '*apply' functions
system.time(res1 <- lapply(1:B, function(i) fun(bid[,i])))
##    user  system elapsed
##   1.096   0.023   1.127
system.time(res2 <- sapply(1:B, function(i) fun(bid[,i])))
##    user  system elapsed
##   1.152   0.017   1.182
system.time(res3 <- apply(bid, 2, fun))
##    user  system elapsed
##   1.134   0.010   1.160
system.time(res4 <- replicate(B, fun()))
##    user  system elapsed
##   1.141   0.022   1.171

## 'pb*apply' functions
## try different settings:
## "none", "txt", "tk", "win", "timer"
op <- pboptions(type="timer") # default
system.time(res1pb <- pblapply(1:B, function(i) fun(bid[,i])))
##    |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% ~00s
##    user  system elapsed
##   1.539   0.046   1.599
pboptions(op)

pboptions(type="txt")
system.time(res2pb <- pbsapply(1:B, function(i) fun(bid[,i])))
##    |++++++++++++++++++++++++++++++++++++++++++++++++++| 100%
##    user  system elapsed
##   1.433   0.045   1.518
pboptions(op)

pboptions(type="txt", style=1, char="=")
system.time(res3pb <- pbapply(bid, 2, fun))
## ==================================================
##    user  system elapsed
##   1.389   0.032   1.464
pboptions(op)

pboptions(type="txt", char=":")
system.time(res4pb <- pbreplicate(B, fun()))
##    |::::::::::::::::::::::::::::::::::::::::::::::::::| 100%
##    user  system elapsed
##   1.427   0.040   1.481
pboptions(op)
```
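The introduction mentions that the progress bar can also be added to functions that do not go through an *apply call, such as a hand-written bootstrap loop. A minimal sketch of that idea uses pbapply's `startpb()`, `setpb()`, and `closepb()`; the data and the number of replicates are made up for illustration, and the `use_pb` guard only keeps the snippet runnable without pbapply.

```R
# Toy bootstrap of the mean; the data and B are made up for illustration
set.seed(1)
x <- rnorm(100)
B <- 50
boot <- numeric(B)

use_pb <- requireNamespace("pbapply", quietly = TRUE)
if (use_pb) pb <- pbapply::startpb(0, B)   # open the bar
for (i in seq_len(B)) {
    boot[i] <- mean(sample(x, replace = TRUE))
    if (use_pb) pbapply::setpb(pb, i)      # advance the bar
}
if (use_pb) pbapply::closepb(pb)           # close the bar

summary(boot)
```

The bar follows the same pboptions() settings as the pb* functions, so it is also suppressed in non-interactive sessions by default.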

Parallel backends

You have a few different options to choose from as a backend. This all comes down to the cl argument in the pb* functions.

  • cl = NULL (default): sequential execution
  • cl is of class cluster: this implies that you used cl = parallel::makeCluster(n) or something similar (n being the number of worker nodes)
  • cl is a positive integer (usually > 1): forking type parallelism is used in this case
  • cl = "future": you are using one of the future plans and parallelism is defined outside of the pb* call.

Note that on Windows the forking type is not available and pb* functions will fall back to sequential evaluation.

Some examples:

```R
library(pbapply)
f <- function(i) Sys.sleep(1)

## sequential
pblapply(1:2, f)

## cluster
cl <- parallel::makeCluster(2)
pblapply(1:2, f, cl = cl)
parallel::stopCluster(cl)

## mirai cluster
library(mirai)
## -- using the mirai package
cl <- make_cluster(2)
pblapply(1:2, f, cl = cl)
stop_cluster(cl)
## -- using parallel (requires R >= 4.5)
cl <- parallel::makeCluster(2, type = "MIRAI")
pblapply(1:2, f, cl = cl)
parallel::stopCluster(cl)

## forking
pblapply(1:2, f, cl = 2)

## future
library(future)

cl <- parallel::makeCluster(2)
plan(cluster, workers = cl)
r2 <- pblapply(1:2, f, cl = "future")
parallel::stopCluster(cl)

plan(multisession, workers = 2)
pblapply(1:2, f, cl = "future")

plan(sequential)
```
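Since forking is not available on Windows, code that should run in parallel on every platform has to choose the `cl` value at run time. One possible way to do this is sketched below; the helper `pick_cl` is hypothetical and not part of pbapply, and the `lapply()` fallback only keeps the snippet runnable without pbapply.

```R
# Hypothetical helper: pick a `cl` value that works on the current platform
pick_cl <- function(n = 2L) {
    if (.Platform$OS.type == "windows") {
        parallel::makeCluster(n)   # snow-type cluster
    } else {
        as.integer(n)              # forking, not available on Windows
    }
}

f <- function(i) i^2
cl <- pick_cl(2L)
res <- if (requireNamespace("pbapply", quietly = TRUE)) {
    pbapply::pblapply(1:4, f, cl = cl)
} else {
    lapply(1:4, f)
}
# A cluster has to be stopped explicitly; an integer needs no cleanup
if (inherits(cl, "cluster")) parallel::stopCluster(cl)
```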

Progress with Shiny

```R
library(shiny)
library(pbapply)

pboptions(
    type = "shiny",
    title = "Shiny progress",
    label = "Almost there ...")

ui <- fluidPage(
    plotOutput("plot")
)

server <- function(input, output, session) {
    output$plot <- renderPlot({
        pbsapply(1:15, function(z) Sys.sleep(0.5))
        plot(cars)
    })
}

shinyApp(ui, server)
```

Owner

  • Name: Peter Solymos
  • Login: psolymos
  • Kind: user
  • Location: Edmonton, Canada

Tech-bio-nerd

GitHub Events

Total
  • Issues event: 1
  • Watch event: 5
  • Push event: 2
Last Year
  • Issues event: 1
  • Watch event: 5
  • Push event: 2

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 1,415
  • Total Committers: 8
  • Avg Commits per committer: 176.875
  • Development Distribution Score (DDS): 0.248
Past Year
  • Commits: 6
  • Committers: 1
  • Avg Commits per committer: 6.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|------|-------|---------|
| psolymos | p****s@e****b | 1,064 |
| Peter Solymos | p****s@g****m | 322 |
| Zygmunt Zawadzki | z****t@g****m | 24 |
| olivroy | 5****y | 1 |
| Phil Chalmers | r****s@g****m | 1 |
| Dmitry Kryuchkov | x****n@g****m | 1 |
| stefan7th | s****h@e****b | 1 |
| Peter Solymos | p****s@e****m | 1 |

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 54
  • Total pull requests: 18
  • Average time to close issues: 3 months
  • Average time to close pull requests: 11 days
  • Total issue authors: 32
  • Total pull request authors: 6
  • Average comments per issue: 2.54
  • Average comments per pull request: 1.33
  • Merged pull requests: 17
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 4
  • Pull requests: 0
  • Average time to close issues: 8 days
  • Average time to close pull requests: N/A
  • Issue authors: 4
  • Pull request authors: 0
  • Average comments per issue: 0.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • psolymos (19)
  • HenrikBengtsson (3)
  • dashaub (2)
  • kendonB (2)
  • svalvaro (1)
  • Kodiologist (1)
  • ramattheis (1)
  • xelibrion (1)
  • jolespin (1)
  • dblodgett-usgs (1)
  • japhir (1)
  • TarasDerevianko (1)
  • KarinSchork (1)
  • jepusto (1)
  • laleoarrow (1)
Pull Request Authors
  • psolymos (11)
  • zzawadz (3)
  • olivroy (2)
  • xelibrion (1)
  • philchalmers (1)
  • dblodgett-usgs (1)
Top Labels
Issue Labels
enhancement (14) new-feature (9) bug (4) question (3)
Pull Request Labels
enhancement (1)

Packages

  • Total packages: 2
  • Total downloads:
    • cran 113,212 last-month
  • Total docker downloads: 884,998
  • Total dependent packages: 293
    (may contain duplicates)
  • Total dependent repositories: 727
    (may contain duplicates)
  • Total versions: 33
  • Total maintainers: 1
cran.r-project.org: pbapply

Adding Progress Bar to '*apply' Functions

  • Versions: 26
  • Dependent Packages: 273
  • Dependent Repositories: 723
  • Downloads: 113,212 Last month
  • Docker Downloads: 884,998
Rankings
Dependent packages count: 0.4%
Dependent repos count: 0.5%
Downloads: 1.4%
Stargazers count: 2.7%
Average: 5.2%
Forks count: 8.8%
Docker downloads count: 17.3%
Maintainers (1)
Last synced: 6 months ago
conda-forge.org: r-pbapply
  • Versions: 7
  • Dependent Packages: 20
  • Dependent Repositories: 4
Rankings
Dependent packages count: 3.2%
Dependent repos count: 16.1%
Average: 25.5%
Stargazers count: 30.4%
Forks count: 52.2%
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.2.0 depends
  • parallel * imports
  • shiny * suggests
.github/workflows/check.yml actions
  • JamesIves/github-pages-deploy-action v4 composite
  • actions/checkout v2 composite
  • r-lib/actions/setup-r v1 composite