batchtools

batchtools: Tools for R to work on batch systems - Published in JOSS (2017)

https://github.com/mlr-org/batchtools

Keywords

batchexperiments batchjobs cran docker-swarm high-performance-computing hpc hpc-clusters lsf openlava parallel-computing r reproducibility sge slurm torque

Keywords from Contributors

mlr3

Repository

Tools for computation on batch systems

Statistics
  • Stars: 184
  • Watchers: 11
  • Forks: 52
  • Open Issues: 90
  • Releases: 14
Created over 10 years ago · Last pushed 6 months ago

README.Rmd

---
output: github_document
---

# batchtools

Package website: [release](https://batchtools.mlr-org.com/) | [dev](https://batchtools.mlr-org.com/dev/)



[![JOSS Publication](https://joss.theoj.org/papers/10.21105/joss.00135/status.svg)](https://doi.org/10.21105/joss.00135)
[![r-cmd-check](https://github.com/mlr-org/batchtools/actions/workflows/r-cmd-check.yml/badge.svg)](https://github.com/mlr-org/batchtools/actions/workflows/r-cmd-check.yml)
[![CRAN Status](https://www.r-pkg.org/badges/version-ago/batchtools)](https://cran.r-project.org/package=batchtools)
[![Mattermost](https://img.shields.io/badge/chat-mattermost-orange.svg)](https://lmmisld-lmu-stats-slds.srv.mwn.de/mlr_invite/)


As the successor to the packages [BatchJobs](https://github.com/tudo-r/BatchJobs) and [BatchExperiments](https://github.com/tudo-r/Batchexperiments), batchtools provides a parallel implementation of `Map` for high-performance computing systems managed by schedulers such as Slurm, Sun Grid Engine, OpenLava, TORQUE/OpenPBS, Load Sharing Facility (LSF), or Docker Swarm (see the setup section in the [vignette](https://batchtools.mlr-org.com/articles/batchtools.html)).
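
As a rough illustration of the Map-style workflow, the sketch below applies a toy function to a few inputs using a temporary registry. It assumes `batchtools` is installed and that no scheduler is configured, so the jobs run sequentially in the current R session. The function `square` and its inputs are made up for this example.

```r
library(batchtools)

# Temporary registry; conf.file = NA skips any site configuration,
# so the default (sequential, in-session) execution is used.
reg = makeRegistry(file.dir = NA, conf.file = NA, seed = 1)

# Hypothetical toy function for illustration only.
square = function(x) x^2

# Define one job per element of x, then submit and wait for completion.
ids = batchMap(fun = square, x = 1:5, reg = reg)
submitJobs(ids, reg = reg)
waitForJobs(reg = reg)

# Collect the results as a list, one element per job.
res = reduceResultsList(reg = reg)
squares = unlist(res)
print(squares)
```

The same sequence of calls is intended to carry over to a cluster once a scheduler has been configured for the registry.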

Main features:

* Convenience: All relevant batch system operations (submitting, listing, killing) are either handled internally or abstracted via simple R functions
* Portability: With a well-defined interface, the source is independent of the underlying batch system - prototype locally, deploy on any high-performance cluster
* Reproducibility: Every computational part has an associated seed stored in a database, which ensures reproducibility even when the underlying batch system changes
* Abstraction: The code layers for algorithms, experiment definitions and execution are cleanly separated, which makes it possible to write readable and maintainable code for managing large-scale computer experiments
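
The separation of problems, algorithms and experiment definitions can be sketched as follows. This is a minimal example under stated assumptions: the problem `gauss`, the algorithms `mean` and `median`, and the design values are hypothetical, and the jobs run sequentially in the current session because no scheduler is configured.

```r
library(batchtools)

# Temporary experiment registry with a fixed seed for reproducibility.
reg = makeExperimentRegistry(file.dir = NA, conf.file = NA, seed = 1)

# Problem layer: generates a data instance from the parameter n.
addProblem(name = "gauss",
  fun = function(data, job, n, ...) rnorm(n), reg = reg)

# Algorithm layer: each algorithm receives the problem instance.
addAlgorithm(name = "mean",
  fun = function(data, job, instance, ...) list(estimate = mean(instance)), reg = reg)
addAlgorithm(name = "median",
  fun = function(data, job, instance, ...) list(estimate = median(instance)), reg = reg)

# Experiment definition: 2 problem settings x 2 algorithms x 2 replications.
prob.designs = list(gauss = data.frame(n = c(10, 100)))
addExperiments(prob.designs = prob.designs, repls = 2, reg = reg)

# Execution layer: submit everything, wait, and collect a results table.
submitJobs(reg = reg)
waitForJobs(reg = reg)
results = unwrap(reduceResultsDataTable(reg = reg))
print(results)
```

Because each layer is registered separately, adding another algorithm or another problem setting only requires a further `addAlgorithm()` or `addExperiments()` call.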


## Installation

Install the stable version from CRAN:
```{R, eval = FALSE}
install.packages("batchtools")
```
For the development version, use [devtools](https://cran.r-project.org/package=devtools):

```{R, eval = FALSE}
devtools::install_github("mlr-org/batchtools")
```

Next, you need to set up `batchtools` for your HPC system (otherwise, jobs will run sequentially).
See the [vignette](https://batchtools.mlr-org.com/articles/batchtools.html) for instructions.
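
A configuration file might look like the following sketch. It assumes a Slurm cluster, a file named `batchtools.conf.R` in the working directory, and that the `slurm-simple` template shipped with the package suits the site. The resource values are placeholders, and which resource names are honored depends on the template in use.

```r
# batchtools.conf.R (read when a registry is created)
# Assumption: the cluster runs Slurm and the bundled "slurm-simple" template fits the site.
cluster.functions = makeClusterFunctionsSlurm(template = "slurm-simple")

# Placeholder defaults; adjust to the site's limits and the template's variables.
default.resources = list(walltime = 3600, memory = 1024, ncpus = 1)
```

Other schedulers have analogous constructor functions; the vignette remains the authoritative source for site-specific setup.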

## Why batchtools?
The development of [BatchJobs](https://github.com/tudo-r/BatchJobs/) and [BatchExperiments](https://github.com/tudo-r/Batchexperiments) has been discontinued for the following reasons:

* Maintainability: The packages [BatchJobs](https://github.com/tudo-r/BatchJobs/) and [BatchExperiments](https://github.com/tudo-r/Batchexperiments) are tightly coupled, which makes maintenance difficult. Changes have to be synchronized and tested against the current CRAN versions for compatibility. Furthermore, BatchExperiments violates CRAN policies by calling internal functions of BatchJobs.
* Database issues: Although we invested weeks to mitigate issues with locks of the SQLite database or file system (staged queries, file system timeouts, ...), `BatchJobs` remained unreliable on some high-latency systems under certain conditions. This made `BatchJobs` unusable for many users.

[BatchJobs](https://github.com/tudo-r/BatchJobs/) and [BatchExperiments](https://github.com/tudo-r/Batchexperiments) will remain on CRAN, but new features are unlikely to be ported back.
The [vignette](https://batchtools.mlr-org.com/articles/batchtools.html) contains a section comparing the packages.


## Resources
* [Function reference](https://batchtools.mlr-org.com/reference/)
* [Vignette](https://batchtools.mlr-org.com/articles/batchtools.html)
* [JOSS Paper](https://doi.org/10.21105/joss.00135): Short paper on batchtools. Please cite this if you use batchtools.
* [Paper on BatchJobs/BatchExperiments](https://www.jstatsoft.org/v64/i11): The described concept still holds for batchtools and most examples work analogously (see the [vignette](https://batchtools.mlr-org.com/articles/batchtools.html) for differences between the packages).

## Citation
Please cite the [JOSS paper](https://doi.org/10.21105/joss.00135) using the following BibTeX entry:
```
@article{,
  doi = {10.21105/joss.00135},
  url = {https://doi.org/10.21105/joss.00135},
  year  = {2017},
  month = {feb},
  publisher = {The Open Journal},
  volume = {2},
  number = {10},
  author = {Michel Lang and Bernd Bischl and Dirk Surmann},
  title = {batchtools: Tools for R to work on batch systems},
  journal = {The Journal of Open Source Software}
}
```

## Related Software
* The [High Performance Computing Task View](https://cran.r-project.org/view=HighPerformanceComputing) lists the most relevant packages for scientific computing with R.
* [clustermq](https://cran.r-project.org/package=clustermq) takes a similar approach and also supports multiple schedulers. It uses the ZeroMQ network protocol for communication and shines if you have millions of fast jobs.
* [batch](https://cran.r-project.org/package=batch) assists in splitting and submitting jobs to LSF and MOSIX clusters.
* [flowr](https://cran.r-project.org/package=flowr) supports LSF, Slurm, TORQUE and Moab and provides a scatter-gather approach to define computational jobs.
* [future.batchtools](https://cran.r-project.org/package=future.batchtools) implements `batchtools` as a backend for [future](https://cran.r-project.org/package=future).
* [doFuture](https://cran.r-project.org/package=doFuture) together with [future.batchtools](https://cran.r-project.org/package=future.batchtools) connects `batchtools` to [foreach](https://cran.r-project.org/package=foreach).
* [drake](https://cran.r-project.org/package=drake) uses graphs to define computational jobs. `batchtools` is used as a backend via [future.batchtools](https://cran.r-project.org/package=future.batchtools).

## Contributing to batchtools
This R package is licensed under the [LGPL-3](https://www.gnu.org/licenses/lgpl-3.0.en.html).
If you encounter problems using this software (lack of documentation, misleading or wrong documentation, unexpected behaviour, bugs, ...) or just want to suggest features, please open an issue in the [issue tracker](https://github.com/mlr-org/batchtools/issues).
Pull requests are welcome and will be included at the discretion of the author.
If you have customized a template file for your (larger) computing site, please share it: fork the repository, place your template in `inst/templates` and send a pull request.

Owner

  • Name: mlr-org
  • Login: mlr-org
  • Kind: organization
  • Location: Munich, Germany

GitHub Events

Total
  • Create event: 5
  • Release event: 1
  • Issues event: 2
  • Watch event: 3
  • Delete event: 4
  • Issue comment event: 8
  • Push event: 30
  • Pull request event: 18
  • Fork event: 1
Last Year
  • Create event: 5
  • Release event: 1
  • Issues event: 2
  • Watch event: 3
  • Delete event: 4
  • Issue comment event: 8
  • Push event: 30
  • Pull request event: 18
  • Fork event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 928
  • Total Committers: 18
  • Avg Commits per committer: 51.556
  • Development Distribution Score (DDS): 0.073
Past Year
  • Commits: 11
  • Committers: 2
  • Avg Commits per committer: 5.5
  • Development Distribution Score (DDS): 0.091
Top Committers
Name Email Commits
Michel Lang m****g@g****m 860
Dirk Surmann s****n@s****e 33
Sebastian Fischer s****r@g****m 10
Dotterbart d****k@t****e 7
Bernd Bischl b****l@g****t 3
Stuart Russell s****t@d****g 2
Jakob Richter c****e@j****e 2
Ista Zahn i****n@h****u 1
Peter Haverty p****y@g****m 1
Timothée Flutre t****e@i****r 1
Arfon Smith a****n 1
Chris Hammill c****l@g****m 1
Guillermo Luque g****s@g****m 1
Marc Becker 3****c 1
Marvin N. Wright g****b@w****e 1
Michael Chirico m****4@g****m 1
Nathan Sheffield n****f 1
ja-thomas j****s 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 85
  • Total pull requests: 23
  • Average time to close issues: about 2 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 57
  • Total pull request authors: 13
  • Average comments per issue: 2.65
  • Average comments per pull request: 0.83
  • Merged pull requests: 9
  • Bot issues: 0
  • Bot pull requests: 3
Past Year
  • Issues: 4
  • Pull requests: 8
  • Average time to close issues: N/A
  • Average time to close pull requests: about 3 hours
  • Issue authors: 2
  • Pull request authors: 3
  • Average comments per issue: 0.25
  • Average comments per pull request: 0.38
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 3
Top Authors
Issue Authors
  • mb706 (9)
  • HenrikBengtsson (6)
  • nick-youngblut (4)
  • kokyriakidis (3)
  • mllg (3)
  • mtmorgan (3)
  • stuvet (2)
  • ryananeff (2)
  • tdhock (2)
  • multimeric (2)
  • chim3y (2)
  • rrichmond (2)
  • lpiep (2)
  • sumny (1)
  • myoung3 (1)
Pull Request Authors
  • sebffischer (8)
  • dependabot[bot] (6)
  • stuvet (5)
  • mllg (3)
  • bwcompton (2)
  • jakob-r (2)
  • HenrikBengtsson (1)
  • aaronpeikert (1)
  • tdhock (1)
  • ja-thomas (1)
  • damirpolat (1)
  • MichaelChirico (1)
  • reikoch (1)
  • izahn (1)
  • olivroy (1)
Top Labels
Issue Labels
enhancement (6) help wanted (2) question (1)
Pull Request Labels
dependencies (6) github_actions (6)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 2,864 last-month
  • Total docker downloads: 95,759
  • Total dependent packages: 12
  • Total dependent repositories: 38
  • Total versions: 18
  • Total maintainers: 1
cran.r-project.org: batchtools

Tools for Computation on Batch Systems

  • Versions: 18
  • Dependent Packages: 12
  • Dependent Repositories: 38
  • Downloads: 2,864 Last month
  • Docker Downloads: 95,759
Rankings
Forks count: 1.4%
Stargazers count: 2.5%
Dependent repos count: 4.2%
Dependent packages count: 4.6%
Downloads: 7.5%
Average: 7.7%
Docker downloads count: 25.7%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.0.0 depends
  • R6 * imports
  • backports >= 1.1.2 imports
  • base64url >= 1.1 imports
  • brew * imports
  • checkmate >= 1.8.5 imports
  • data.table >= 1.11.2 imports
  • digest >= 0.6.9 imports
  • fs >= 1.2.0 imports
  • parallel * imports
  • progress >= 1.1.1 imports
  • rappdirs * imports
  • stats * imports
  • stringi * imports
  • utils * imports
  • withr >= 2.0.0 imports
  • debugme * suggests
  • doMPI * suggests
  • doParallel * suggests
  • e1071 * suggests
  • foreach * suggests
  • future * suggests
  • future.batchtools * suggests
  • knitr * suggests
  • parallelMap * suggests
  • ranger * suggests
  • rmarkdown * suggests
  • rpart * suggests
  • snow * suggests
  • testthat * suggests
  • tibble * suggests
.github/workflows/rcmdcheck.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/upload-artifact main composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v1 composite