Pakman

Pakman: a modular, efficient and portable tool for approximate Bayesian inference - Published in JOSS (2020)

https://github.com/thomaspak/pakman

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    1 of 2 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Engineering Computer Science - 60% confidence

Repository

A modular, efficient and portable tool for running parallel approximate Bayesian computation algorithms.

Basic Info
  • Host: GitHub
  • Owner: ThomasPak
  • License: bsd-3-clause
  • Language: C++
  • Default Branch: master
  • Size: 980 KB
Statistics
  • Stars: 18
  • Watchers: 5
  • Forks: 3
  • Open Issues: 0
  • Releases: 2
Created about 7 years ago · Last pushed almost 5 years ago
Metadata Files
Readme · Contributing · License · Code of conduct · Citation · Codemeta

README.md

Pakman


A modular, efficient and portable tool for running parallel approximate Bayesian computation algorithms.

Introduction

Pakman is a software tool for parallel approximate Bayesian computation (ABC) algorithms. Its modular framework is based on user executables, which means that problem-specific tasks, like model simulations, are performed by black box executables supplied to Pakman by the user.

Pakman parallelises the execution of simulations using MPI, a portable standard for distributed computing, and was designed to be lightweight so that a minimal amount of overhead goes into parallelisation.

The problems that will benefit the most from Pakman are those where model simulations take a relatively long time, on the order of seconds or more.
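To make the idea concrete, here is a minimal serial sketch of ABC rejection, one of the simplest ABC algorithms. It is an illustration only and does not use Pakman or its interface: the `simulate` function is a hypothetical stand-in for a black-box simulator (a biased coin-flip model chosen for this example), and the prior, tolerance and sample sizes are assumptions. Each iteration of the loop is independent of the others, which is what makes this kind of algorithm amenable to parallelisation.

```python
import random


def simulate(theta, n_flips, rng):
    """Hypothetical black-box model: number of heads in n_flips flips
    of a coin that lands heads with probability theta."""
    return sum(rng.random() < theta for _ in range(n_flips))


def abc_rejection(observed, n_flips, epsilon, n_accept, rng):
    """Collect n_accept parameter draws whose simulated data lie
    within epsilon of the observed data."""
    accepted = []
    while len(accepted) < n_accept:
        theta = rng.random()  # draw from a Uniform(0, 1) prior (assumed)
        simulated = simulate(theta, n_flips, rng)
        if abs(simulated - observed) <= epsilon:
            accepted.append(theta)
    return accepted


rng = random.Random(0)  # fixed seed for reproducibility
samples = abc_rejection(observed=14, n_flips=20, epsilon=1,
                        n_accept=200, rng=rng)
print("approximate posterior mean:", sum(samples) / len(samples))
```

In this toy model a simulation is nearly instantaneous, so parallelisation would gain little. The benefit appears when a single call to the simulator takes seconds or more.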

Requirements

Pakman has been tested with OpenMPI and MPICH. Python is not necessary to build Pakman, but it is used in some Pakman examples and to create figures.

Building

To build Pakman, run:

$ mkdir build
$ cd build
$ cmake ..
$ make

This will create a pakman binary under the src subfolder of the build directory.

Testing

To test Pakman, run (in the build folder):

$ ctest

This will run a series of tests to verify that the Pakman build is working correctly.

To test how well Pakman scales with the number of parallel processes employed, run (in the build folder):

$ scaling/run-scaling.sh

This script will benchmark Pakman with a computationally intensive simulator for different numbers of parallel instances of the simulator. The results are saved in the comma-separated file scaling.csv. In addition, if Python was detected by CMake, the speedup and efficiency with respect to the number of processes will be plotted in speedup.png and efficiency.png, respectively.

To reduce computation time, it is recommended to pass the flag -DCMAKE_BUILD_TYPE=Release to the cmake command when configuring the build, before running the scaling test.
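The speedup and efficiency mentioned above follow the usual definitions: speedup is the single-process run time divided by the run time on p processes, and efficiency is the speedup divided by p. The sketch below computes both from a small set of timings. The numbers are made up for illustration, and because the column layout of scaling.csv is not described here, the timings are given as an in-memory dictionary rather than read from that file.

```python
def speedup_and_efficiency(timings):
    """Map each process count p to (speedup, efficiency).

    timings: dict of {number of processes: wall time in seconds};
    it must contain an entry for 1 process, used as the baseline.
    """
    baseline = timings[1]
    result = {}
    for p in sorted(timings):
        speedup = baseline / timings[p]
        result[p] = (speedup, speedup / p)
    return result


# Hypothetical wall times in seconds, not measured with Pakman
timings = {1: 80.0, 2: 42.0, 4: 23.0, 8: 14.0}
table = speedup_and_efficiency(timings)

for p, (speedup, efficiency) in table.items():
    print(f"{p:>2} processes: speedup {speedup:5.2f}, "
          f"efficiency {efficiency:4.2f}")
```

An efficiency close to 1 means that adding processes reduces the run time almost proportionally. Values well below 1 point to overhead or idle processes.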

Documentation

Examples of how to use Pakman can be found in the folder examples inside the build folder. See the wiki for detailed documentation.

In addition, the command build/src/pakman --help provides a quick reference on how to use Pakman.

Developers: code documentation can be found here.

Testing on multiple nodes

By default, running ctest will only test Pakman on the local node. However, since Pakman uses MPI for parallelisation, it can also run on multiple nodes. To run the above tests and examples on the nodes that are specific to your setup, you need to pass additional information to CMake.

Firstly, you need to define the CMake variable MPIEXEC_HOSTS_FLAGS to contain the command-line flags you would pass to mpiexec to specify the hosts. Secondly, define MPIEXEC_MAX_NUMPROCS to specify the total number of MPI processes to run.

For example, if you would normally launch 8 MPI processes on node0 and node1 in the following manner:

$ mpiexec --hosts node0,node1 -n 8 ...

Then you need to run CMake as:

... $ cmake .. -DMPIEXEC_HOSTS_FLAGS="--hosts node0,node1" -DMPIEXEC_MAX_NUMPROCS=8 ...

When building Pakman with these flags, CMake will automatically insert the appropriate flags in the mpiexec commands for running tests (including the scaling benchmark) and examples. Thus, running the above commands as before will now test Pakman on multiple nodes.
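As a rough illustration of what inserting the flags means, the sketch below assembles an mpiexec command line from the two values. How Pakman's CMake scripts actually compose their commands is not shown here, so the function name `build_mpiexec_command` and the argument order are assumptions, modelled on the mpiexec example above.

```python
import shlex


def build_mpiexec_command(hosts_flags, max_numprocs, program_args):
    """Return an argument list that mirrors
    `mpiexec <hosts flags> -n <numprocs> <program ...>`."""
    return [
        "mpiexec",
        *shlex.split(hosts_flags),  # e.g. "--hosts node0,node1"
        "-n", str(max_numprocs),
        *program_args,
    ]


cmd = build_mpiexec_command(
    hosts_flags="--hosts node0,node1",  # value of MPIEXEC_HOSTS_FLAGS
    max_numprocs=8,                     # value of MPIEXEC_MAX_NUMPROCS
    program_args=["src/pakman", "--help"],
)
print(shlex.join(cmd))
```

Keeping the host flags as one string and splitting it with `shlex.split` mirrors the way the value is passed to CMake as a single quoted argument.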

Contributing

We welcome your contributions! Please see our contributing guidelines.

Owner

  • Name: Thomas Pak
  • Login: ThomasPak
  • Kind: user

JOSS Publication

Pakman: a modular, efficient and portable tool for approximate Bayesian inference
  • Published: March 07, 2020
  • Volume 5, Issue 47, Page 1716
  • Authors: Thomas F. Pak (Mathematical Institute, University of Oxford); Ruth E. Baker (Mathematical Institute, University of Oxford); Joe M. Pitt-Francis (Department of Computer Science, University of Oxford)
  • Editor: Jed Brown
  • Tags: MPI, approximate Bayesian computation, Bayesian inference, parallel computing, distributed computing

Citation (CITATION.cff)

cff-version: 1.1.0
message: "If you use this software, please cite the article below."
authors:
  - family-names: Pak
    given-names: Thomas F.
    orcid: https://orcid.org/0000-0002-7198-7688
  - family-names: Baker
    given-names: Ruth E.
    orcid: https://orcid.org/0000-0002-6304-9333
  - family-names: Pitt-Francis
    given-names: Joe M.
    orcid: https://orcid.org/0000-0002-5094-5403
title: Pakman
version: 1.1.0
date-released: 2020-03-05
doi: 10.5281/zenodo.3697312
references:
    - type: article
      authors:
          - family-names: Pak
            given-names: Thomas F.
            orcid: https://orcid.org/0000-0002-7198-7688
          - family-names: Baker
            given-names: Ruth E.
            orcid: https://orcid.org/0000-0002-6304-9333
          - family-names: Pitt-Francis
            given-names: Joe M.
            orcid: https://orcid.org/0000-0002-5094-5403
      title: 'Pakman: a modular, efficient and portable tool for approximate Bayesian inference'
      keywords:
        - C++
        - MPI
        - approximate Bayesian computation
        - Bayesian inference
        - parallel computing
        - distributed computing
      date-released: 2020-03-07
      doi: 10.21105/joss.01716
      journal: The Journal of Open Source Software
      volume: 5
      issue: 47
      start: 1716
      url: https://joss.theoj.org/papers/10.21105/joss.01716
      year: 2020

CodeMeta (codemeta.json)

{
  "@context": "https://raw.githubusercontent.com/codemeta/codemeta/master/codemeta.jsonld",
  "@type": "Code",
  "author": [
    {
      "@id": "https://orcid.org/0000-0002-7198-7688",
      "@type": "Person",
      "email": "thomas.pak@maths.ox.ac.uk",
      "name": "Thomas Pak",
      "affiliation": "Mathematical Institute, University of Oxford"
    }
  ],
  "identifier": "https://doi.org/10.5281/zenodo.3357293",
  "codeRepository": "https://github.com/ThomasPak/pakman",
  "datePublished": "2019-08-01",
  "dateModified": "2019-08-01",
  "dateCreated": "2019-08-01",
  "description": "A modular, efficient and portable tool for running parallel approximate Bayesian computation algorithms.",
  "keywords": "C++, MPI, approximate Bayesian computation, Bayesian inference, parallel computing, distributed computing",
  "license": "BSD 3-Clause License",
  "title": "Pakman",
  "version": "v1.0.0"
}

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers


All Time
  • Total Commits: 417
  • Total Committers: 2
  • Avg Commits per committer: 208.5
  • Development Distribution Score (DDS): 0.002
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name | Email | Commits
Thomas Pak | t****k@l****k | 416
Jed Brown | j****d@j****g | 1

Issues and Pull Requests


All Time
  • Total issues: 1
  • Total pull requests: 1
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 19 days
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 3.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • zfuller5280 (1)
Pull Request Authors
  • jedbrown (1)