Executorlib – Up-scaling Python workflows for hierarchical heterogenous high-performance computing
Published in JOSS (2025)
Science Score: 100.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in JOSS metadata
- ✓ Academic publication links: Links to joss.theoj.org
- ✓ Committers with academic emails: 1 of 7 committers (14.3%) from academic institutions
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Keywords
Keywords from Contributors
Repository
Up-scale python functions for high performance computing (HPC)
Basic Info
- Host: GitHub
- Owner: pyiron
- License: bsd-3-clause
- Language: Python
- Default Branch: main
- Homepage: https://executorlib.readthedocs.io
- Size: 3.05 MB
Statistics
- Stars: 45
- Watchers: 6
- Forks: 3
- Open Issues: 30
- Releases: 78
Topics
Metadata Files
README.md
executorlib
Up-scale python functions for high performance computing (HPC) with executorlib.
Key Features
- Up-scale your Python functions beyond a single computer - executorlib extends the Executor interface from the Python standard library and combines it with job schedulers for high performance computing (HPC), including the Simple Linux Utility for Resource Management (SLURM) and flux. With this combination executorlib allows users to distribute their Python functions over multiple compute nodes.
- Parallelize your Python program one function at a time - executorlib allows users to assign dedicated computing resources like CPU cores, threads or GPUs to one Python function call at a time, so you can accelerate your Python code function by function.
- Permanent caching of intermediate results to accelerate rapid prototyping - To accelerate the development of machine learning pipelines and simulation workflows, executorlib provides optional caching of intermediate results for iterative development in interactive environments like Jupyter notebooks (see the caching sketch below).
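A minimal caching sketch is shown below. It assumes the executors accept a `cache_directory` keyword, as described in the executorlib documentation; verify the current parameter name before relying on it.
```python
from executorlib import SingleNodeExecutor

# Sketch: results are written to the cache directory, so resubmitting the
# same function with the same arguments in a later session can reuse the
# stored result instead of recomputing it.
# The cache_directory keyword is an assumption taken from the documentation.
with SingleNodeExecutor(cache_directory="./cache") as exe:
    future = exe.submit(sum, [1, 2, 3])
    print(future.result())
```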
Examples
The Python standard library provides the Executor interface with the ProcessPoolExecutor and the ThreadPoolExecutor for parallel execution of Python functions on a single computer. executorlib extends this functionality to distribute Python functions over multiple computers within a high performance computing (HPC) cluster. This can be achieved either by submitting each function as an individual job to the HPC job scheduler with an HPC Cluster Executor, or by requesting a job from the HPC cluster and then distributing the Python functions within this job with an HPC Job Executor. Finally, to accelerate the development process executorlib also provides a Single Node Executor to use the executorlib functionality on a laptop, workstation or single compute node for testing. Starting with the Single Node Executor:
```python
from executorlib import SingleNodeExecutor

with SingleNodeExecutor() as exe:
    future_lst = [exe.submit(sum, [i, i]) for i in range(1, 5)]
    print([f.result() for f in future_lst])
```
In the same way executorlib can also execute Python functions which use additional computing resources, like multiple
CPU cores, CPU threads or GPUs. For example, if the Python function internally uses the Message Passing Interface (MPI)
via the [mpi4py](https://mpi4py.readthedocs.io) Python library:
```python
from executorlib import SingleNodeExecutor

def calc(i):
    from mpi4py import MPI

    size = MPI.COMM_WORLD.Get_size()
    rank = MPI.COMM_WORLD.Get_rank()
    return i, size, rank

with SingleNodeExecutor() as exe:
    fs = exe.submit(calc, 3, resource_dict={"cores": 2})
    print(fs.result())
```
The additional `resource_dict` parameter defines the computing resources allocated to the execution of the submitted
Python function. In addition to the compute cores `cores`, the resource dictionary can also define the threads per core
as `threads_per_core`, the GPUs per core as `gpus_per_core`, the working directory with `cwd`, the option to use the
OpenMPI oversubscribe feature with `openmpi_oversubscribe` and finally, for the [Simple Linux Utility for Resource
Management (SLURM)](https://slurm.schedmd.com) queuing system, the option to provide additional command line arguments
with the `slurm_cmd_args` parameter - resource dictionary.
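To illustrate these options in one place, the following sketch combines several of the resource-dictionary keys in a single submission. Which keys are honoured can depend on the executor backend (for instance, `slurm_cmd_args` only applies to SLURM submissions and is omitted here), so treat the combination as an assumption to check against the documentation.
```python
from executorlib import SingleNodeExecutor

def calc(i):
    from mpi4py import MPI

    size = MPI.COMM_WORLD.Get_size()
    rank = MPI.COMM_WORLD.Get_rank()
    return i, size, rank

# Resource dictionary combining several of the keys described above.
resource_dict = {
    "cores": 2,                      # MPI ranks / CPU cores for this call
    "threads_per_core": 1,           # threads per core
    "gpus_per_core": 0,              # GPUs per core
    "cwd": ".",                      # working directory for the function call
    "openmpi_oversubscribe": False,  # OpenMPI oversubscribe feature
}

with SingleNodeExecutor() as exe:
    fs = exe.submit(calc, 3, resource_dict=resource_dict)
    print(fs.result())
```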
This flexibility to assign computing resources on a per-function-call basis simplifies the up-scaling of Python programs.
Only the parts of the Python program which benefit from parallel execution are implemented as MPI-parallel Python
functions, while the rest of the program remains serial.
The same function can be submitted to the SLURM job scheduler by replacing the
SingleNodeExecutor with the SlurmClusterExecutor. The rest of the example remains the same, which highlights how
executorlib accelerates the rapid prototyping and up-scaling of HPC Python programs.
```python
from executorlib import SlurmClusterExecutor
def calc(i):
    from mpi4py import MPI

    size = MPI.COMM_WORLD.Get_size()
    rank = MPI.COMM_WORLD.Get_rank()
    return i, size, rank

with SlurmClusterExecutor() as exe:
    fs = exe.submit(calc, 3, resource_dict={"cores": 2})
    print(fs.result())
```
In this case the [Python simple queuing system adapter (pysqa)](https://pysqa.readthedocs.io) is used to submit the
`calc()` function to the SLURM job scheduler and request an allocation with two CPU cores
for the execution of the function - HPC Cluster Executor. In the background the sbatch
command is used to request the allocation to execute the Python function.
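Building on the `slurm_cmd_args` option described earlier, extra sbatch command-line arguments could be forwarded for a single submission, as in the sketch below. The partition name is a placeholder and the list format of `slurm_cmd_args` is an assumption; check the executorlib documentation for the exact form.
```python
from executorlib import SlurmClusterExecutor

def calc(i):
    from mpi4py import MPI

    return i, MPI.COMM_WORLD.Get_size(), MPI.COMM_WORLD.Get_rank()

# Sketch: forward extra sbatch command-line arguments for this one submission
# via the resource dictionary. The partition name "compute" is a placeholder
# for a real queue on your cluster.
with SlurmClusterExecutor() as exe:
    fs = exe.submit(
        calc,
        3,
        resource_dict={"cores": 2, "slurm_cmd_args": ["--partition=compute"]},
    )
    print(fs.result())
```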
Within a given SLURM job executorlib can also be used to assign a subset of the available computing resources to execute a given Python function. In terms of the SLURM commands, this functionality internally uses the srun command to receive a subset of the resources of a given queuing system allocation.
```python
from executorlib import SlurmJobExecutor

def calc(i):
    from mpi4py import MPI

    size = MPI.COMM_WORLD.Get_size()
    rank = MPI.COMM_WORLD.Get_rank()
    return i, size, rank

with SlurmJobExecutor() as exe:
    fs = exe.submit(calc, 3, resource_dict={"cores": 2})
    print(fs.result())
```
In addition to the support for SLURM, executorlib also provides support for the hierarchical flux job scheduler. The flux job scheduler is developed at Lawrence Livermore National Laboratory to address the needs of the upcoming generation of exascale computers. Still, even on traditional HPC clusters the hierarchical approach of flux is beneficial for distributing hundreds of tasks within a given allocation. Even when SLURM is used as the primary job scheduler of your HPC, it is recommended to use SLURM with flux as a hierarchical job scheduler within the allocations.
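Assuming executorlib exposes a flux-backed counterpart to SlurmJobExecutor (a `FluxJobExecutor` class, by analogy with the examples above), distributing functions inside a flux allocation would follow the same pattern; verify the class name against the documentation.
```python
from executorlib import FluxJobExecutor

def calc(i):
    from mpi4py import MPI

    return i, MPI.COMM_WORLD.Get_size(), MPI.COMM_WORLD.Get_rank()

# Same submission pattern as the SLURM examples above, but tasks are
# distributed through a flux instance inside the current allocation.
# The FluxJobExecutor class name is assumed by analogy with SlurmJobExecutor.
with FluxJobExecutor() as exe:
    fs = exe.submit(calc, 3, resource_dict={"cores": 2})
    print(fs.result())
```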
Documentation
Owner
- Name: pyiron
- Login: pyiron
- Kind: organization
- Website: http://pyiron.org
- Twitter: pyiron
- Repositories: 34
- Profile: https://github.com/pyiron
pyiron - an integrated development environment (IDE) for materials science.
JOSS Publication
Executorlib – Up-scaling Python workflows for hierarchical heterogenous high-performance computing
Authors
Tags
High Performance Computing, Task Scheduling
Citation (CITATION.cff)
cff-version: "1.2.0"
authors:
- family-names: Janssen
given-names: Jan
orcid: "https://orcid.org/0000-0001-9948-7119"
- family-names: Taylor
given-names: Michael Gilbert
orcid: "https://orcid.org/0000-0003-4327-2746"
- family-names: Yang
given-names: Ping
orcid: "https://orcid.org/0000-0003-4726-2860"
- family-names: Neugebauer
given-names: Joerg
orcid: "https://orcid.org/0000-0002-7903-2472"
- family-names: Perez
given-names: Danny
orcid: "https://orcid.org/0000-0003-3028-5249"
doi: 10.5281/zenodo.15121422
message: If you use this software, please cite our article in the
Journal of Open Source Software.
preferred-citation:
authors:
- family-names: Janssen
given-names: Jan
orcid: "https://orcid.org/0000-0001-9948-7119"
- family-names: Taylor
given-names: Michael Gilbert
orcid: "https://orcid.org/0000-0003-4327-2746"
- family-names: Yang
given-names: Ping
orcid: "https://orcid.org/0000-0003-4726-2860"
- family-names: Neugebauer
given-names: Joerg
orcid: "https://orcid.org/0000-0002-7903-2472"
- family-names: Perez
given-names: Danny
orcid: "https://orcid.org/0000-0003-3028-5249"
date-published: 2025-04-01
doi: 10.21105/joss.07782
issn: 2475-9066
issue: 108
journal: Journal of Open Source Software
publisher:
name: Open Journals
start: 7782
title: Executorlib -- Up-scaling Python workflows for hierarchical
heterogenous high-performance computing
type: article
url: "https://joss.theoj.org/papers/10.21105/joss.07782"
volume: 10
title: Executorlib -- Up-scaling Python workflows for hierarchical
heterogenous high-performance computing
CodeMeta (codemeta.json)
{
"@context": "https://raw.githubusercontent.com/codemeta/codemeta/master/codemeta.jsonld",
"@type": "Code",
"author": [
{
"@id": "https://orcid.org/0000-0001-9948-7119",
"@type": "Person",
"email": "j.janssen@mpi-susmat.de",
"name": "Jan Janssen",
"affiliation": "Max Planck Institute for Sustainable Materials, Düsseldorf, Germany"
},
{
"@id": "https://orcid.org/0000-0003-4327-2746",
"@type": "Person",
"email": "mgt16@lanl.gov",
"name": "Michael Gilbert Taylor",
"affiliation": "Los Alamos National Laboratory, Los Alamos, NM, United States of America"
},
{
"@id": "https://orcid.org/0000-0003-4726-2860",
"@type": "Person",
"email": "pyang@lanl.gov",
"name": "Ping Yang",
"affiliation": "Los Alamos National Laboratory, Los Alamos, NM, United States of America"
},
{
"@id": "https://orcid.org/0000-0002-7903-2472",
"@type": "Person",
"email": "j.neugebauer@mpi-susmat.de",
"name": "Joerg Neugebauer",
"affiliation": "Max Planck Institute for Sustainable Materials, Düsseldorf, Germany"
},
{
"@id": "https://orcid.org/0000-0003-3028-5249",
"@type": "Person",
"email": "danny_perez@lanl.gov",
"name": "Danny Perez",
"affiliation": "Los Alamos National Laboratory, Los Alamos, NM, United States of America"
}
],
"identifier": "",
"codeRepository": "https://github.com/pyiron/executorlib",
"datePublished": "2025-02-14",
"dateModified": "2025-02-14",
"dateCreated": "2025-02-14",
"description": "Up-scale python functions for high performance computing (HPC) with executorlib.",
"keywords": "Python, High Performance Computing, Task Scheduling",
"license": "BSD",
"title": "executorlib",
"version": "0.3.0"
}
GitHub Events
Total
- Create event: 273
- Issues event: 98
- Release event: 22
- Watch event: 25
- Delete event: 252
- Issue comment event: 496
- Push event: 1,098
- Pull request review comment event: 278
- Pull request review event: 361
- Pull request event: 541
- Fork event: 1
Last Year
- Create event: 273
- Issues event: 98
- Release event: 22
- Watch event: 25
- Delete event: 252
- Issue comment event: 496
- Push event: 1,098
- Pull request review comment event: 278
- Pull request review event: 361
- Pull request event: 541
- Fork event: 1
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Jan Janssen | j****n | 1,100 |
| pre-commit-ci[bot] | 6****] | 108 |
| dependabot[bot] | 4****] | 64 |
| pyironrunner | p****n@m****e | 32 |
| liamhuber | l****r@g****m | 26 |
| samwaseda | o****a@m****e | 2 |
| James Corbett | c****8@l****v | 2 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 72
- Total pull requests: 451
- Average time to close issues: about 1 month
- Average time to close pull requests: 1 day
- Total issue authors: 8
- Total pull request authors: 6
- Average comments per issue: 0.63
- Average comments per pull request: 1.68
- Merged pull requests: 343
- Bot issues: 0
- Bot pull requests: 135
Past Year
- Issues: 66
- Pull requests: 430
- Average time to close issues: 13 days
- Average time to close pull requests: 1 day
- Issue authors: 6
- Pull request authors: 6
- Average comments per issue: 0.67
- Average comments per pull request: 1.7
- Merged pull requests: 327
- Bot issues: 0
- Bot pull requests: 120
Top Authors
Issue Authors
- jan-janssen (63)
- liamhuber (8)
- pmrv (1)
- srmnitc (1)
- samwaseda (1)
- ltalirz (1)
- lwshanbd (1)
- svchb (1)
Pull Request Authors
- jan-janssen (326)
- pre-commit-ci[bot] (76)
- dependabot[bot] (68)
- liamhuber (2)
- danielskatz (2)
- samwaseda (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 2
- Total downloads: pypi 10,885 last-month
- Total dependent packages: 6 (may contain duplicates)
- Total dependent repositories: 0 (may contain duplicates)
- Total versions: 81
- Total maintainers: 2
pypi.org: pympipool
Scale serial and MPI-parallel python functions over hundreds of compute nodes all from within a jupyter notebook or serial python process.
- Homepage: https://github.com/pyiron/executorlib
- Documentation: https://executorlib.readthedocs.io
- License: BSD License
- Latest release: 0.9.1 (published over 1 year ago)
Rankings
Maintainers (2)
pypi.org: executorlib
Up-scale python functions for high performance computing (HPC) with executorlib.
- Homepage: https://github.com/pyiron/executorlib
- Documentation: https://executorlib.readthedocs.io
- License: BSD 3-Clause License
- Latest release: 1.6.2 (published 4 months ago)