https://github.com/agnostiqhq/covalent-slurm-plugin

Executor plugin interfacing Covalent with Slurm

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.1%) to scientific vocabulary

Keywords

covalent data-pipeline etl hpc hpc-applications machinelearning machinelearning-python parallelization pipelines python python3 quantum-computing quantum-machine-learning slurm workflow workflow-automation

Keywords from Contributors

orchestration quantum workflow-management distributed-computing braket gcp-batch cloud-computing energy-system-model batch microsoft-azure
Last synced: 5 months ago

Repository

Executor plugin interfacing Covalent with Slurm

Basic Info
  • Host: GitHub
  • Owner: AgnostiqHQ
  • License: apache-2.0
  • Language: Python
  • Default Branch: develop
  • Homepage: https://covalent.xyz
  • Size: 366 KB
Statistics
  • Stars: 26
  • Watchers: 12
  • Forks: 6
  • Open Issues: 19
  • Releases: 23
Created about 4 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog License Code of conduct Codeowners

README.md

 

[![covalent](https://img.shields.io/badge/covalent-0.177.0-purple)](https://github.com/AgnostiqHQ/covalent) [![python](https://img.shields.io/pypi/pyversions/covalent-slurm-plugin)](https://github.com/AgnostiqHQ/covalent-slurm-plugin) [![tests](https://github.com/AgnostiqHQ/covalent-slurm-plugin/actions/workflows/tests.yml/badge.svg)](https://github.com/AgnostiqHQ/covalent-slurm-plugin/actions/workflows/tests.yml) [![codecov](https://codecov.io/gh/AgnostiqHQ/covalent-slurm-plugin/branch/main/graph/badge.svg?token=QNTR18SR5H)](https://codecov.io/gh/AgnostiqHQ/covalent-slurm-plugin) [![apache](https://img.shields.io/badge/License-Apache_License_2.0-blue)](https://www.apache.org/licenses/LICENSE-2.0)

Covalent Slurm Plugin

Covalent is a Pythonic workflow tool used to execute tasks on advanced computing hardware. This executor plugin interfaces Covalent with HPC systems managed by Slurm. For workflows to be deployable, users must have SSH access to the Slurm login node, writable storage space on the remote filesystem, and permissions to submit jobs to Slurm.

Installation

To use this plugin with Covalent, simply install it using pip:

pip install covalent-slurm-plugin

On the remote system, the Python version in the environment you plan to use must match that used when dispatching the calculations. Additionally, the remote system's Python environment must have the base covalent package installed (e.g. pip install covalent).
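One way to check the version requirement before dispatching is to compare the local interpreter with the output of `python --version` on the remote system. The sketch below is illustrative only: `parse_python_version` and `versions_match` are hypothetical helpers, not part of the plugin, and comparing only major and minor versions by default is an assumption, since the text above does not say how strict the match must be.

```python
import re
import sys


def parse_python_version(text):
    """Extract (major, minor, patch) from a string such as 'Python 3.10.12'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    # A missing patch component is treated as 0.
    return tuple(int(g) if g is not None else 0 for g in match.groups())


def versions_match(remote_text, local=None, compare_patch=False):
    """Return True if the remote version string matches the local interpreter.

    Assumption: only major.minor are compared unless compare_patch is True.
    """
    local = tuple(sys.version_info[:3]) if local is None else tuple(local[:3])
    remote = parse_python_version(remote_text)
    width = 3 if compare_patch else 2
    return remote[:width] == local[:width]


if __name__ == "__main__":
    # Paste the remote output of `python --version` here.
    print(versions_match("Python 3.10.12"))
```

Passing `compare_patch=True` makes the check stricter if the two environments should agree down to the patch release.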

Usage

The following shows an example of a Covalent configuration that is modified to support Slurm:

```console
[executors.slurm]
username = "user"
address = "login.cluster.org"
ssh_key_file = "/home/user/.ssh/id_rsa"
remote_workdir = "/scratch/user"
cache_dir = "/tmp/covalent"

[executors.slurm.options]
nodes = 1
ntasks = 4
cpus-per-task = 8
constraint = "gpu"
gpus = 4
qos = "regular"

[executors.slurm.srun_options]
cpu_bind = "cores"
gpus = 4
gpu-bind = "single:1"
```

The first stanza describes default connection parameters for a user who can connect to the Slurm login node using, for example:

```console
ssh -i /home/user/.ssh/id_rsa user@login.cluster.org
```

The second and third stanzas describe default parameters for #SBATCH directives and default parameters passed directly to srun, respectively.

This example generates a script containing the following preamble:

```console
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=8
#SBATCH --constraint=gpu
#SBATCH --gpus=4
#SBATCH --qos=regular
```

and subsequent workflow submission with:

```console
srun --cpu_bind=cores --gpus=4 --gpu-bind=single:1
```

To use the configuration settings, an electron’s executor must be specified with a string argument, in this case:

```python
import covalent as ct

@ct.electron(executor="slurm")
def my_task(x, y):
    return x + y
```

Alternatively, passing a SlurmExecutor instance enables custom behavior scoped to specific tasks. Here, the executor's prerun_commands and postrun_commands parameters can be used to list shell commands to be executed before and after submitting the workflow. These may include any additional srun commands apart from workflow submission. Commands can also be nested inside the submission call to srun by using the srun_append parameter.

More complex jobs can be crafted by using these optional parameters. For example, the instance below runs a job that accesses CPU and GPU resources on a single node, while profiling GPU usage via nsys and issuing complementary commands that pause/resume the central hardware counter.

```python
import covalent as ct

executor = ct.executor.SlurmExecutor(
    remote_workdir="/scratch/user/experiment1",
    # Defaults for the #SBATCH directives
    options={
        "qos": "regular",
        "time": "01:30:00",
        "nodes": 1,
        "constraint": "gpu",
    },
    # Shell commands run before the workflow is submitted
    prerun_commands=[
        "module load package/1.2.3",
        "srun --ntasks-per-node 1 dcgmi profile --pause"
    ],
    # Options passed directly to srun
    srun_options={
        "n": 4,
        "c": 8,
        "cpu-bind": "cores",
        "G": 4,
        "gpu-bind": "single:1"
    },
    # Command nested inside the srun submission call
    srun_append="nsys profile --stats=true -t cuda --gpu-metrics-device=all",
    # Shell commands run after the workflow is submitted
    postrun_commands=[
        "srun --ntasks-per-node 1 dcgmi profile --resume",
    ]
)

@ct.electron(executor=executor)
def my_custom_task(x, y):
    return x + y
```

Here the corresponding submit script contains the following commands:

```console
module load package/1.2.3
srun --ntasks-per-node 1 dcgmi profile --pause

srun -n 4 -c 8 --cpu-bind=cores -G 4 --gpu-bind=single:1 \
nsys profile --stats=true -t cuda --gpu-metrics-device=all \
python /scratch/user/experiment1/workflow_script.py

srun --ntasks-per-node 1 dcgmi profile --resume
```
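To make the relationship between the option dictionaries and the generated script lines concrete, the following is a minimal sketch that reproduces the output shown in the two examples above. It is not the plugin's actual implementation: `render_flag`, `sbatch_preamble`, and `srun_command` are hypothetical names, and the rule that single-letter keys become `-k value` while longer keys become `--key=value` is inferred only from those examples.

```python
def render_flag(key, value):
    """Render one option as a command-line flag.

    Assumed rule (inferred from the examples): single-letter keys use the
    short form '-k value'; longer keys use the long form '--key=value'.
    """
    key = str(key)
    if len(key) == 1:
        return f"-{key} {value}"
    return f"--{key}={value}"


def sbatch_preamble(options):
    """Build the script preamble: a shebang plus one #SBATCH line per option."""
    lines = ["#!/bin/bash"]
    lines += [f"#SBATCH {render_flag(k, v)}" for k, v in options.items()]
    return "\n".join(lines)


def srun_command(srun_options, srun_append="", payload=""):
    """Build the srun call, optionally nesting a command and a payload."""
    parts = ["srun"] + [render_flag(k, v) for k, v in srun_options.items()]
    if srun_append:
        parts.append(srun_append)
    if payload:
        parts.append(payload)
    return " ".join(parts)


if __name__ == "__main__":
    print(sbatch_preamble({"nodes": 1, "ntasks": 4, "qos": "regular"}))
    print(srun_command({"n": 4, "c": 8, "cpu-bind": "cores"}))
```

Dictionary insertion order is preserved, so the flags appear in the order in which the options were written.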

Release Notes

Release notes are available in the Changelog.

Citation

Please use the following citation in any publications:

W. J. Cunningham, S. K. Radha, F. Hasan, J. Kanem, S. W. Neagle, and S. Sanand. Covalent. Zenodo, 2022. https://doi.org/10.5281/zenodo.5903364

License

Covalent is licensed under the Apache License 2.0. See the LICENSE file or contact the support team for more details.

Owner

  • Name: Agnostiq
  • Login: AgnostiqHQ
  • Kind: organization
  • Email: contact@agnostiq.ai
  • Location: Toronto

Developing Software for Advanced Computing

GitHub Events

Total
  • Push event: 5
  • Pull request event: 1
Last Year
  • Push event: 5
  • Pull request event: 1

Committers

Last synced: 11 months ago

All Time
  • Total Commits: 60
  • Total Committers: 12
  • Avg Commits per committer: 5.0
  • Development Distribution Score (DDS): 0.667
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
CovalentOpsBot c****t 20
Will Cunningham w****7 11
Andrew S. Rosen a****3@g****m 7
Venkat Bala v****t@a****i 6
Sankalp Sanand s****p@a****i 4
Ara Ghukasyan 3****s 3
jkanem j****i@a****i 3
Alejandro Esquivel ae@a****d 2
pre-commit-ci[bot] 6****] 1
WingCode s****4@g****m 1
Casey Jao c****y@a****i 1
Scott Wyman Neagle s****t@a****i 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 38
  • Total pull requests: 58
  • Average time to close issues: 16 days
  • Average time to close pull requests: 14 days
  • Total issue authors: 13
  • Total pull request authors: 15
  • Average comments per issue: 1.05
  • Average comments per pull request: 1.57
  • Merged pull requests: 40
  • Bot issues: 0
  • Bot pull requests: 4
Past Year
  • Issues: 1
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 1.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • arosen93 (11)
  • Andrew-S-Rosen (5)
  • jackbaker1001 (4)
  • kessler-frost (3)
  • venkatBala (3)
  • wjcunningham7 (2)
  • jkanem (2)
  • mlpgwdg (1)
  • cjao (1)
  • araghukas (1)
  • CCSun21 (1)
  • santoshkumarradha (1)
  • svandenhaute (1)
Pull Request Authors
  • arosen93 (10)
  • wjcunningham7 (7)
  • cjao (6)
  • kessler-frost (6)
  • venkatBala (6)
  • araghukas (4)
  • jkanem (4)
  • AlejandroEsquivel (2)
  • pre-commit-ci[bot] (2)
  • rajarshitiwari (2)
  • dependabot[bot] (2)
  • Emmanuel289 (2)
  • scottwn (2)
  • WingCode (1)
Top Labels
Issue Labels
feature (9) bug :bug: (5) covalent-os (2) help-wanted (1) devops (1) priority/high (1) testing / integration tests (1) testing / unit tests (1)
Pull Request Labels
feature (3) bug :bug: (2) refactor / small (2) improvements / style (2) dependencies (2) testing (1) testing / unit tests (1) github_actions (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 306 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 22
  • Total maintainers: 1
pypi.org: covalent-slurm-plugin

Covalent Slurm Plugin

  • Versions: 22
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 306 Last month
Rankings
Dependent packages count: 10.1%
Stargazers count: 12.1%
Downloads: 13.1%
Average: 14.2%
Forks count: 14.2%
Dependent repos count: 21.5%
Maintainers (1)
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • aiofiles ==0.8.0
tests/requirements.txt pypi
  • flake8 ==3.9.2 test
  • isort ==5.7.0 test
  • mock ==4.0.3 test
  • nbconvert ==6.3.0 test
  • pre-commit ==2.13.0 test
  • pytest ==6.2.5 test
  • pytest-asyncio ==0.18.3 test
  • pytest-cov ==2.12.0 test
  • pytest-mock ==3.6.1 test
.github/workflows/changelog.yml actions
  • EndBug/add-and-commit v9 composite
  • actions/checkout v3 composite
.github/workflows/changelog_reminder.yml actions
  • actions/checkout master composite
  • peterjgrainger/action-changelog-reminder v1.3.0 composite
.github/workflows/license.yml actions
  • actions/checkout v3 composite
  • pilosus/action-pip-license-checker * composite
.github/workflows/release.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • ncipollo/release-action v1 composite
.github/workflows/tests.yml actions
  • actions-ecosystem/action-get-latest-tag v1 composite
  • actions/checkout v3 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v3 composite
.github/workflows/version.yml actions
  • actions/checkout v1 composite
  • tj-actions/changed-files v18.4 composite