https://github.com/althonos/smatrix

Not the slurm job dispatcher you need, but the one you deserve.

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.4%) to scientific vocabulary

Keywords

cli-app python slurm slurm-cluster
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: althonos
  • License: MIT
  • Language: Python
  • Default Branch: master
  • Size: 35.2 KB
Statistics
  • Stars: 1
  • Watchers: 3
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created over 5 years ago · Last pushed over 5 years ago

README.md

smatrix

Not the slurm job dispatcher you need, but the one you deserve.

Introduction

Slurm is a workload manager, typically used to parallelize work at the job level. It is highly configurable, which can be overwhelming when the work to be performed is simple.

A typical use case in our lab is processing metagenomes on the EMBL cluster with Slurm: we want to run a single command (like hmmsearch or gecco) on a very large number of files, possibly with several threshold values. Doing so efficiently requires writing a custom script that ends up being copied and pasted around. As a programmer, I found this unacceptable.

smatrix automates the most common tasks: splitting the workload evenly, generating a job script with the parameters, and submitting the jobs to the cluster. Think xargs, except it spawns Slurm jobs instead of processes.

Usage

smatrix reuses the option names of sbatch and srun where they apply, and adds a flag for declaring parameters. A quick example:

    $ smatrix --cpus-per-task 2 -P:f1 0.02 0.01 -P:file /data/seq1.fa /data/seq2.fa \
          --wrap 'hmmsearch --F1=$f1 Pfam.hmm $file'

This command launches 4 jobs, one for each combination of $f1 and $file given on the command line (2 values × 2 values), with 2 CPUs per job. --cpus-per-task is a built-in sbatch option, so it is passed through unchanged to Slurm when the job is queued. The remaining arguments are consumed by smatrix to set up the job array.
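The "all possible combinations" behaviour is a Cartesian product over the parameter arrays. The snippet below is only an illustrative sketch of that idea, not code taken from smatrix: the `params` dictionary and the `job_matrix` helper are hypothetical names chosen for this example.

```python
import itertools

# Hypothetical in-memory form of:
#   -P:f1 0.02 0.01 -P:file /data/seq1.fa /data/seq2.fa
params = {
    "f1": ["0.02", "0.01"],
    "file": ["/data/seq1.fa", "/data/seq2.fa"],
}


def job_matrix(params):
    """Return one dict per job, covering every combination of values."""
    names = list(params)
    value_lists = [params[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


matrix = job_matrix(params)
for index, job in enumerate(matrix):
    # One line per job: the index and the values that job would receive.
    print(index, job)
```

Two parameters with two values each give 2 × 2 = 4 entries, which matches the 4 jobs in the example above.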

smatrix-hijacked options

--wrap flag

The --wrap CLI flag is used to pass the command to wrap in a script. It will get executed once for every element of the job matrix created with the parameters given to the CLI.

smatrix-specific options

-P / --param flag

The -P flag is the only new flag introduced by smatrix. Use it to specify parameter arrays: each -P:name (or --param:name) is followed by the values that the parameter takes.

The format for the --param flag is designed to accommodate globbing and command substitution in the shell:

    $ smatrix --param:n $(seq 1 100) --param:file /etc/*.conf --wrap '...'

Note that, in this example, the glob pattern expansion is done by the shell and may have escaping issues if the filenames contain whitespace characters.
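To make the flag format concrete, here is a minimal sketch of how an argument list in this style could be split into parameter arrays and pass-through options. It is an assumption about one possible approach, not smatrix's actual parser; `split_params` is a hypothetical helper, and it treats any argument starting with `-` as an option, so negative numbers would not work as parameter values.

```python
def split_params(argv):
    """Separate -P:name / --param:name value lists from all other arguments."""
    params = {}
    passthrough = []
    current = None  # name of the parameter currently collecting values
    for arg in argv:
        if arg.startswith(("-P:", "--param:")):
            current = arg.split(":", 1)[1]
            params.setdefault(current, [])
        elif arg.startswith("-"):
            # Any other option ends the current value list.
            current = None
            passthrough.append(arg)
        elif current is not None:
            params[current].append(arg)
        else:
            passthrough.append(arg)
    return params, passthrough


params, rest = split_params([
    "--cpus-per-task", "2",
    "-P:f1", "0.02", "0.01",
    "--param:file", "/data/seq1.fa", "/data/seq2.fa",
    "--wrap", "hmmsearch --F1=$f1 Pfam.hmm $file",
])
print(params)
print(rest)
```

Because the shell has already expanded globs and command substitutions by the time the program sees its arguments, a parser of this kind only ever receives a flat list of words.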

--wrap flag

sbatch already provides --wrap, but smatrix wraps the command differently, because it also exposes the parameters requested with -P to the wrapped command.
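One way such a wrapper could expose the parameters is to generate a Slurm job-array script that sets shell variables according to `SLURM_ARRAY_TASK_ID` before running the command. The sketch below only illustrates that idea under this assumption; the script layout and the `render_job_script` helper are hypothetical and may differ from what smatrix actually generates.

```python
import itertools
import shlex


def render_job_script(command, params, sbatch_options=()):
    """Render a bash job-array script that runs `command` once per combination."""
    names = list(params)
    combos = list(itertools.product(*(params[name] for name in names)))
    if not combos:
        raise ValueError("every parameter needs at least one value")
    lines = ["#!/bin/bash", f"#SBATCH --array=0-{len(combos) - 1}"]
    lines += [f"#SBATCH {option}" for option in sbatch_options]
    # Each array task selects its own combination of parameter values.
    lines.append('case "$SLURM_ARRAY_TASK_ID" in')
    for index, combo in enumerate(combos):
        assignments = "; ".join(
            f"{name}={shlex.quote(value)}" for name, value in zip(names, combo)
        )
        lines.append(f"  {index}) {assignments} ;;")
    lines.append("esac")
    lines.append(command)
    return "\n".join(lines) + "\n"


script = render_job_script(
    "hmmsearch --F1=$f1 Pfam.hmm $file",
    {"f1": ["0.02", "0.01"], "file": ["/data/seq1.fa", "/data/seq2.fa"]},
    sbatch_options=["--cpus-per-task=2"],
)
print(script)
```

Values are quoted with `shlex.quote` when they are assigned, but the command itself is inserted verbatim, so an unquoted `$file` in the wrapped command would still be split on whitespace by the shell.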

Owner

  • Name: Martin Larralde
  • Login: althonos
  • Kind: user
  • Location: Heidelberg, Germany
  • Company: EMBL / LUMC, @zellerlab

PhD candidate in Bioinformatics, passionate about programming, SIMD-enthusiast, Pythonista, Rustacean. I write poems, and sometimes they are executable.

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 25
  • Total Committers: 1
  • Avg Commits per committer: 25.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Martin Larralde m****e@e****r 25
Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0