suresoft-hpc-workflow
Official Mirror of https://git.rz.tu-bs.de/soe.peters/suresoft-hpc-workflow
Statistics
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
SURESOFT HPC Workflow
Introduction
This workflow shows how to automatically build and deploy containers to clusters and verify the results using continuous integration. It is developed as a showcase for the SURESOFT workflow addressing reproducibility on HPC platforms. The project includes a sample application of a 2D Laplace heat transfer in a plate.
Workflow
The workflow is grouped into four stages in the Continuous Integration pipeline using GitLab CI (see image below).
build
- builds containers with different MPI implementations (see Singularity Images) using Singularity
  - Rocky Linux with MPICH using the hybrid model. Definition file: rockylinux9-mpich.def
  - Rocky Linux with MPICH using the bind model. Definition file: rockylinux9-mpich-bind.def
  - Rocky Linux with OpenMPI using the hybrid model. Definition file: rockylinux9-openmpi.def
simulation
- runs the image with the MPI bind model on the cluster using hpc-rocket
  - deploys the container to the cluster via SSH
  - executes the container (e.g. via Slurm)
  - returns a defined set of files as the result
test
- runs a regression test with fieldcompare to compare the results of the simulation stage with reference data
benchmark
- dynamically generates additional CI jobs to benchmark the performance of the different MPI approaches
Singularity Images
The .def files in the Containers directory define Singularity images using different MPI implementations and binding approaches. The definition files are based on Rocky Linux 9, as the targeted remote system uses CentOS Linux 7.
All .def files are separated into two stages, a build and a runtime stage.
The build stage is used to compile the application, while the runtime stage only contains the dependencies necessary to run it.
This reduces the size of the final image.
rockylinux9-mpich.def and rockylinux9-openmpi.def use the hybrid model where MPI is installed on the host machine as well as inside the container.
When running, the MPI on the host machine will communicate with the MPI instance inside the container.
In practice this leads to a small performance overhead in comparison to a native solution.
rockylinux9-mpich-bind.def uses the bind model where no MPI instance is installed in the container.
Instead the MPI installation of the host machine is mounted into the container.
This results in a performance on par with a native solution.
However, the portability of the container is reduced, since the application must be compiled with the same MPI version that is used on the host machine.
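The difference between the two models shows up in how the container is launched. The following sketch builds both command lines in Python. The image names, application path, and host MPI directory are hypothetical placeholders, and the exact bind paths depend on the cluster.

```python
from __future__ import annotations

import shlex


def hybrid_command(image: str, app: str, ranks: int) -> list[str]:
    """Hybrid model: the host mpirun starts the containers, MPI is also inside the image."""
    return ["mpirun", "-n", str(ranks), "singularity", "exec", image, app]


def bind_command(image: str, app: str, ranks: int, host_mpi_dir: str) -> list[str]:
    """Bind model: same launch, but the host MPI directory is mounted into the container."""
    return [
        "mpirun", "-n", str(ranks),
        "singularity", "exec",
        "--bind", f"{host_mpi_dir}:{host_mpi_dir}",
        image, app,
    ]


# Hypothetical image names and paths, for illustration only.
hybrid = hybrid_command("rockylinux9-mpich.sif", "/build/laplace", 4)
bind = bind_command("rockylinux9-mpich-bind.sif", "/build/laplace", 4, "/opt/mpich")

print(shlex.join(hybrid))
print(shlex.join(bind))
```

With the bind model, the library search path inside the container may also need to point at the mounted MPI directory. How this is done depends on the host setup.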
Prerequisite
The first CI job, which builds the container, requires a GitLab Runner using a privileged Docker executor. This is necessary because it uses a Docker image to build the Singularity container. However, this is not needed if the container already exists.
HPC Rocket
HPC Rocket is a command-line tool to send Slurm commands to a remote machine and monitor the job progress. It was primarily written to launch Slurm jobs from a CI pipeline.
rocket.yml
- defines the files to copy to the cluster
- defines the result files to copy back to GitLab
- defines the Slurm job file to submit
laplace.job
- contains the Slurm settings
- executes the Singularity image
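To make the role of laplace.job concrete, the sketch below renders a minimal Slurm batch script of that kind. The resource values, image name, and application path are hypothetical and not taken from the repository. Only the `#SBATCH` options and `singularity exec` are standard.

```python
def render_job(job_name: str, nodes: int, tasks_per_node: int,
               walltime: str, image: str, app: str) -> str:
    """Return the text of a Slurm batch script that runs an MPI app from a Singularity image."""
    total_ranks = nodes * tasks_per_node
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={walltime}",
        "",
        # One container process per MPI rank, started by the host mpirun.
        f"mpirun -n {total_ranks} singularity exec {image} {app}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
script = render_job("laplace", 1, 4, "00:10:00",
                    "rockylinux9-mpich-bind.sif", "/build/laplace")
print(script)
```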
Fieldcompare
fieldcompare is a Python package with a command-line interface (CLI) that can be used to compare
datasets for (fuzzy) equality. It was designed mainly to serve as a tool to realize regression tests
for research software, and in particular research software that deals with numerical simulations.
In regression tests, the output of a software is compared to reference data that was produced by
the same software at an earlier time, in order to detect if changes to the code cause unexpected
changes to the behavior of the software.
We use fieldcompare to compare the temperature field of the 2D Laplace simulation with a predefined reference dataset.
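The idea of fuzzy equality can be illustrated without the tool itself. The sketch below is not fieldcompare's API. It is a plain-Python stand-in that compares two temperature fields value by value, using relative and absolute tolerances that are assumed values here.

```python
import math


def fields_match(result, reference, rel_tol=1e-9, abs_tol=1e-12):
    """Return (ok, mismatches); mismatches lists (index, result value, reference value)."""
    if len(result) != len(reference):
        return False, []
    mismatches = [
        (i, a, b)
        for i, (a, b) in enumerate(zip(result, reference))
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    ]
    return not mismatches, mismatches


# Made-up temperature values along one line of the plate.
reference = [100.0, 75.0, 50.0, 25.0, 0.0]
same = [100.0, 75.0 + 1e-12, 50.0, 25.0, 0.0]   # differs only by round-off
changed = [100.0, 75.1, 50.0, 25.0, 0.0]        # a real change in behavior

ok_same, _ = fields_match(same, reference)
ok_changed, diffs = fields_match(changed, reference)
print(ok_same, ok_changed, diffs)
```

A regression test of this kind passes when only round-off noise differs and fails when a code change alters the computed field beyond the tolerances.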
Benchmarks
- matplotlib
- dynamic CI pipeline
- Jinja templates
  - Slurm job
  - rocket files
  - CI jobs
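As a rough illustration of how Jinja templates can drive a dynamic CI pipeline, the sketch below renders one benchmark job per container image into a single YAML string. The job layout and the `./run_benchmark.sh` script are hypothetical and not taken from the repository's templates. The image names follow the definition files listed above.

```python
from jinja2 import Template

# Hypothetical job template; the real templates in the repository may look different.
JOB_TEMPLATE = Template(
    "benchmark-{{ image }}:\n"
    "  stage: benchmark\n"
    "  script:\n"
    "    - ./run_benchmark.sh {{ image }}\n"
)

images = ["rockylinux9-mpich", "rockylinux9-mpich-bind", "rockylinux9-openmpi"]

# Render one job per image and join them into one pipeline definition.
pipeline = "\n\n".join(JOB_TEMPLATE.render(image=image) for image in images) + "\n"
print(pipeline)
```

In GitLab CI, a generated file like this is typically saved as a job artifact and started as a child pipeline through a `trigger` job that includes the artifact.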
Owner
- Name: Suresoft
- Login: TUBS-Suresoft
- Kind: organization
- Website: https://www.tu-braunschweig.de/suresoft
- Repositories: 2
- Profile: https://github.com/TUBS-Suresoft
Sustainable Research Software
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
type: software
authors:
  - family-names: Peters
    given-names: Sören
    orcid: https://orcid.org/0000-0001-5236-3776
  - family-names: Marcus
    given-names: Sven
    orcid: https://orcid.org/0000-0003-3689-2162
  - family-names: Linxweiler
    given-names: Jan
    orcid: https://orcid.org/0000-0002-2755-5087
title: "SURESOFT HPC workflow"
version: 0.1.0
doi: 10.5281/zenodo.7568959
license: MIT
repository-code: "https://git.rz.tu-bs.de/soe.peters/suresoft-hpc-workflow"
date-released: "2023-01-25"
references:
  - title: Singularity
    authors:
      - family-names: Kurtzer
        given-names: Gregory M.
    type: software
    license: BSD-3-Clause
    url: https://github.com/apptainer/singularity
    doi: 10.5281/zenodo.1310023
  - title: hpc-rocket
    authors:
      - family-names: Marcus
        given-names: Sven
        orcid: https://orcid.org/0000-0003-3689-2162
    type: software
    license: MIT
    url: https://github.com/SvenMarcus/hpc-rocket
    doi: 10.5281/zenodo.7355862
  - title: fieldcompare
    authors:
      - family-names: Gläser
        given-names: Dennis
        orcid: https://orcid.org/0000-0001-9646-881X
    type: software
    license: GPL-3.0-only
    url: https://gitlab.com/dglaeser/fieldcompare
Dependencies
- Jinja2 ==3.1.2
- matplotlib ==3.6.3
- rich ==13.0.1

