suresoft-hpc-workflow

Official Mirror of https://git.rz.tu-bs.de/soe.peters/suresoft-hpc-workflow

https://github.com/tubs-suresoft/suresoft-hpc-workflow

Science Score: 75.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
    Organization tubs-suresoft has institutional domain (www.tu-braunschweig.de)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.9%) to scientific vocabulary

Repository


Basic Info
  • Host: GitHub
  • Owner: TUBS-Suresoft
  • License: mit
  • Language: C++
  • Default Branch: main
  • Homepage:
  • Size: 15.7 MB
Statistics
  • Stars: 2
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created over 3 years ago · Last pushed almost 3 years ago
Metadata Files
Readme License Citation

README.md

SURESOFT HPC Workflow


Introduction

This workflow shows how to automatically build and deploy containers to clusters and verify the results using continuous integration. It is developed as a showcase for the SURESOFT workflow addressing reproducibility on HPC platforms. The project includes a sample application of a 2D Laplace heat transfer in a plate.
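To give a feel for the kind of computation the sample application performs, here is a minimal, serial sketch of a Jacobi iteration for the 2D Laplace equation on a square plate. It is not the repository's C++/MPI implementation; the function name `solve_laplace`, the grid size, the boundary values, and the tolerance are all assumptions chosen for illustration.

```python
def solve_laplace(n=20, top=100.0, tol=1e-6, max_iter=10_000):
    """Jacobi iteration for the 2D Laplace equation on an n x n grid.

    Assumed boundary conditions: the top edge is held at `top`,
    all other edges are held at 0.
    Returns the final grid and the number of iterations performed.
    """
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n  # fixed temperature on the top edge

    iteration = 0
    for iteration in range(1, max_iter + 1):
        new = [row[:] for row in grid]
        max_change = 0.0
        # Each interior point becomes the average of its four neighbours.
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                    + grid[i][j - 1] + grid[i][j + 1])
                max_change = max(max_change, abs(new[i][j] - grid[i][j]))
        grid = new
        if max_change < tol:
            break  # converged: no point changed by more than tol
    return grid, iteration
```

Calling `solve_laplace(n=10)` returns a temperature field that decreases from the heated top edge towards the cold bottom edge. A parallel version would typically split the grid between MPI ranks and exchange boundary rows in every iteration.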

Workflow

The workflow is grouped into four stages in the Continuous Integration pipeline using GitLab CI (see image below).

  1. build

  2. simulation

    • Runs the image with the MPI bind model on the cluster using hpc-rocket, which
      1. deploys the container to the cluster via SSH
      2. executes the container (e.g. via SLURM)
      3. returns a defined set of files as the result
  3. test

    • Runs a regression test with fieldcompare to compare the results of the simulation stage with reference data.

  4. benchmark
    • Dynamically generates additional CI jobs to benchmark the performance of the different MPI approaches.

Singularity Images

The .def files in the Containers directory define Singularity images using different MPI implementations and binding approaches. The Singularity files are based on Rocky Linux 9 as the targeted remote system uses CentOS Linux 7. All .def files are separated into two stages, a build stage and a runtime stage. The build stage compiles the application, while the runtime stage contains only the dependencies necessary to run it. This reduces the size of the final image.

rockylinux9-mpich.def and rockylinux9-openmpi.def use the hybrid model, where MPI is installed on the host machine as well as inside the container. At runtime, the MPI on the host machine communicates with the MPI instance inside the container. In practice this leads to a small performance overhead compared to a native solution.

rockylinux9-mpich-bind.def uses the bind model, where no MPI instance is installed in the container. Instead, the MPI installation of the host machine is mounted into the container. This results in performance on par with a native solution. However, the portability of the container is reduced, since the application must be compiled with the same MPI version that is used on the host machine.

Prerequisite

The first CI job, which builds the container, requires a GitLab Runner using a privileged Docker executor. This is necessary because the job uses a Docker image to build the Singularity container. However, a privileged runner is not needed if the container already exists.

HPC Rocket

HPC Rocket is a command-line tool that sends Slurm commands to a remote machine and monitors the job progress. It was primarily written to launch Slurm jobs from a CI pipeline.

rocket.yml

  • defines the files to copy to the cluster
  • defines the result files to copy back to GitLab
  • defines the Slurm job file to submit

laplace.job

  • contains the Slurm settings
  • executes the Singularity image

Fieldcompare

fieldcompare is a Python package with a command-line interface (CLI) that can be used to compare datasets for (fuzzy) equality. It was designed mainly as a tool for regression tests of research software, in particular research software that deals with numerical simulations. In regression tests, the output of a software is compared to reference data that was produced by the same software at an earlier time, in order to detect whether changes to the code cause unexpected changes to the behavior of the software.

We use fieldcompare to compare the temperature field of the 2D Laplace simulation with a predefined reference dataset.
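The idea behind a fuzzy regression test can be sketched in a few lines of plain Python. This is not the fieldcompare API; the function `fields_equal` and its default tolerances are hypothetical and only illustrate comparing a computed field against reference values within a relative and an absolute tolerance.

```python
import math


def fields_equal(result, reference, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two flat lists of field values for fuzzy equality.

    Returns (ok, mismatches), where mismatches lists
    (index, result_value, reference_value) for every value
    outside the tolerances (tolerance defaults are assumptions).
    """
    if len(result) != len(reference):
        return False, [("length", len(result), len(reference))]
    mismatches = [
        (i, r, ref)
        for i, (r, ref) in enumerate(zip(result, reference))
        if not math.isclose(r, ref, rel_tol=rel_tol, abs_tol=abs_tol)
    ]
    return not mismatches, mismatches
```

In a CI job, a failed comparison would typically end the process with a non-zero exit code so that the test stage fails.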

Benchmarks

  • matplot

  • dynamic CI pipeline

  • jinja templates

    • slurmjob
    • rocket files
    • CI jobs
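As a rough illustration of how benchmark CI jobs could be generated from a template, the sketch below renders one job definition per MPI variant and node count. The repository uses Jinja2 templates for this; to stay dependency-free, the sketch substitutes `string.Template` from the Python standard library. The job layout, the script name `run_benchmark.sh`, and the node counts are hypothetical.

```python
from string import Template

# Hypothetical job template; the real project renders Jinja2 templates instead.
JOB_TEMPLATE = Template(
    "benchmark-$name-$nodes:\n"
    "  stage: benchmark\n"
    "  script:\n"
    "    - ./run_benchmark.sh $name $nodes\n"
)


def generate_jobs(mpi_variants, node_counts):
    """Render one CI job definition per (MPI variant, node count) pair."""
    return "\n".join(
        JOB_TEMPLATE.substitute(name=name, nodes=nodes)
        for name in mpi_variants
        for nodes in node_counts
    )


if __name__ == "__main__":
    # Variant names mirror the .def files; node counts are assumed.
    print(generate_jobs(["mpich", "openmpi", "mpich-bind"], [1, 2]))
```

The rendered text could be written to a file and handed to a child pipeline, which is one common way to run dynamically generated jobs in GitLab CI.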

Owner

  • Name: Suresoft
  • Login: TUBS-Suresoft
  • Kind: organization

Sustainable Research Software

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
type: software
authors:
  - family-names: Peters
    given-names: Sören
    orcid: https://orcid.org/0000-0001-5236-3776
  - family-names: Marcus
    given-names: Sven
    orcid: https://orcid.org/0000-0003-3689-2162
  - family-names: Linxweiler
    given-names: Jan
    orcid: https://orcid.org/0000-0002-2755-5087
title: "SURESOFT HPC workflow"
version: 0.1.0
doi: 10.5281/zenodo.7568959
license: MIT
repository-code: "https://git.rz.tu-bs.de/soe.peters/suresoft-hpc-workflow"
date-released: "2023-01-25"
references:
  - title: Singularity
    authors:
      - family-names: Kurtzer
        given-names: Gregory M.
    type: software
    license: BSD-3-Clause
    url: https://github.com/apptainer/singularity
    doi: 10.5281/zenodo.1310023
  - title: hpc-rocket
    authors:
      - family-names: Marcus
        given-names: Sven
        orcid: https://orcid.org/0000-0003-3689-2162
    type: software
    license: MIT
    url: https://github.com/SvenMarcus/hpc-rocket
    doi: 10.5281/zenodo.7355862
  - title: fieldcompare
    authors:
      - family-names: Gläser
        given-names: Dennis
        orcid: https://orcid.org/0000-0001-9646-881X
    type: software
    license: GPL-3.0-only
    url: https://gitlab.com/dglaeser/fieldcompare


Dependencies

jobgeneration/requirements.txt pypi
  • Jinja2 ==3.1.2
  • matplotlib ==3.6.3
  • rich ==13.0.1