cato

Automatic source transformation to apply HPC frameworks with minimal user interaction

https://github.com/jsquar/cato

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    2 of 3 committers (66.7%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.3%) to scientific vocabulary

Keywords

code-transformation distributed-computing llvm llvm-pass mpi netcdf4 openmp parallel-computing
Last synced: 4 months ago

Repository

Automatic source transformation to apply HPC frameworks with minimal user interaction

Basic Info
  • Host: GitHub
  • Owner: JSquar
  • License: apache-2.0
  • Language: C++
  • Default Branch: master
  • Homepage:
  • Size: 446 KB
Statistics
  • Stars: 6
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Topics
code-transformation distributed-computing llvm llvm-pass mpi netcdf4 openmp parallel-computing
Created over 5 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation

README.md

Note: The master branch currently does not build. Development takes place on the tschop-dev branch.

CATO

CATO (Compiler Assisted Source Transformation of OpenMP Kernels) uses LLVM and Clang to transform existing OpenMP code to MPI. This enables distributed code execution while keeping OpenMP's relatively low barrier to entry. The main focus is on increasing the maximum problem size that a scientific application can handle. Converting an intra-node problem into an inter-node problem makes it possible to overcome the memory limit of a single node.

Using CATO

Dependencies

| CATO | LLVM   | MPICH | netcdf-c |
|:----:|:------:|:-----:|:--------:|
| 0.1  | 12.0.0 | 3.3.1 | x        |
| 0.2  | 13.0.0 | 3.3.1 | 4.8.1    |
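
The pairing in the table can be checked before building. The following is a minimal sketch, not part of CATO: the function names expected_llvm_major and check_llvm_for_cato are hypothetical, and comparing only the major version is an assumption (the table lists exact releases).

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with CATO.

# Print the LLVM major version listed for a CATO release in the table above.
expected_llvm_major() {
    case "$1" in
        0.1) echo 12 ;;
        0.2) echo 13 ;;
        *)   return 1 ;;
    esac
}

# check_llvm_for_cato CATO_RELEASE [LLVM_VERSION]
# Succeeds if the LLVM major version matches the one listed for the release.
# Without a second argument, the version is taken from `llvm-config --version`.
check_llvm_for_cato() {
    want=$(expected_llvm_major "$1") || {
        echo "unknown CATO release: $1" >&2
        return 2
    }
    have="${2:-$(llvm-config --version)}"
    # Strip everything after the first dot to get the major version.
    [ "${have%%.*}" = "$want" ]
}
```

For example, `check_llvm_for_cato 0.2 || echo "LLVM version mismatch"` warns when the loaded LLVM is not a 13.x release.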

Make sure that the CATO and LLVM versions match; major LLVM releases tend to introduce ABI-breaking changes. Currently the dependencies are installed from source using spack, and the packages are loaded when initialise_environment.sh is sourced. Global installations of the dependencies should also work. Note that LLVM dump calls are only available with a debug build of LLVM. Otherwise, #define DEBUG_CATO_PASS 1 must be changed to #define DEBUG_CATO_PASS 0 in src/cato/debug.h before building. An improved build script that automates more of this process is still under development.
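
Until such a build script exists, the macro can be switched from the command line. This is a sketch only: the function name set_cato_debug is hypothetical, and it assumes that the define sits on a single line of the form `#define DEBUG_CATO_PASS <0|1>`, as quoted above.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with CATO.
# set_cato_debug VALUE [HEADER]
# Rewrites the DEBUG_CATO_PASS define to VALUE (0 or 1).
# HEADER defaults to the path named in the README.
set_cato_debug() {
    value="$1"
    header="${2:-src/cato/debug.h}"
    case "$value" in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 2 ;;
    esac
    # -i.bak edits in place and keeps a backup copy as HEADER.bak.
    sed -i.bak -E \
        "s/^(#define[[:space:]]+DEBUG_CATO_PASS)[[:space:]]+[01]/\1 ${value}/" \
        "$header"
}
```

Running `set_cato_debug 0` from the repository root before `scripts/build_pass.sh` would then disable the dump calls for a non-debug LLVM build.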

Building the LLVM Pass

CATO is an LLVM pass that is applied during the optimisation phase.

$ scripts/build_pass.sh

To remove stale files, pass the --rebuild flag to build_pass.sh; it deletes the old build directory (important if you switch LLVM versions).

Create modified binary

$ scripts/cexecute_pass.py inputfile.c -o inputfile_modified.x

The simplest way is to use the auxiliary script cexecute_pass.py. Alternatively, the transformed code can be generated manually, which makes it easier to follow the individual steps. Those steps could be shortened, but the long version is used here because it yields more intermediate results (useful for debugging). Check whether your LLVM installation uses the new or the legacy pass manager by default. CATO is still being refactored to use the new pass manager together with LLVM 14, so for now it needs to be used with the legacy pass manager.

Using the new LLVM pass manager

  1. Create IR code from inputfile.c:
     $ clang -emit-llvm -S inputfile.c

  2. Apply the CATO LLVM pass to get the modified IR:
     $ opt -load-pass-plugin=libCatoPass.so -passes=Cato inputfile.ll -S -o inputfile_modified.ll

  3. Create modified LLVM bitcode from the IR code:
     $ llvm-as inputfile_modified.ll -o inputfile_modified.bc

  4. Create the final binary from the modified LLVM bitcode:
     $ mpicc -cc=clang -o inputfile_modified_binary.x inputfile_modified.bc libCatoRuntime.so

You need to adjust the path to libCatoPass.so and libCatoRuntime.so. The generated binary file can now be executed with mpiexec.

Using the legacy LLVM pass manager

  1. Create IR code from inputfile.c:
     $ clang -emit-llvm -S inputfile.c

  2. Apply the CATO LLVM pass to get the modified IR:
     $ opt -enable-new-pm=0 -load libCatoPass.so -Cato inputfile.ll -S -o inputfile_modified.ll

  3. Create modified LLVM bitcode from the IR code:
     $ llvm-as inputfile_modified.ll -o inputfile_modified.bc

  4. Create the final binary from the modified LLVM bitcode:
     $ mpicc -cc=clang -o inputfile_modified_binary.x inputfile_modified.bc libCatoRuntime.so

You need to adjust the path to libCatoPass.so and libCatoRuntime.so. The generated binary file can now be executed with mpiexec.
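
The four legacy steps can be chained in a small shell function. This is a sketch under assumptions, not a script from the repository: the names CATO_PASS, CATO_RUNTIME, DRY_RUN, run and cato_transform are illustrative, the library paths default to the current directory, and the last step links the bitcode file produced in step 3.

```shell
#!/bin/sh
# Illustrative wrapper around the manual legacy-pass-manager steps.
# Adjust these paths to where libCatoPass.so and libCatoRuntime.so were built.
CATO_PASS="${CATO_PASS:-./libCatoPass.so}"
CATO_RUNTIME="${CATO_RUNTIME:-./libCatoRuntime.so}"

# Print a command, then execute it unless DRY_RUN=1 is set.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# cato_transform FILE.c
# Produces FILE_modified_binary.x and keeps all intermediate files.
cato_transform() {
    src="$1"
    base="${src%.c}"
    run clang -emit-llvm -S "$src" -o "$base.ll" &&
    run opt -enable-new-pm=0 -load "$CATO_PASS" -Cato "$base.ll" \
        -S -o "${base}_modified.ll" &&
    run llvm-as "${base}_modified.ll" -o "${base}_modified.bc" &&
    run mpicc -cc=clang -o "${base}_modified_binary.x" \
        "${base}_modified.bc" "$CATO_RUNTIME"
}
```

Calling `DRY_RUN=1 cato_transform inputfile.c` only prints the four commands, which helps when checking paths. After a real run, the result can be started with, for example, `mpiexec -n 4 ./inputfile_modified_binary.x`; the process count 4 is an arbitrary choice.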

Citing CATO

If you are referencing CATO in a publication, please cite the following paper:

J. Squar, T. Jammer, M. Blesel, M. Kuhn and T. Ludwig, "Compiler Assisted Source Transformation of OpenMP Kernels," 2020 19th International Symposium on Parallel and Distributed Computing (ISPDC), Warsaw, Poland, 2020, pp. 44-51, doi: 10.1109/ISPDC51135.2020.00016.

Owner

  • Name: Jannek Squar
  • Login: JSquar
  • Kind: user
  • Location: Hamburg
  • Company: @wr-hamburg

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: CATO
message: >-
  If you are referencing CATO in a publication, please cite
  the following paper:
type: software
authors:
  - given-names: Jannek
    family-names: Squar
    email: jannek.squar@uni-hamburg.de
    affiliation: Universität Hamburg
    orcid: 'https://orcid.org/0000-0001-6894-9210'
  - given-names: Tim
    family-names: Jammer
    email: tim.jammer@sc.tu-darmstadt.de
    affiliation: Technische Universität Darmstadt
  - given-names: Michael
    family-names: Blesel
    email: michael.blesel@ovgu.de
    affiliation: Otto von Guericke University Magdeburg
  - given-names: Michael
    family-names: Kuhn
    email: michael.kuhn@ovgu.de
    affiliation: Otto von Guericke University Magdeburg
    orcid: 'https://orcid.org/0000-0001-8167-8574'
  - given-names: Thomas
    family-names: Ludwig
    email: ludwig@dkrz.de
    affiliation: DKRZ
repository-code: 'https://github.com/JSquar/cato'
license: Apache-2.0
preferred-citation:
  type: conference-paper
  authors:
    - given-names: Jannek
      family-names: Squar
      email: jannek.squar@uni-hamburg.de
      affiliation: Universität Hamburg
      orcid: 'https://orcid.org/0000-0001-6894-9210'
    - given-names: Tim
      family-names: Jammer
      email: tim.jammer@sc.tu-darmstadt.de
      affiliation: Technische Universität Darmstadt
    - given-names: Michael
      family-names: Blesel
      email: michael.blesel@ovgu.de
      affiliation: Otto von Guericke University Magdeburg
    - given-names: Michael
      family-names: Kuhn
      email: michael.kuhn@ovgu.de
      affiliation: Otto von Guericke University Magdeburg
      orcid: 'https://orcid.org/0000-0001-8167-8574'
    - given-names: Thomas
      family-names: Ludwig
      email: ludwig@dkrz.de
      affiliation: DKRZ
  title: Compiler Assisted Source Transformation of OpenMP Kernels
  doi: 10.1109/ISPDC51135.2020.00016
  collection-title: >-
    2020 19th International Symposium on Parallel and Distributed Computing
    (ISPDC)
  conference:
    name: ISPDC 2020
    city: Warsaw
    country: PL
    date-start: '2020-06-05'
    date-end: '2020-06-08'

Committers

Last synced: almost 2 years ago

All Time
  • Total Commits: 38
  • Total Committers: 3
  • Avg Commits per committer: 12.667
  • Development Distribution Score (DDS): 0.316
Past Year
  • Commits: 19
  • Committers: 3
  • Avg Commits per committer: 6.333
  • Development Distribution Score (DDS): 0.474
Top Committers
Name Email Commits
Jannek s****r@i****e 26
Jannek j****r@u****e 6
Jannek Squar g****b@j****u 6
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: almost 2 years ago

All Time
  • Total issues: 20
  • Total pull requests: 0
  • Average time to close issues: about 1 year
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 0.05
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 7
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • JSquar (20)
Pull Request Authors
Top Labels
Issue Labels
documentation (10) enhancement (6) netCDF (6) New component (2) good first issue (2) tests (1) Optional (1) bug (1)
Pull Request Labels