ompt-printf
Repository
Basic Info
- Host: GitHub
- Owner: FZJ-JSC
- License: other
- Language: C++
- Default Branch: main
- Size: 142 KB
Statistics
- Stars: 2
- Watchers: 3
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
OpenMP Tools Interface Example Tool: ompt-printf
Description
This tool is a simple example of how to use the OpenMP Tools Interface (OMPT) to collect information about the execution of an OpenMP program.
The OpenMP Tools Interface has been part of the OpenMP standard since version 5.0. It allows programs such as performance and correctness tools to collect information about parallel regions, tasking, worksharing, offloading and more. More information can be found in the documentation.
This tool implements the callback-based part of the OMPT interface. The tool
is registered through the standardised ompt_start_tool function:
```c++
extern "C" ompt_start_tool_result_t *
ompt_start_tool( unsigned int omp_version,
                 const char*  runtime_version )
```
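As a rough illustration of what sits behind that entry point, the following minimal sketch returns a result structure with an initialize and a finalize function. It is not the ompt-printf implementation: `sketch_initialize` and `sketch_finalize` are hypothetical names, and the stand-in type declarations in the `#else` branch are only an assumption so that the sketch also compiles where `omp-tools.h` is not installed.

```cpp
#include <cstdio>

#if __has_include(<omp-tools.h>)
#include <omp-tools.h>
#else
// Assumption: minimal stand-ins mirroring the OMPT declarations,
// used only when the real header is unavailable.
#include <cstdint>
extern "C" {
typedef union ompt_data_t { std::uint64_t value; void *ptr; } ompt_data_t;
typedef void (*ompt_interface_fn_t)(void);
typedef ompt_interface_fn_t (*ompt_function_lookup_t)(const char *name);
typedef int (*ompt_initialize_t)(ompt_function_lookup_t lookup,
                                 int initial_device_num,
                                 ompt_data_t *tool_data);
typedef void (*ompt_finalize_t)(ompt_data_t *tool_data);
typedef struct ompt_start_tool_result_t {
    ompt_initialize_t initialize;
    ompt_finalize_t finalize;
    ompt_data_t tool_data;
} ompt_start_tool_result_t;
}
#endif

extern "C" {

// Hypothetical initializer: a real tool would use `lookup` to obtain
// ompt_set_callback and register its callbacks here.
static int sketch_initialize(ompt_function_lookup_t lookup,
                             int initial_device_num,
                             ompt_data_t *tool_data) {
    (void)lookup;
    (void)tool_data;
    std::printf("[tool_initialize] initial_device_num = %d\n", initial_device_num);
    return 1;  // a non-zero return value keeps the tool active
}

// Hypothetical finalizer, called once when the runtime shuts the tool down.
static void sketch_finalize(ompt_data_t *tool_data) {
    (void)tool_data;
    std::printf("[tool_finalize]\n");
}

// Entry point the OpenMP runtime looks for. Returning a null pointer
// instead would tell the runtime that the tool does not want to be used.
ompt_start_tool_result_t *
ompt_start_tool(unsigned int omp_version, const char *runtime_version) {
    static ompt_start_tool_result_t result = { &sketch_initialize,
                                               &sketch_finalize,
                                               {} };
    std::printf("[ompt_start_tool] omp_version = %u | runtime_version = %s\n",
                omp_version, runtime_version);
    return &result;
}

}  // extern "C"
```

Built as a shared library, such a file would be picked up by the runtime in the same way as the real tool.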
Here, we decide between four different modes of operation, which can
be controlled via the environment variable OMPT_PRINTF_MODE:
| Mode | Description                             |
|------|-----------------------------------------|
| 0    | Disable the tool entirely               |
| 1    | Enable tool, but print no information   |
| 2    | Print all events, but without arguments |
| 3    | Print all events with arguments         |
These modes are implemented via C++ templates. This should keep the overhead of the tool as low as possible when choosing between the different modes. By default, the tool is built with mode 2.
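To illustrate the idea, here is a minimal sketch of how a mode value could select between template instantiations, so that the per-event code contains no runtime mode checks. It is not taken from the ompt-printf sources: `Mode`, `parse_mode`, `format_event` and `select_formatter` are hypothetical names, and falling back to mode 2 for missing or invalid input is an assumption based on the stated default.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical enumeration of the four documented modes.
enum class Mode { disabled = 0, silent = 1, events = 2, events_with_args = 3 };

// Parse a mode string such as the value of OMPT_PRINTF_MODE.
// Missing or invalid input falls back to mode 2 (assumed default).
Mode parse_mode(const char *text, Mode fallback = Mode::events) {
    if (text == nullptr) {
        return fallback;
    }
    char *end = nullptr;
    const long value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0' || value < 0 || value > 3) {
        return fallback;
    }
    return static_cast<Mode>(value);
}

// One instantiation per mode: `if constexpr` discards the unused
// branches at compile time.
template <Mode M>
std::string format_event(const char *name, int requested_parallelism) {
    if constexpr (M == Mode::events_with_args) {
        return std::string("[") + name + "] requested_parallelism = " +
               std::to_string(requested_parallelism);
    } else if constexpr (M == Mode::events) {
        return std::string("[") + name + "]";
    } else {
        return {};  // silent: the callback runs, but nothing is printed
    }
}

using Formatter = std::string (*)(const char *, int);

// The mode is inspected once; afterwards only the chosen instantiation runs.
Formatter select_formatter(Mode mode) {
    switch (mode) {
        case Mode::events_with_args: return &format_event<Mode::events_with_args>;
        case Mode::events:           return &format_event<Mode::events>;
        case Mode::silent:           return &format_event<Mode::silent>;
        case Mode::disabled:         break;
    }
    return nullptr;  // disabled: nothing to register
}
```

A tool following this pattern would call something like `select_formatter(parse_mode(std::getenv("OMPT_PRINTF_MODE")))` once during start-up and register only the selected instantiation.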
Requirements
These are the requirements to build the tool with the provided build system. The library can also be built manually, but that is not covered here.
- CMake 3.10 or newer
- A C++17 compliant compiler
- An OpenMP runtime supporting the OMPT interface (e.g. LLVM/Clang, oneAPI, NVHPC, ...)
  - CMake only checks for the presence of the omp-tools.h header file. Actual runtime support is not checked.
Build the library
The library can easily be built with CMake. On a checked-out or downloaded copy, simply run the following commands:
```bash
$ cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCOMPILER_TOOLCHAIN=[your-vendor]
$ cmake --build build
```
The COMPILER_TOOLCHAIN variable is used in favor of setting the compiler
manually via CMAKE_C_COMPILER and CMAKE_CXX_COMPILER, to ensure that
additional flags some compilers (like NVHPC) require are set correctly.
find_package( OpenMP ) is not sufficient in this case.
The following vendors are supported and set these flags:
| Vendor | C Compiler | C++ Compiler | CFLAGS | CXXFLAGS | LDFLAGS |
|----------|------------|--------------|-------------|-------------|-------------|
| GNU | gcc | g++ | -fopenmp | -fopenmp | -fopenmp |
| Clang | clang | clang++ | -fopenmp | -fopenmp | -fopenmp |
| NVHPC | nvc | nvc++ | -mp=ompt | -mp=ompt | -mp=ompt |
| AMDClang | amdclang | amdclang++ | -fopenmp | -fopenmp | -fopenmp |
| AOCC | clang | clang++ | -fopenmp | -fopenmp | -fopenmp |
| Intel | icc | icpc | -fopenmp | -fopenmp | -fopenmp |
| oneAPI | icx | icpx | -fiopenmp | -fiopenmp | -fiopenmp |
| Cray | cc | CC | -fopenmp | -fopenmp | -fopenmp |
After building the library, the tool can be used with the environment variable
OMP_TOOL_LIBRARIES. Please note that NVHPC requires adding -mp=ompt when building
a program since this affects code generation and the linked OpenMP library.
Controlling the tool
ompt-printf implements the ompt_callback_control_tool callback using the default commands
specified in the OpenMP standard. An application can use omp_control_tool to control the
device tracing interface with the following options:
| Command | Modifier | Effect |
|--------------------------|-------------------------------------|---------------------------------------------------|
| omp_control_tool_start | (ignored) | Starts device tracing on all initialized devices. |
| omp_control_tool_flush | (ignored) | Flushes all device tracing buffers. |
| omp_control_tool_pause | true to pause, false to unpause | Pauses device tracing on all initialized devices. |
| omp_control_tool_end | (ignored) | Stops device tracing on all initialized devices. |
These options will only affect devices which have been fully initialized and support the device tracing interface. The remainder of the tool will not be affected by these commands.
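The command table can be read as a small state machine. The sketch below models it on the tool side under stated assumptions: `DeviceTracingState` and `handle_control_command` are hypothetical names, a single state stands in for all initialized devices, and the numeric constants mirror the standard's `omp_control_tool_t` values (start = 1, pause = 2, flush = 3, end = 4) so that the example compiles without OpenMP headers. It is not the ompt-printf implementation.

```cpp
#include <cstdint>

// Stand-ins for the omp_control_tool_t constants from <omp.h>.
enum : std::uint64_t {
    control_start = 1,
    control_pause = 2,
    control_flush = 3,
    control_end   = 4
};

// Hypothetical bookkeeping for the device tracing interface.
struct DeviceTracingState {
    bool running = false;
    bool paused = false;
    unsigned flush_count = 0;
};

// Returns 0 (success) for handled commands and 1 (ignored) otherwise,
// following the omp_control_tool_result_t values.
int handle_control_command(DeviceTracingState &state,
                           std::uint64_t command,
                           std::uint64_t modifier) {
    switch (command) {
        case control_start:
            state.running = true;             // modifier is ignored
            return 0;
        case control_pause:
            state.paused = (modifier != 0);   // true pauses, false unpauses
            return 0;
        case control_flush:
            ++state.flush_count;              // modifier is ignored
            return 0;
        case control_end:
            state.running = false;            // modifier is ignored
            return 0;
        default:
            return 1;                         // unknown command
    }
}
```

On the application side, the corresponding call would be along the lines of `omp_control_tool(omp_control_tool_pause, 1, NULL)` to pause and the same call with modifier `0` to resume.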
Example
Here is an example of building and running a program with the tool enabled (Clang 17.0.6, Arch Linux):
```bash
$ cat my-openmp-example.c
int main( void )
{
    #pragma omp parallel
    {}
    return 0;
}
$ export OMP_TOOL_LIBRARIES=$(pwd)/build/src/libompt-printf.so
$ clang -fopenmp my-openmp-example.c -o my-openmp-example
$ OMP_NUM_THREADS=2 ./my-openmp-example
[-1][ompt_start_tool] omp_version = 201611 | runtime_version = LLVM OMP version: 5.0.20140926
[-1][tool_initialize] lookup = 0x77bdfa723820 | initial_device_num = 0 | tool_data = 0x77bdfa666428
[-1][tool_initialize] thread_begin = always
[-1][tool_initialize] thread_end = always
[-1][tool_initialize] parallel_begin = always
[-1][tool_initialize] parallel_end = always
[-1][tool_initialize] task_create = always
[-1][tool_initialize] task_schedule = always
[-1][tool_initialize] implicit_task = always
[-1][tool_initialize] sync_region_wait = always
[-1][tool_initialize] mutex_released = always
[-1][tool_initialize] dependences = always
[-1][tool_initialize] task_dependence = always
[-1][tool_initialize] work = always
[-1][tool_initialize] masked = always
[-1][tool_initialize] sync_region = always
[-1][tool_initialize] lock_init = always
[-1][tool_initialize] lock_destroy = always
[-1][tool_initialize] mutex_acquire = always
[-1][tool_initialize] mutex_acquired = always
[-1][tool_initialize] nest_lock = always
[-1][tool_initialize] flush = always
[-1][tool_initialize] cancel = always
[-1][tool_initialize] reduction = always
[-1][tool_initialize] dispatch = always
[-1][tool_initialize] device_initialize = always
[-1][tool_initialize] device_finalize = always
[-1][tool_initialize] device_load = always
[-1][tool_initialize] device_unload = never
[-1][tool_initialize] target_emi = always
[-1][tool_initialize] target_map_emi = never
[-1][tool_initialize] target_data_op_emi = always
[-1][tool_initialize] target_submit_emi = always
[0][callback_thread_begin] thread_type = initial | thread_data = 0x624479fb3808
[0][callback_implicit_task] endpoint = begin | parallel_data->value = 0 (0x624479faef20) | task_data->value = 555000001 (0x624479faf840) | actual_parallelism = 1 | index = 1 | flags = initial
[0][callback_parallel_begin] encountering_task_data->value = 555000001 (0x624479faf840) | encountering_task_frame = 0x624479faf828 | parallel_data->value = 666000001 (0x7ffe86f3cb00) | requested_parallelism = 2 | flags = runtime_team | codeptr_ra = 0x624479026166
[0][callback_implicit_task] endpoint = begin | parallel_data->value = 666000001 (0x624479fafb20) | task_data->value = 555000002 (0x624479fb0f00) | actual_parallelism = 2 | index = 0 | flags = implicit
[0][callback_sync_region] kind = barrier_implicit | endpoint = begin | parallel_data->value = 666000001 (0x624479fafb20) | task_data->value = 555000002 (0x624479fb0f00) | codeptr_ra = 0x624479026166
[0][callback_sync_region_wait] kind = barrier_implicit | endpoint = begin | parallel_data->value = 666000001 (0x624479fafb20) | task_data->value = 555000002 (0x624479fb0f00) | codeptr_ra = 0x624479026166
[1][callback_thread_begin] thread_type = worker | thread_data = 0x624479fbba88
[1][callback_implicit_task] endpoint = begin | parallel_data->value = 666000001 (0x624479fafb20) | task_data->value = 555000003 (0x624479fb1040) | actual_parallelism = 2 | index = 1 | flags = implicit
[1][callback_sync_region] kind = barrier_implicit | endpoint = begin | parallel_data->value = 666000001 (0x624479fafb20) | task_data->value = 555000003 (0x624479fb1040) | codeptr_ra = (nil)
[1][callback_sync_region_wait] kind = barrier_implicit | endpoint = begin | parallel_data->value = 666000001 (0x624479fafb20) | task_data->value = 555000003 (0x624479fb1040) | codeptr_ra = (nil)
[0][callback_sync_region_wait] kind = barrier_implicit | endpoint = end | parallel_data->value = 666666666 ((nil)) | task_data->value = 555000002 (0x624479fb0f00) | codeptr_ra = 0x624479026166
[0][callback_sync_region] kind = barrier_implicit | endpoint = end | parallel_data->value = 666666666 ((nil)) | task_data->value = 555000002 (0x624479fb0f00) | codeptr_ra = 0x624479026166
[0][callback_implicit_task] endpoint = end | parallel_data->value = 666666666 ((nil)) | task_data->value = 555000002 (0x624479fb0f00) | actual_parallelism = 2 | index = 0 | flags = implicit
[0][callback_parallel_end] parallel_data->value = 666000001 (0x624479fafb20) | encountering_task_data->value = 555000001 (0x624479faf840) | flags = runtime_team | codeptr_ra = 0x624479026166
[0][callback_implicit_task] endpoint = end | parallel_data->value = 0 (0x624479faef20) | task_data->value = 555000001 (0x624479faf840) | actual_parallelism = 0 | index = 1 | flags = initial
[0][callback_thread_end] thread_data = 0x624479fb3808
[1][callback_sync_region_wait] kind = barrier_implicit | endpoint = end | parallel_data->value = 666666666 ((nil)) | task_data->value = 555000003 (0x624479fbba90) | codeptr_ra = (nil)
[1][callback_sync_region] kind = barrier_implicit | endpoint = end | parallel_data->value = 666666666 ((nil)) | task_data->value = 555000003 (0x624479fbba90) | codeptr_ra = (nil)
[1][callback_implicit_task] endpoint = end | parallel_data->value = 666666666 ((nil)) | task_data->value = 555000003 (0x624479fbba90) | actual_parallelism = 0 | index = 1 | flags = implicit
[1][callback_thread_end] thread_data = 0x624479fbba88
[0][tool_finalize] tool_data = 0x77bdfa666428
```
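Each output line has the shape `[thread][event] key = value | key = value ...`, which lends itself to post-mortem processing. The following sketch splits one such line into its parts. It is only an illustration based on the output shown here: `LogLine` and `parse_line` are hypothetical names and not part of ompt-printf, and the format is assumed from this example rather than from a specification.

```cpp
#include <cstddef>
#include <exception>
#include <map>
#include <optional>
#include <string>

// Hypothetical record for one line of tool output.
struct LogLine {
    int thread = 0;                             // -1 before any thread began
    std::string event;                          // e.g. "callback_parallel_begin"
    std::map<std::string, std::string> fields;  // "key" -> "value"
};

// Remove leading and trailing blanks.
static std::string trim(const std::string &s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) {
        return {};
    }
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Parse "[<thread>][<event>] key = value | key = value ...".
// Returns std::nullopt if the two bracketed prefixes are missing.
std::optional<LogLine> parse_line(const std::string &line) {
    if (line.empty() || line[0] != '[') {
        return std::nullopt;
    }
    const auto close1 = line.find(']');
    if (close1 == std::string::npos || close1 + 1 >= line.size() ||
        line[close1 + 1] != '[') {
        return std::nullopt;
    }
    const auto close2 = line.find(']', close1 + 2);
    if (close2 == std::string::npos) {
        return std::nullopt;
    }

    LogLine out;
    try {
        out.thread = std::stoi(line.substr(1, close1 - 1));
    } catch (const std::exception &) {
        return std::nullopt;  // thread id was not a number
    }
    out.event = line.substr(close1 + 2, close2 - close1 - 2);

    // Split the remainder on " | " and each part on the first " = ".
    std::size_t pos = close2 + 1;
    while (pos < line.size()) {
        const auto bar = line.find(" | ", pos);
        const std::string part =
            line.substr(pos, bar == std::string::npos ? bar : bar - pos);
        const auto eq = part.find(" = ");
        if (eq != std::string::npos) {
            out.fields[trim(part.substr(0, eq))] = trim(part.substr(eq + 3));
        }
        if (bar == std::string::npos) {
            break;
        }
        pos = bar + 3;
    }
    return out;
}
```

With such a helper, the events of a single thread or a single parallel region can be filtered out of a longer log, for example by comparing `thread` or the `parallel_data->value` field.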
Owner
- Name: Jülich Supercomputing Centre
- Login: FZJ-JSC
- Kind: organization
- Location: Germany
- Website: https://www.fz-juelich.de/en/ias/jsc
- Twitter: fzj_jsc
- Repositories: 29
- Profile: https://github.com/FZJ-JSC
Jülich Supercomputing Centre provides HPC resources and expertise. Part of Forschungszentrum Jülich.
Citation (CITATION.cff)
# What is a CITATION.cff file?
# See https://citation-file-format.github.io/
cff-version: 1.2.0
title: ompt-printf
abstract: >-
  In version 5.0 of the OpenMP specification, the OpenMP Tools Interface
  (OMPT) was introduced, providing means to collect precise information
  about the application's use of OpenMP directives and lock routines.
  Although provided with a detailed specification, understanding and
  correctly handling the CPU execution model event sequence dispatched
  from various vendor's runtimes requires detailed analysis of events,
  their parameters and executing threads. To facilitate this analysis,
  we developed ompt-printf, an OMPT tool that allows for dumping
  execution model events and corresponding metadata for post-mortem
  inspection.
message: >-
  If you use this software, please cite it using the metadata from
  this file.
type: software
repository-code: 'https://github.com/FZJ-JSC/ompt-printf'
keywords:
  - OpenMP Tools Interface
  - Performance measurement
  - OpenMP
license: BSD-3-Clause
contact:
  - email: support@score-p.org
# Authors of ompt-printf, in chronological order:
authors:
  - family-names: Reuter
    given-names: Jan André
    email: j.reuter@fz-juelich.de
    orcid: https://orcid.org/0000-0002-1219-0310
  - family-names: Feld
    given-names: Christian
    email: c.feld@fz-juelich.de
    orcid: https://orcid.org/0000-0001-7685-3497
GitHub Events
Total
- Watch event: 1
- Delete event: 2
- Push event: 3
- Pull request event: 12
- Create event: 1
Last Year
- Watch event: 1
- Delete event: 2
- Push event: 3
- Pull request event: 12
- Create event: 1
Dependencies
- ./.github/actions/build-and-test-ompt-printf * composite
- actions/checkout v4 composite