lo2s
Linux OTF2 Sampling - A Lightweight Node-Level Performance Monitoring Tool
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 12 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.6%) to scientific vocabulary
Keywords
Repository
Basic Info
- Host: GitHub
- Owner: tud-zih-energy
- License: gpl-3.0
- Language: C++
- Default Branch: master
- Homepage: https://tu-dresden.de/zih/forschung/projekte/lo2s?set_language=en
- Size: 1.8 MB
Statistics
- Stars: 50
- Watchers: 6
- Forks: 13
- Open Issues: 21
- Releases: 15
Topics
Metadata Files
README.md
lo2s is a lightweight node-level performance monitoring tool used to analyze applications, the operating system and hardware.
Lightweight Node-Level Performance Monitoring
lo2s creates parallel OTF2 traces with a focus on both application and system view. The traces can contain any of the following information:
- From running threads
  - Calling context samples based on instruction overflows
  - The calling context samples are annotated with the disassembled assembler instruction string
  - The framepointer-based call-path for each calling context sample
  - Per-thread performance counter readings
  - Which thread was scheduled on which CPU at what time
  - Information about executed OpenMP constructs
  - Accelerator activity events from NVidia and AMD GPUs as well as NEC SX-Aurora Vector Engines
  - Application level I/O activity
- From the system
  - Metrics from tracepoints (e.g. the selected C-state or P-state)
  - The node-level system tree (cpus (HW-threads), cores, packages)
  - CPU power measurements (x86_energy)
  - Microarchitecture specific metrics (x86_adapt, per package or per core)
  - Hardware sensors using lm_sensors
  - Arbitrary metrics through plugins (Score-P compatible)
  - Syscall activity
  - Block layer I/O activity
In general lo2s operates either in process monitoring or system monitoring mode.
With process monitoring, all information is grouped by each thread of a monitored process group - it shows you on which CPU each monitored thread is running.
lo2s either acts as a prefix command to run the process (and also tracks its children), or lo2s attaches to a running process.
In the system monitoring mode, information is grouped by logical CPU - it shows you which thread was running on a given CPU. Metrics are also shown per CPU.
In both modes, system-level metrics (e.g. tracepoints) are always grouped by their respective system hardware component.
Build Requirements
- Linux (>= 4.3) [1]
- OTF2 (>= 3.1)
- elfutils (specifically libelf and libdw)
- CMake (>= 3.11)
- A C++ compiler with C++17 support and the std::filesystem library (GCC > 7, Clang > 5)
[1] Older kernels can work, as the required features are often backported. Otherwise, use lo2s 1.7.0, the newest lo2s version that supports kernels as old as 2.6.32.
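As a rough pre-build check, the kernel requirement above can be tested from a script. The following is a minimal sketch, not part of lo2s: the helper names parse_kernel_release and meets_minimum are hypothetical, and the check only compares version numbers, so it cannot detect backported features on older distribution kernels.

```python
import platform
import re

# Minimum kernel version taken from the build requirements above.
MIN_KERNEL = (4, 3)


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release: str, minimum: tuple = MIN_KERNEL) -> bool:
    """Return True if the release is at least the required major.minor version."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    if platform.system() == "Linux":
        release = platform.release()
        verdict = "ok" if meets_minimum(release) else "older than 4.3, check for backports"
        print(f"kernel {release}: {verdict}")
    else:
        print("not running on Linux")
```

A negative result is only a hint: a distribution kernel with backported perf features may still work.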
Optional Build Dependencies
- x86_adapt for microarchitecture-specific metrics
- x86_energy for CPU power metrics
- radare2 (>= 5.8.0) for disassembled instruction strings
- lm-sensors for sensor readings
- libaudit to resolve syscall names; otherwise only syscall numbers can be used in syscall tracing
- pod2man to generate the man pages (typically distributed as part of perl)
- gzip to compress the man pages
- libbpf and bpftool to enable POSIX I/O recording
- libpfm to support the event name resolution through it
- CUDA to record NVidia GPU Activity
- rocprofiler-sdk to record AMD GPU activity
- libveosinfo to record NEC SX-Aurora activity
- libdebuginfod to download DWARF debug information for recorded applications on-the-fly
- OpenMP to record OpenMP activity
Runtime Requirements
- kernel.perf_event_paranoid should be less than or equal to 1 for process monitoring mode and less than or equal to 0 in system monitoring mode. A value of -1 will give the most features for non-root performance recording, such as tracepoints and block I/O, at the cost of some security. Modify as follows:
  sudo sysctl kernel.perf_event_paranoid=1
- Tracepoints, block I/O and syscalls require access to debugfs. Grant permissions at your own discretion.
  sudo mount -t debugfs none /sys/kernel/debug
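The thresholds above can be checked before starting a recording. The sketch below reads the current setting from /proc/sys/kernel/perf_event_paranoid and maps it to the modes described in this section. The function names allowed_modes and read_paranoid are hypothetical helpers for illustration and are not provided by lo2s.

```python
from pathlib import Path

# Thresholds taken from the runtime requirements above.
PROCESS_MODE_MAX = 1   # process monitoring needs a value <= 1
SYSTEM_MODE_MAX = 0    # system monitoring needs a value <= 0
FULL_FEATURES = -1     # -1 gives the most features to non-root users


def allowed_modes(paranoid: int) -> dict:
    """Map a perf_event_paranoid value to what a non-root user can expect."""
    return {
        "process_monitoring": paranoid <= PROCESS_MODE_MAX,
        "system_monitoring": paranoid <= SYSTEM_MODE_MAX,
        "full_non_root_features": paranoid <= FULL_FEATURES,
    }


def read_paranoid(path: str = "/proc/sys/kernel/perf_event_paranoid"):
    """Return the current setting, or None if the file does not exist."""
    sysctl_file = Path(path)
    if not sysctl_file.exists():
        return None
    return int(sysctl_file.read_text().strip())


if __name__ == "__main__":
    value = read_paranoid()
    if value is None:
        print("perf_event_paranoid is not available on this system")
    else:
        print(f"kernel.perf_event_paranoid = {value}")
        for mode, ok in allowed_modes(value).items():
            print(f"  {mode}: {'yes' if ok else 'no'}")
```

The result only reflects the sysctl value. Running as root or missing debugfs access can change what actually works.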
Installation
- It is recommended to create an empty build directory anywhere.
- cmake /path/to/lo2s
- Configure cmake as usual, e.g. with ccmake .
- make
- make install
Usage
To monitor a given application in process monitoring mode, execute
lo2s -- ./a.out --app-args
To monitor all activity on a system, run
lo2s -a
(stop the recording with Ctrl+C).
For full documentation of the options, see the man page.
Usage with MPI
You can record simple traces from MPI programs, but lo2s does not record MPI communication.
To create fully-featured MPI-aware traces, use Score-P.
- lo2s mpirun ./a.out: Creates one trace of mpirun, useful if mpirun is used locally on one node.
- mpirun lo2s ./a.out: Creates a separate trace for each process.
See man lo2s or lo2s --help for a full listing of options and usage.
Quirks
The perf_event_open kernel infrastructure changed significantly over time.
As a result, it is hard even to keep track of which kernel version introduced which feature.
Combine that with the abundance of backports of particular features by different distributors, and you end up with a mess of options.
In an effort to stay compatible with older kernels and with architectures that lack hardware breakpoint support, several quirks have been added to lo2s:
- The initial time synchronization between lo2s and the kernel-space perf is done with a hardware breakpoint. If your kernel or processor architecture doesn't support that, you can use a fallback by enabling the CMake option USE_HW_BREAKPOINT_COMPAT.
- If you get the error message "event 'ref-cycles' is not available as a metric leader!", you can fall back to the bus-cycles metric as leader using the lo2s command-line argument --metric-leader bus-cycles.
Working with traces
Traces can be visualized with Vampir. You can also process them with OTF2 or any of its tools. Native interfaces are available for C and Python.
Acknowledgements
This work is supported in part by the German Research Foundation (DFG) within the CRC 912 - HAEC and by the German National High Performance Computing (NHR@TUD).
Primary Reference
A description and use cases can be found in the following paper. Please cite this if you use lo2s for scientific work.
Thomas Ilsche, Robert Schöne, Mario Bielert, Andreas Gocht and Daniel Hackenberg. lo2s – Multi-Core System and Application Performance Analysis for Linux 📕 In: Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications (HPCMASPA). 2017. DOI: 10.1109/CLUSTER.2017.116
Additional References
Thomas Ilsche, Marcus Hähnel, Robert Schöne, Mario Bielert and Daniel Hackenberg: Powernightmares: The Challenge of Efficiently Using Sleep States on Multi-Core Systems 📕 In: 5th Workshop on Runtime and Operating Systems for the Many-core Era (ROME). 2017, DOI: 10.1007/978-3-319-75178-8_50
Thomas Ilsche, Robert Schöne, Philipp Joram, Mario Bielert and Andreas Gocht: System Monitoring with lo2s: Power and Runtime Impact of C-State Transitions 📕 In: 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), DOI: 10.1109/IPDPSW.2018.00114
Thomas Ilsche, Mario Bielert, Christian von Elm: Bridging the Gap between Application Performance Analysis and System Monitoring 📕 In: 2022 IEEE International Conference on Cluster Computing (CLUSTER), DOI: 10.1109/CLUSTER51413.2022.00080
Name
The name lo2s is an acronym for Linux OTF2 Sampling.
Owner
- Name: tud-zih-energy
- Login: tud-zih-energy
- Kind: organization
- Repositories: 25
- Profile: https://github.com/tud-zih-energy
Citation (CITATION.cff)
cff-version: 1.2.0
message: "Please cite this paper if you use lo2s for scientific work"
authors:
  - family-names: "Bielert"
    given-names: "Mario"
  - family-names: "von Elm"
    given-names: "Christian"
  - family-names: "Ilsche"
    given-names: "Thomas"
  - family-names: "Schöne"
    given-names: "Robert"
title: "lo2s - Lightweight Node-Level Performance Monitoring"
version: 1.7.0
date-released: 2023-01-18
url: "https://github.com/tud-zih-energy/lo2s"
preferred-citation:
  type: article
  title: "lo2s — Multi-core System and Application Performance Analysis for Linux"
  authors:
    - family-names: "Ilsche"
      given-names: "Thomas"
    - family-names: "Schöne"
      given-names: "Robert"
    - family-names: "Bielert"
      given-names: "Mario"
    - family-names: "Gocht"
      given-names: "Andreas"
    - family-names: "Hackenberg"
      given-names: "Daniel"
  doi: "10.1109/CLUSTER.2017.116"
  journal: "Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications (HPCMASPA). 2017"
  month: 9
  year: 2017
GitHub Events
Total
- Issues event: 28
- Watch event: 3
- Delete event: 8
- Issue comment event: 18
- Push event: 137
- Pull request review event: 24
- Pull request review comment event: 27
- Pull request event: 42
- Create event: 20
Last Year
- Issues event: 28
- Watch event: 3
- Delete event: 8
- Issue comment event: 18
- Push event: 137
- Pull request review event: 24
- Pull request review comment event: 27
- Pull request event: 42
- Create event: 20
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 20
- Total pull requests: 19
- Average time to close issues: over 1 year
- Average time to close pull requests: about 1 month
- Total issue authors: 3
- Total pull request authors: 1
- Average comments per issue: 1.4
- Average comments per pull request: 0.16
- Merged pull requests: 16
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 9
- Pull requests: 18
- Average time to close issues: 17 days
- Average time to close pull requests: 25 days
- Issue authors: 3
- Pull request authors: 1
- Average comments per issue: 0.44
- Average comments per pull request: 0.17
- Merged pull requests: 15
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- cvonelm (23)
- tilsche (7)
- bmario (3)
- maximilian-tech (2)
- rschoene (1)
- Flamefire (1)
Pull Request Authors
- cvonelm (41)
- teto519f (4)
- tilsche (1)
- bmario (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/upload-artifact v2 composite