hardware_sampling

The Hardware Sampling (hws) library can be used to track hardware performance like clock frequency, memory usage, temperatures, or power draw.

https://github.com/sc-sgs/hardware_sampling

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.3%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: SC-SGS
  • License: mit
  • Language: C++
  • Default Branch: main
  • Homepage:
  • Size: 1.87 MB
Statistics
  • Stars: 18
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 7
Created over 1 year ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.md

hws - Hardware Sampling for CPUs and GPUs

The Hardware Sampling (hws) library can be used to track hardware performance like clock frequency, memory usage, temperatures, or power draw. It currently supports CPUs as well as GPUs from NVIDIA, AMD, and Intel.

Getting Started

Dependencies

General dependencies:

  • a C++17 capable compiler
  • {fmt} > 11.0.2 for string formatting (automatically built during the CMake configuration if it can't be found using the respective find_package call)
  • Pybind11 > v2.13.1 if Python bindings are enabled (automatically built during the CMake configuration if it can't be found using the respective find_package call)

Dependencies based on the hardware to sample:

  • if a CPU should be targeted: at least one of turbostat (may require root privileges), lscpu, or free, and the subprocess.h library (automatically built during the CMake configuration if it can't be found using the respective find_package call)
  • if an NVIDIA GPU should be targeted: NVIDIA's Management Library NVML
  • if an AMD GPU should be targeted: AMD's ROCm SMI library rocm_smi_lib
  • if an Intel GPU should be targeted: Intel's Level Zero library

Building hws

To download the hardware sampling library, use:

```bash
git clone git@github.com:SC-SGS/hardware_sampling.git
cd hardware_sampling
```

Building the library can be done using the normal CMake approach:

```bash
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release [optional_options] ..
cmake --build . -j
```

Optional CMake Options

The [optional_options] can be one or multiple of:

  • HWS_ENABLE_CPU_SAMPLING=ON|OFF|AUTO (default: AUTO):

    • ON: check whether CPU information can be sampled and fail if this is not the case
    • AUTO: check whether CPU information can be sampled but do not fail if this is not the case
    • OFF: do not check whether CPU information can be sampled
  • HWS_ENABLE_GPU_NVIDIA_SAMPLING=ON|OFF|AUTO (default: AUTO):

    • ON: check whether NVIDIA GPU information can be sampled and fail if this is not the case
    • AUTO: check whether NVIDIA GPU information can be sampled but do not fail if this is not the case
    • OFF: do not check whether NVIDIA GPU information can be sampled
  • HWS_ENABLE_GPU_AMD_SAMPLING=ON|OFF|AUTO (default: AUTO):

    • ON: check whether AMD GPU information can be sampled and fail if this is not the case
    • AUTO: check whether AMD GPU information can be sampled but do not fail if this is not the case
    • OFF: do not check whether AMD GPU information can be sampled
  • HWS_ENABLE_GPU_INTEL_SAMPLING=ON|OFF|AUTO (default: AUTO):

    • ON: check whether Intel GPU information can be sampled and fail if this is not the case
    • AUTO: check whether Intel GPU information can be sampled but do not fail if this is not the case
    • OFF: do not check whether Intel GPU information can be sampled
  • HWS_ENABLE_ERROR_CHECKS=ON|OFF (default: OFF): enable sanity checks during hardware sampling, may be problematic with smaller sample intervals

  • HWS_SAMPLING_INTERVAL=100ms (default: 100ms): set the sampling interval in milliseconds

  • HWS_ENABLE_PYTHON_BINDINGS=ON|OFF (default: ON): enable Python bindings
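
The ON/AUTO/OFF semantics of the `HWS_ENABLE_*_SAMPLING` options can be summarized as a small decision function. The following is only an illustrative Python sketch of that logic, not the actual CMake code; the names `Mode` and `resolve_backend` are hypothetical, and it assumes that a skipped check (OFF) means the backend is left out of the build.

```python
from enum import Enum


class Mode(Enum):
    ON = "ON"
    OFF = "OFF"
    AUTO = "AUTO"


def resolve_backend(mode: Mode, can_sample: bool) -> bool:
    """Return True if a sampling backend ends up enabled.

    `can_sample` stands for the outcome of the availability check
    (e.g., whether the vendor library was found).
    """
    if mode is Mode.OFF:
        # OFF: the check is skipped entirely (assumed: backend disabled)
        return False
    if can_sample:
        # ON and AUTO behave identically when the check succeeds
        return True
    if mode is Mode.ON:
        # ON: a failed check aborts the configuration
        raise RuntimeError("sampling was requested but is not available")
    # AUTO: a failed check silently disables the backend
    return False
```

With this model, `AUTO` never fails the configuration step, while `ON` turns a missing dependency into a hard error.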

Installing via CMake

The library supports the install target:

```bash
cmake --install . --prefix "/home/myuser/installdir"
```

Afterward, the necessary exports should be performed:

```bash
export CMAKE_PREFIX_PATH=${CMAKE_INSTALL_PREFIX}/share/hws/cmake:${CMAKE_PREFIX_PATH}
export LD_LIBRARY_PATH=${CMAKE_INSTALL_PREFIX}/lib:${CMAKE_INSTALL_PREFIX}/lib64:${LD_LIBRARY_PATH}
export CPLUS_INCLUDE_PATH=${CMAKE_INSTALL_PREFIX}/include:${CPLUS_INCLUDE_PATH}
export PYTHONPATH=${CMAKE_INSTALL_PREFIX}/lib:${CMAKE_INSTALL_PREFIX}/lib64:${PYTHONPATH}
```

Note: when using Intel GPUs, the CMAKE_MODULE_PATH should be updated to point to our cmake directory containing the Findlevel_zero.cmake file and export ZES_ENABLE_SYSMAN=1 should be set.

Installing via pip

The library is also available via pip:

```bash
pip install hardware-sampling
```

This pip install behaves as if no additional CMake options were provided. This means that only hardware for which the respective vendor libraries were available at the time of the pip install hardware-sampling invocation is supported.

Available samples

The sampling type fixed denotes samples that are gathered only once per hardware sampling run, such as the maximum clock frequency, the maximum temperature, or the total available memory. The sampling type sampled denotes samples that are gathered repeatedly during the whole hardware sampling process, such as the current clock frequency, temperature, or memory consumption.
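
The difference between the two sampling types can be pictured as a container with one scalar per fixed sample and one growing list per sampled sample. The sketch below is a hypothetical model, not the library's internal data structure; `ClockSamples`, `run_sampling`, and the reader callbacks are made-up names, and a maximum and a current clock frequency serve as the example quantities.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ClockSamples:
    # "fixed": queried exactly once when sampling begins
    clock_frequency_max: float
    # "sampled": one entry is appended per sampling step
    clock_frequency: List[float] = field(default_factory=list)


def run_sampling(read_max: Callable[[], float],
                 read_current: Callable[[], float],
                 steps: int) -> ClockSamples:
    """Gather the fixed value once and the sampled value `steps` times."""
    samples = ClockSamples(clock_frequency_max=read_max())
    for _ in range(steps):
        samples.clock_frequency.append(read_current())
    return samples


# usage with fake readers standing in for real hardware queries
readings = iter([1200.0, 3400.0, 2800.0])
result = run_sampling(lambda: 4000.0, lambda: next(readings), steps=3)
```

After the run, `result.clock_frequency_max` holds a single value, while `result.clock_frequency` holds one value per sampling step.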

General samples

| sample              | sample type | CPUs        | NVIDIA GPUs | AMD GPUs  | Intel GPUs    |
|:--------------------|:-----------:|:-----------:|:-----------:|:---------:|:-------------:|
| architecture        | fixed       | str         | str         | str       | -             |
| byte_order          | fixed       | str         | str (fix)   | str (fix) | str (fix)     |
| num_cores           | fixed       | int         | int         | -         | -             |
| num_threads         | fixed       | int         | -           | -         | -             |
| threads_per_core    | fixed       | int         | -           | -         | -             |
| cores_per_socket    | fixed       | int         | -           | -         | -             |
| num_sockets         | fixed       | int         | -           | -         | -             |
| numa_nodes          | fixed       | int         | -           | -         | -             |
| vendor_id           | fixed       | str         | str (fix)   | str       | str (PCIe ID) |
| name                | fixed       | str         | str         | str       | str           |
| flags               | fixed       | list of str | -           | -         | list of str   |
| persistence_mode    | fixed       | -           | bool        | -         | -             |
| standby_mode        | fixed       | -           | -           | -         | str           |
| num_threads_per_eu  | fixed       | -           | -           | -         | int           |
| eu_simd_width       | fixed       | -           | -           | -         | int           |
| compute_utilization | sampled     | %           | %           | %         | -             |
| memory_utilization  | sampled     | -           | %           | %         | -             |
| ipc                 | sampled     | float       | -           | -         | -             |
| irq                 | sampled     | int         | -           | -         | -             |
| smi                 | sampled     | int         | -           | -         | -             |
| poll                | sampled     | int         | -           | -         | -             |
| poll_percent        | sampled     | %           | -           | -         | -             |
| performance_level   | sampled     | -           | int         | str       | -             |

clock-related samples

| sample                             | sample type | CPUs | NVIDIA GPUs | AMD GPUs    | Intel GPUs  |
|:-----------------------------------|:-----------:|:----:|:-----------:|:-----------:|:-----------:|
| auto_boosted_clock_enabled         | fixed       | bool | bool        | -           | -           |
| clock_frequency_min                | fixed       | MHz  | MHz         | MHz         | MHz         |
| clock_frequency_max                | fixed       | MHz  | MHz         | MHz         | MHz         |
| memory_clock_frequency_min         | fixed       | -    | MHz         | MHz         | MHz         |
| memory_clock_frequency_max         | fixed       | -    | MHz         | MHz         | MHz         |
| socket_clock_frequency_min         | fixed       | -    | -           | MHz         | -           |
| socket_clock_frequency_max         | fixed       | -    | -           | MHz         | -           |
| sm_clock_frequency_max             | fixed       | -    | MHz         | -           | -           |
| available_clock_frequencies        | fixed       | -    | map of MHz  | list of MHz | list of MHz |
| available_memory_clock_frequencies | fixed       | -    | list of MHz | list of MHz | list of MHz |
| clock_frequency                    | sampled     | MHz  | MHz         | MHz         | MHz         |
| average_non_idle_clock_frequency   | sampled     | MHz  | -           | -           | -           |
| time_stamp_counter                 | sampled     | MHz  | -           | -           | -           |
| memory_clock_frequency             | sampled     | -    | MHz         | MHz         | MHz         |
| socket_clock_frequency             | sampled     | -    | -           | MHz         | -           |
| sm_clock_frequency                 | sampled     | -    | MHz         | -           | -           |
| overdrive_level                    | sampled     | -    | -           | %           | -           |
| memory_overdrive_level             | sampled     | -    | -           | %           | -           |
| throttle_reason                    | sampled     | -    | bitmask     | -           | bitmask     |
| throttle_reason_string             | sampled     | -    | str         | -           | str         |
| memory_throttle_reason             | sampled     | -    | -           | -           | bitmask     |
| memory_throttle_reason_string      | sampled     | -    | -           | -           | str         |
| auto_boosted_clock                 | sampled     | -    | bool        | -           | -           |
| frequency_limit_tdp                | sampled     | -    | -           | -           | MHz         |
| memory_frequency_limit_tdp         | sampled     | -    | -           | -           | MHz         |
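
The throttle reasons are listed twice, once as a bitmask and once as a string. One plausible relationship between the two is that each set bit corresponds to a named reason. The sketch below illustrates that idea only: the flag names and bit positions are hypothetical and do not reflect the actual vendor definitions, and whether the library derives the string this way is an assumption.

```python
from enum import IntFlag


class ThrottleReason(IntFlag):
    # hypothetical bit assignments, for illustration only
    IDLE = 0x1
    POWER_CAP = 0x2
    THERMAL = 0x4


def throttle_reason_string(mask: int) -> str:
    """Join the names of all flags whose bit is set in `mask`."""
    names = [flag.name for flag in ThrottleReason if mask & flag]
    return "|".join(names) if names else "None"


# bits 0x1 and 0x4 are set, so two reasons are reported
combined = throttle_reason_string(0x5)
```

A bitmask is compact and easy to compare across samples, while the string form is easier to read in a dumped file or a plot legend.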

power-related samples

| sample                         | sample type | CPUs                                | NVIDIA GPUs | AMD GPUs                                                                                  | Intel GPUs                                              |
|:-------------------------------|:-----------:|:-----------------------------------:|:-----------:|:-----------------------------------------------------------------------------------------:|:-------------------------------------------------------:|
| power_management_limit         | fixed       | -                                   | W           | W                                                                                         | -                                                       |
| power_enforced_limit           | fixed       | -                                   | W           | W                                                                                         | W                                                       |
| power_measurement_type         | fixed       | str (fix)                           | str         | str                                                                                       | str                                                     |
| power_management_mode          | fixed       | -                                   | bool        | -                                                                                         | bool                                                    |
| available_power_profiles       | fixed       | -                                   | list of int | list of str                                                                               | -                                                       |
| power_usage                    | sampled     | W                                   | W           | W                                                                                         | W<br>(calculated via power_total_energy_consumption)    |
| core_watt                      | sampled     | W                                   | -           | -                                                                                         | -                                                       |
| dram_watt                      | sampled     | W                                   | -           | -                                                                                         | -                                                       |
| package_rapl_throttling        | sampled     | %                                   | -           | -                                                                                         | -                                                       |
| dram_rapl_throttling           | sampled     | %                                   | -           | -                                                                                         | -                                                       |
| power_total_energy_consumption | sampled     | J<br>(calculated via power_usage)   | J           | J<br>(calculated via power_usage if<br>power_total_energy_consumption isn't available)    | J                                                       |
| power_profile                  | sampled     | -                                   | int         | str                                                                                       | -                                                       |
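
Some of the power samples above are derived from each other: power usage can be computed from a cumulative energy counter, and total energy can be accumulated from power readings. The following sketch shows both directions using P = ΔE / Δt and a trapezoidal sum. The function names are hypothetical, and the integration rule is an assumption; the numerical method the library actually uses is not stated here.

```python
from typing import List, Sequence


def power_from_energy(times_s: Sequence[float],
                      energy_j: Sequence[float]) -> List[float]:
    """Average power [W] per interval from cumulative energy readings [J]."""
    powers = []
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        powers.append((energy_j[i] - energy_j[i - 1]) / dt)
    return powers


def energy_from_power(times_s: Sequence[float],
                      power_w: Sequence[float]) -> List[float]:
    """Cumulative energy [J] via trapezoidal integration of power [W]."""
    energy = [0.0]
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy.append(energy[-1] + 0.5 * (power_w[i] + power_w[i - 1]) * dt)
    return energy


# three time points, 0.5 s apart
t = [0.0, 0.5, 1.0]
p = power_from_energy(t, [0.0, 5.0, 15.0])
e = energy_from_power(t, [10.0, 10.0, 10.0])
```

Note that the energy-to-power direction yields one value fewer than there are time points, since each value describes an interval rather than an instant.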

memory-related samples

| sample                      | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs                       |
|:----------------------------|:-----------:|:----:|:-----------:|:--------:|:--------------------------------:|
| cache_size_L1d              | fixed       | str  | -           | -        | -                                |
| cache_size_L1i              | fixed       | str  | -           | -        | -                                |
| cache_size_L2               | fixed       | str  | -           | -        | -                                |
| cache_size_L3               | fixed       | str  | -           | -        | -                                |
| memory_total                | fixed       | B    | B           | B        | B<br>(map of memory modules)     |
| visible_memory_total        | fixed       | -    | -           | B        | B<br>(map of memory modules)     |
| swap_memory_total           | fixed       | B    | -           | -        | -                                |
| num_pcie_lanes_min          | fixed       | -    | -           | int      | -                                |
| num_pcie_lanes_max          | fixed       | -    | int         | int      | int                              |
| pcie_link_generation_max    | fixed       | -    | int         | -        | int                              |
| pcie_link_speed_max         | fixed       | -    | MBPS        | -        | MBPS                             |
| pcie_link_transfer_rate_min | fixed       | -    | -           | MT/s     | -                                |
| pcie_link_transfer_rate_max | fixed       | -    | -           | MT/s     | -                                |
| memory_bus_width            | fixed       | -    | Bit         | -        | Bit<br>(map of memory modules)   |
| memory_num_channels         | fixed       | -    | -           | -        | int<br>(map of memory modules)   |
| memory_used                 | sampled     | B    | B           | B        | B<br>(map of memory modules)     |
| memory_free                 | sampled     | B    | B           | B        | B<br>(map of memory modules)     |
| swap_memory_used            | sampled     | B    | -           | -        | -                                |
| swap_memory_free            | sampled     | B    | -           | -        | -                                |
| num_pcie_lanes              | sampled     | -    | int         | int      | int                              |
| pcie_link_generation        | sampled     | -    | int         | -        | int                              |
| pcie_link_speed             | sampled     | -    | MBPS        | -        | MBPS                             |
| pcie_link_transfer_rate     | sampled     | -    | -           | T/s      | -                                |

temperature-related samples

| sample                   | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
|:-------------------------|:-----------:|:----:|:-----------:|:--------:|:----------:|
| num_fans                 | fixed       | -    | int         | int      | int        |
| fan_speed_min            | fixed       | -    | %           | -        | -          |
| fan_speed_max            | fixed       | -    | %           | RPM      | RPM        |
| temperature_min          | fixed       | -    | -           | °C       | -          |
| temperature_max          | fixed       | -    | °C          | °C       | °C         |
| memory_temperature_min   | fixed       | -    | -           | °C       | -          |
| memory_temperature_max   | fixed       | -    | °C          | °C       | °C         |
| hotspot_temperature_min  | fixed       | -    | -           | °C       | -          |
| hotspot_temperature_max  | fixed       | -    | -           | °C       | -          |
| hbm_0_temperature_min    | fixed       | -    | -           | °C       | -          |
| hbm_0_temperature_max    | fixed       | -    | -           | °C       | -          |
| hbm_1_temperature_min    | fixed       | -    | -           | °C       | -          |
| hbm_1_temperature_max    | fixed       | -    | -           | °C       | -          |
| hbm_2_temperature_min    | fixed       | -    | -           | °C       | -          |
| hbm_2_temperature_max    | fixed       | -    | -           | °C       | -          |
| hbm_3_temperature_min    | fixed       | -    | -           | °C       | -          |
| hbm_3_temperature_max    | fixed       | -    | -           | °C       | -          |
| global_temperature_max   | fixed       | -    | -           | °C       | °C         |
| fan_speed_percentage     | sampled     | -    | %           | %        | %          |
| temperature              | sampled     | °C   | °C          | °C       | °C         |
| memory_temperature       | sampled     | -    | -           | °C       | °C         |
| hotspot_temperature      | sampled     | -    | -           | °C       | -          |
| hbm_0_temperature        | sampled     | -    | -           | °C       | -          |
| hbm_1_temperature        | sampled     | -    | -           | °C       | -          |
| hbm_2_temperature        | sampled     | -    | -           | °C       | -          |
| hbm_3_temperature        | sampled     | -    | -           | °C       | -          |
| global_temperature       | sampled     | -    | -           | -        | °C         |
| psu_temperature          | sampled     | -    | -           | -        | °C         |
| core_temperature         | sampled     | °C   | -           | -        | -          |
| core_throttle_percent    | sampled     | %    | -           | -        | -          |

gfx-related (iGPU) samples

| sample                    | sample type | CPUs |
|:--------------------------|:-----------:|:----:|
| gfx_render_state_percent  | sampled     | %    |
| gfx_frequency             | sampled     | MHz  |
| average_gfx_frequency     | sampled     | MHz  |
| gfx_state_c0_percent      | sampled     | %    |
| cpu_works_for_gpu_percent | sampled     | %    |
| gfx_watt                  | sampled     | W    |

"idle states"-related samples

| sample                               | sample type | CPUs          |
|:-------------------------------------|:-----------:|:-------------:|
| idle_states                          | fixed       | map of values |
| all_cpus_state_c0_percent            | sampled     | %             |
| any_cpu_state_c0_percent             | sampled     | %             |
| low_power_idle_state_percent         | sampled     | %             |
| system_low_power_idle_state_percent  | sampled     | %             |
| package_low_power_idle_state_percent | sampled     | %             |

Example Python usage

```python
import HardwareSampling as hws
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime

sampler = hws.CpuHardwareSampler()
# could also be, e.g.,
# sampler = hws.GpuNvidiaHardwareSampler()
sampler.start()

sampler.add_event("init")
A = np.random.rand(2 ** 14, 2 ** 14)
B = np.random.rand(2 ** 14, 2 ** 14)

sampler.add_event("matmul")
C = A @ B

sampler.stop()
sampler.dump_yaml("track.yaml")

# plot the results
time_points = sampler.relative_time_points()

plt.plot(time_points, sampler.clock_samples().get_clock_frequency(), label="average")
plt.plot(time_points, sampler.clock_samples().get_average_non_idle_clock_frequency(), label="average non-idle")

# mark each user-defined event with a vertical line and a rotated label
axes = plt.gcf().axes[0]
x_bounds = axes.get_xlim()
for event in sampler.get_relative_events()[1:-1]:
    axes.axvline(x=event.relative_time_point, color='r')
    axes.annotate(text=event.name, xy=(((event.relative_time_point - x_bounds[0]) / (x_bounds[1] - x_bounds[0])), 1.025), xycoords='axes fraction', rotation=270)

plt.xlabel("runtime [ms]")
plt.ylabel("clock frequency [MHz]")
plt.legend()
plt.show()
```
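
In the example above, `start()` and `stop()` are called by hand, so an exception in the workload would skip `stop()`. One way to guard against that is a small context manager. The sketch below is not part of the library: `sampling` is a hypothetical helper, and `FakeSampler` is only a stand-in that records calls so the snippet runs without any hardware backend. It relies solely on the `start()`, `stop()`, and `add_event()` methods used in the example.

```python
from contextlib import contextmanager


class FakeSampler:
    """Stand-in exposing the start/stop/add_event methods used above."""

    def __init__(self):
        self.log = []

    def start(self):
        self.log.append("start")

    def stop(self):
        self.log.append("stop")

    def add_event(self, name):
        self.log.append(f"event:{name}")


@contextmanager
def sampling(sampler, event=None):
    """Start the sampler, optionally add an event, and always stop it."""
    sampler.start()
    try:
        if event is not None:
            sampler.add_event(event)
        yield sampler
    finally:
        # runs even if the workload inside the with-block raises
        sampler.stop()


with sampling(FakeSampler(), event="matmul") as s:
    workload_result = sum(range(10))  # placeholder for the real workload
```

Any object providing the same three methods could be passed in place of `FakeSampler`.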

example frequency plot

License

The hws library is distributed under the MIT license.

Owner

  • Name: Scientific Computing (SC) and Simulation of Large Systems (SGS) @ University of Stuttgart
  • Login: SC-SGS
  • Kind: organization

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: hws - Hardware Sampling for GPUs and CPUs
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Marcel
    family-names: Breyer
    email: Marcel.Breyer@ipvs.uni-stuttgart.de
    affiliation: University of Stuttgart
    orcid: 'https://orcid.org/0000-0003-3574-0650'
  - given-names: Alexander
    family-names: Van Craen
    email: Alexander.Van-Craen@ipvs.uni-stuttgart.de
    affiliation: University of Stuttgart
    orcid: 'https://orcid.org/0000-0002-3336-7226'
  - given-names: Dirk
    family-names: Pflüger
    email: Dirk.Pflueger@ipvs.uni-stuttgart.de
    orcid: 'https://orcid.org/0000-0002-4360-0212'
    affiliation: University of Stuttgart
repository-code: 'https://github.com/SC-SGS/hardware_sampling'
license: MIT
version: v1.1.1
date-released: '2025-04-29'

GitHub Events

Total
  • Create event: 4
  • Release event: 2
  • Issues event: 1
  • Watch event: 12
  • Issue comment event: 3
  • Push event: 24
  • Pull request event: 1
Last Year
  • Create event: 4
  • Release event: 2
  • Issues event: 1
  • Watch event: 12
  • Issue comment event: 3
  • Push event: 24
  • Pull request event: 1

Dependencies

.github/workflows/documentation.yml actions
  • actions/checkout v4.2.0 composite
  • peaceiris/actions-gh-pages v4 composite