hardware_sampling
The Hardware Sampling (hws) library can be used to track hardware performance like clock frequency, memory usage, temperatures, or power draw.
Repository
Statistics
- Stars: 18
- Watchers: 2
- Forks: 1
- Open Issues: 0
- Releases: 7
Metadata Files
README.md
hws - Hardware Sampling for CPUs and GPUs
The Hardware Sampling (hws) library can be used to track hardware performance like clock frequency, memory usage, temperatures, or power draw. It currently supports CPUs as well as GPUs from NVIDIA, AMD, and Intel.
Getting Started
Dependencies
General dependencies:
- a C++17 capable compiler
- {fmt} > 11.0.2 for string formatting (automatically built during the CMake configuration if it couldn't be found using the respective `find_package` call)
- Pybind11 > v2.13.1 if Python bindings are enabled (automatically built during the CMake configuration if it couldn't be found using the respective `find_package` call)
Dependencies based on the hardware to sample:
- if a CPU should be targeted: at least one of `turbostat` (may require root privileges), `lscpu`, or `free` and the `subprocess.h` library (automatically built during the CMake configuration if it couldn't be found using the respective `find_package` call)
- if an NVIDIA GPU should be targeted: NVIDIA's Management Library `NVML`
- if an AMD GPU should be targeted: AMD's ROCm SMI library `rocm_smi_lib`
- if an Intel GPU should be targeted: Intel's `Level Zero` library
Building hws
To download the hardware sampling library, use:
bash
git clone git@github.com:SC-SGS/hardware_sampling.git
cd hardware_sampling
Building the library can be done using the normal CMake approach:
bash
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release [optional_options] ..
cmake --build . -j
Optional CMake Options
The [optional_options] can be one or multiple of:
- `HWS_ENABLE_CPU_SAMPLING=ON|OFF|AUTO` (default: `AUTO`):
  - `ON`: check whether CPU information can be sampled and fail if this is not the case
  - `AUTO`: check whether CPU information can be sampled but do not fail if this is not the case
  - `OFF`: do not check whether CPU information can be sampled
- `HWS_ENABLE_GPU_NVIDIA_SAMPLING=ON|OFF|AUTO` (default: `AUTO`):
  - `ON`: check whether NVIDIA GPU information can be sampled and fail if this is not the case
  - `AUTO`: check whether NVIDIA GPU information can be sampled but do not fail if this is not the case
  - `OFF`: do not check whether NVIDIA GPU information can be sampled
- `HWS_ENABLE_GPU_AMD_SAMPLING=ON|OFF|AUTO` (default: `AUTO`):
  - `ON`: check whether AMD GPU information can be sampled and fail if this is not the case
  - `AUTO`: check whether AMD GPU information can be sampled but do not fail if this is not the case
  - `OFF`: do not check whether AMD GPU information can be sampled
- `HWS_ENABLE_GPU_INTEL_SAMPLING=ON|OFF|AUTO` (default: `AUTO`):
  - `ON`: check whether Intel GPU information can be sampled and fail if this is not the case
  - `AUTO`: check whether Intel GPU information can be sampled but do not fail if this is not the case
  - `OFF`: do not check whether Intel GPU information can be sampled
- `HWS_ENABLE_ERROR_CHECKS=ON|OFF` (default: `OFF`): enable sanity checks during hardware sampling, may be problematic with smaller sample intervals
- `HWS_SAMPLING_INTERVAL=100ms` (default: `100ms`): set the sampling interval in milliseconds
- `HWS_ENABLE_PYTHON_BINDINGS=ON|OFF` (default: `ON`): enable Python bindings
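Each of these options is passed to CMake as a `-D<NAME>=<VALUE>` definition. As a minimal sketch, the following helper assembles such a configure command from a dictionary. The function name `cmake_configure_command` and the chosen option values are illustrative assumptions and not part of hws.

```python
def cmake_configure_command(options, source_dir="..", build_type="Release"):
    """Build the argument list for a CMake configure call (hypothetical helper)."""
    cmd = ["cmake", f"-DCMAKE_BUILD_TYPE={build_type}"]
    # Sort for a reproducible command line; each option becomes -DNAME=VALUE.
    for name, value in sorted(options.items()):
        cmd.append(f"-D{name}={value}")
    cmd.append(source_dir)
    return cmd


# Example: require CPU sampling, skip the Intel GPU check, and disable the Python bindings.
example_options = {
    "HWS_ENABLE_CPU_SAMPLING": "ON",
    "HWS_ENABLE_GPU_INTEL_SAMPLING": "OFF",
    "HWS_ENABLE_PYTHON_BINDINGS": "OFF",
}
print(" ".join(cmake_configure_command(example_options)))
```

The resulting list can be handed to `subprocess.run` from inside the build directory; options that are not listed keep their defaults.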
Installing via CMake
The library supports the install target:
bash
cmake --install . --prefix "/home/myuser/installdir"
Afterward, the necessary exports should be performed:
bash
export CMAKE_PREFIX_PATH=${CMAKE_INSTALL_PREFIX}/share/hws/cmake:${CMAKE_PREFIX_PATH}
export LD_LIBRARY_PATH=${CMAKE_INSTALL_PREFIX}/lib:${CMAKE_INSTALL_PREFIX}/lib64:${LD_LIBRARY_PATH}
export CPLUS_INCLUDE_PATH=${CMAKE_INSTALL_PREFIX}/include:${CPLUS_INCLUDE_PATH}
export PYTHONPATH=${CMAKE_INSTALL_PREFIX}/lib:${CMAKE_INSTALL_PREFIX}/lib64:${PYTHONPATH}
Note: when using Intel GPUs, the `CMAKE_MODULE_PATH` should be updated to point to our `cmake` directory containing the `Findlevel_zero.cmake` file, and `export ZES_ENABLE_SYSMAN=1` should be set.
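The exports above assume that `CMAKE_INSTALL_PREFIX` is set in the shell to the prefix passed to `cmake --install`. As a rough sketch, the same four variables can be derived in Python for a given prefix, for example to launch a subprocess with a prepared environment. The helper name `hws_environment` is an assumption made for illustration.

```python
import os


def hws_environment(prefix, env=None):
    """Return the four variables from the exports above for an install prefix."""
    env = dict(os.environ if env is None else env)

    def prepend(name, *paths):
        # Put the new paths first and keep any existing value at the end.
        old = env.get(name, "")
        env[name] = ":".join(list(paths) + ([old] if old else []))

    lib_dirs = (f"{prefix}/lib", f"{prefix}/lib64")
    prepend("CMAKE_PREFIX_PATH", f"{prefix}/share/hws/cmake")
    prepend("LD_LIBRARY_PATH", *lib_dirs)
    prepend("CPLUS_INCLUDE_PATH", f"{prefix}/include")
    prepend("PYTHONPATH", *lib_dirs)
    names = ("CMAKE_PREFIX_PATH", "LD_LIBRARY_PATH", "CPLUS_INCLUDE_PATH", "PYTHONPATH")
    return {name: env[name] for name in names}


# Example with the install prefix used above and an empty starting environment.
for name, value in hws_environment("/home/myuser/installdir", env={}).items():
    print(f"export {name}={value}")
```

Unlike the plain shell exports, this sketch does not leave a trailing `:` when a variable was previously unset.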
Installing via pip
The library is also available via pip:
bash
pip install hardware-sampling
This pip install behaves as if no additional CMake options were provided.
This means that only hardware whose vendor libraries were available at the time of the `pip install hardware-sampling` invocation is supported.
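Since the set of supported hardware depends on the install environment, it can help to check what an installed module actually exposes. The sketch below probes for sampler classes by name. `CpuHardwareSampler` and `GpuNvidiaHardwareSampler` appear in the usage example of this README; the AMD and Intel class names are assumptions that follow the same pattern, and whether an unsupported backend is omitted from the module or only fails at construction is likewise an assumption.

```python
import importlib

SAMPLER_CLASSES = (
    "CpuHardwareSampler",
    "GpuNvidiaHardwareSampler",
    "GpuAmdHardwareSampler",    # assumed name, not confirmed by this README
    "GpuIntelHardwareSampler",  # assumed name, not confirmed by this README
)


def available_samplers(module_name="HardwareSampling"):
    """Report which of the sampler class names the given module exposes."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return {}  # the module is not installed or could not be loaded
    return {name: hasattr(module, name) for name in SAMPLER_CLASSES}


print(available_samplers())
```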
Available samples
The sampling type `fixed` denotes samples that are gathered only once, such as the maximum clock frequencies or temperatures, or the total available memory.
The sampling type `sampled` denotes samples that are gathered repeatedly during the whole hardware sampling process, such as the current clock frequencies, temperatures, or memory consumption.
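To illustrate the difference between the two sampling types, here is a small sketch with a hypothetical container. `ClockRecord` is not part of the hws API and the readings are made up: a fixed sample is stored as a single value, while a sampled one grows by one entry per sampling interval.

```python
from dataclasses import dataclass, field


@dataclass
class ClockRecord:
    """Hypothetical container, not the hws API."""
    clock_frequency_max: float  # fixed: queried once (MHz)
    clock_frequency: list = field(default_factory=list)  # sampled: one entry per interval (MHz)

    def add_sample(self, value):
        self.clock_frequency.append(value)


record = ClockRecord(clock_frequency_max=4200.0)
for reading in (1800.0, 3900.0, 4100.0):  # made-up readings, one per sampling interval
    record.add_sample(reading)

print(record.clock_frequency_max)   # a single value
print(len(record.clock_frequency))  # grows with the sampling duration
```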
General samples
| sample | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
|:--------------------|:-----------:|:-----------:|:-----------:|:---------:|:-------------:|
| architecture | fixed | str | str | str | - |
| byteorder | fixed | str | str (fix) | str (fix) | str (fix) |
| numcores | fixed | int | int | - | - |
| numthreads | fixed | int | - | - | - |
| threadspercore | fixed | int | - | - | - |
| corespersocket | fixed | int | - | - | - |
| numsockets | fixed | int | - | - | - |
| numanodes | fixed | int | - | - | - |
| vendorid | fixed | str | str (fix) | str | str (PCIe ID) |
| name | fixed | str | str | str | str |
| flags | fixed | list of str | - | - | list of str |
| persistencemode | fixed | - | bool | - | - |
| standbymode | fixed | - | - | - | str |
| numthreadspereu | fixed | - | - | - | int |
| eusimdwidth | fixed | - | - | - | int |
| computeutilization | sampled | % | % | % | - |
| memoryutilization | sampled | - | % | % | - |
| ipc | sampled | float | - | - | - |
| irq | sampled | int | - | - | - |
| smi | sampled | int | - | - | - |
| poll | sampled | int | - | - | - |
| pollpercent | sampled | % | - | - | - |
| performance_level | sampled | - | int | str | - |
clock-related samples
| sample | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
|:-----------------------------------|:-----------:|:----:|:-----------:|:-----------:|:-----------:|
| autoboostedclockenabled | fixed | bool | bool | - | - |
| clockfrequencymin | fixed | MHz | MHz | MHz | MHz |
| clockfrequencymax | fixed | MHz | MHz | MHz | MHz |
| memoryclockfrequencymin | fixed | - | MHz | MHz | MHz |
| memoryclockfrequencymax | fixed | - | MHz | MHz | MHz |
| socketclockfrequencymin | fixed | - | - | MHz | - |
| socketclockfrequencymax | fixed | - | - | MHz | - |
| smclockfrequencymax | fixed | - | MHz | - | - |
| availableclockfrequencies | fixed | - | map of MHz | list of MHz | list of MHz |
| availablememoryclockfrequencies | fixed | - | list of MHz | list of MHz | list of MHz |
| clockfrequency | sampled | MHz | MHz | MHz | MHz |
| averagenonidleclockfrequency | sampled | MHz | - | - | - |
| timestampcounter | sampled | MHz | - | - | - |
| memoryclockfrequency | sampled | - | MHz | MHz | MHz |
| socketclockfrequency | sampled | - | - | MHz | - |
| smclockfrequency | sampled | - | MHz | - | - |
| overdrivelevel | sampled | - | - | % | - |
| memoryoverdrivelevel | sampled | - | - | % | - |
| throttlereason | sampled | - | bitmask | - | bitmask |
| throttlereasonstring | sampled | - | str | - | str |
| memorythrottlereason | sampled | - | - | - | bitmask |
| memorythrottlereasonstring | sampled | - | - | - | str |
| autoboostedclock | sampled | - | bool | - | - |
| frequencylimittdp | sampled | - | - | - | MHz |
| memoryfrequencylimittdp | sampled | - | - | - | MHz |
power-related samples
| sample | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
|:-------------------------------|:-----------:|:---------------------------------:|:-----------:|:--------------------------------------------------------------------------------------:|:----------------------------------------------------:|
| powermanagementlimit | fixed | - | W | W | - |
| powerenforcedlimit | fixed | - | W | W | W |
| powermeasurementtype | fixed | str (fix) | str | str | str |
| powermanagementmode | fixed | - | bool | - | bool |
| availablepowerprofiles | fixed | - | list of int | list of str | - |
| powerusage | sampled | W | W | W | W (calculated via powertotalenergyconsumption) |
| corewatt | sampled | W | - | - | - |
| dramwatt | sampled | W | - | - | - |
| packageraplthrottling | sampled | % | - | - | - |
| dramraplthrottling | sampled | % | - | - | - |
| powertotalenergyconsumption | sampled | J (calculated via powerusage) | J | J (calculated via powerusage if powertotalenergyconsumption isn't available) | J |
| power_profile | sampled | - | int | str | - |
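Several entries in this table are derived from each other: energy (J) from power (W) or the other way round. How hws performs this conversion internally is not stated here. The following sketch shows only one plausible approach, using the trapezoidal rule on timestamped samples. Both function names and the sample data are illustrative assumptions.

```python
def energy_from_power(time_points_ms, power_watts):
    """Cumulative energy in joules from power samples (trapezoidal rule)."""
    if len(time_points_ms) != len(power_watts):
        raise ValueError("time points and power samples must have the same length")
    energy = [0.0] if power_watts else []
    for i in range(1, len(power_watts)):
        dt_s = (time_points_ms[i] - time_points_ms[i - 1]) / 1000.0
        avg_w = 0.5 * (power_watts[i] + power_watts[i - 1])
        energy.append(energy[-1] + avg_w * dt_s)
    return energy


def power_from_energy(time_points_ms, energy_joules):
    """Average power in watts between consecutive energy readings."""
    if len(time_points_ms) != len(energy_joules):
        raise ValueError("time points and energy samples must have the same length")
    power = []
    for i in range(1, len(energy_joules)):
        dt_s = (time_points_ms[i] - time_points_ms[i - 1]) / 1000.0
        power.append((energy_joules[i] - energy_joules[i - 1]) / dt_s)
    return power


# Made-up samples taken every 100 ms at a constant 10 W.
times = [0, 100, 200]
print(energy_from_power(times, [10.0, 10.0, 10.0]))
print(power_from_energy(times, [0.0, 1.0, 2.0]))
```

Note that the power derived from energy readings is an average over each interval, so it has one entry fewer than the input.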
memory-related samples
| sample | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
|:----------------------------|:-----------:|:----:|:-----------:|:--------:|:------------------------------:|
| cachesizeL1d | fixed | str | - | - | - |
| cachesizeL1i | fixed | str | - | - | - |
| cachesizeL2 | fixed | str | - | - | - |
| cachesizeL3 | fixed | str | - | - | - |
| memorytotal | fixed | B | B | B | B (map of memory modules) |
| visiblememorytotal | fixed | - | - | B | B (map of memory modules) |
| swapmemorytotal | fixed | B | - | - | - |
| numpcielanesmin | fixed | - | - | int | - |
| numpcielanesmax | fixed | - | int | int | int |
| pcielinkgenerationmax | fixed | - | int | - | int |
| pcielinkspeedmax | fixed | - | MBPS | - | MBPS |
| pcielinktransferratemin | fixed | - | - | MT/s | - |
| pcielinktransferratemax | fixed | - | - | MT/s | - |
| memorybuswidth | fixed | - | Bit | - | Bit (map of memory modules) |
| memorynumchannels | fixed | - | - | - | int (map of memory modules) |
| memoryused | sampled | B | B | B | B (map of memory modules) |
| memoryfree | sampled | B | B | B | B (map of memory modules) |
| swapmemoryused | sampled | B | - | - | - |
| swapmemoryfree | sampled | B | - | - | - |
| numpcielanes | sampled | - | int | int | int |
| pcielinkgeneration | sampled | - | int | - | int |
| pcielinkspeed | sampled | - | MBPS | - | MBPS |
| pcielinktransferrate | sampled | - | - | T/s | - |
temperature-related samples
| sample | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
|:------------------------|:-----------:|:----:|:-----------:|:--------:|:----------:|
| numfans | fixed | - | int | int | int |
| fanspeedmin | fixed | - | % | - | - |
| fanspeedmax | fixed | - | % | RPM | RPM |
| temperaturemin | fixed | - | - | °C | - |
| temperaturemax | fixed | - | °C | °C | °C |
| memorytemperaturemin | fixed | - | - | °C | - |
| memorytemperaturemax | fixed | - | °C | °C | °C |
| hotspottemperaturemin | fixed | - | - | °C | - |
| hotspottemperaturemax | fixed | - | - | °C | - |
| hbm0temperaturemin | fixed | - | - | °C | - |
| hbm0temperaturemax | fixed | - | - | °C | - |
| hbm1temperaturemin | fixed | - | - | °C | - |
| hbm1temperaturemax | fixed | - | - | °C | - |
| hbm2temperaturemin | fixed | - | - | °C | - |
| hbm2temperaturemax | fixed | - | - | °C | - |
| hbm3temperaturemin | fixed | - | - | °C | - |
| hbm3temperaturemax | fixed | - | - | °C | - |
| globaltemperaturemax | fixed | - | - | °C | °C |
| fanspeedpercentage | sampled | - | % | % | % |
| temperature | sampled | °C | °C | °C | °C |
| memorytemperature | sampled | - | - | °C | °C |
| hotspottemperature | sampled | - | - | °C | - |
| hbm0temperature | sampled | - | - | °C | - |
| hbm1temperature | sampled | - | - | °C | - |
| hbm2temperature | sampled | - | - | °C | - |
| hbm3temperature | sampled | - | - | °C | - |
| globaltemperature | sampled | - | - | - | °C |
| psutemperature | sampled | - | - | - | °C |
| coretemperature | sampled | °C | - | - | - |
| corethrottlepercent | sampled | % | - | - | - |
gfx-related (iGPU) samples
| sample | sample type | CPUs |
|:--------------------------|:-----------:|:----:|
| gfxrenderstatepercent | sampled | % |
| gfxfrequency | sampled | MHz |
| averagegfxfrequency | sampled | MHz |
| gfxstatec0percent | sampled | % |
| cpuworksforgpupercent | sampled | % |
| gfxwatt | sampled | W |
"idle states"-related samples
| sample | sample type | CPUs |
|:-------------------------------------|:-----------:|:-------------:|
| idlestates | fixed | map of values |
| allcpusstatec0percent | sampled | % |
| anycpustatec0percent | sampled | % |
| lowpoweridlestatepercent | sampled | % |
| systemlowpoweridlestatepercent | sampled | % |
| packagelowpoweridlestate_percent | sampled | % |
Example Python usage
```python
import HardwareSampling as hws
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime

sampler = hws.CpuHardwareSampler()
# could also be, e.g.,
# sampler = hws.GpuNvidiaHardwareSampler()
sampler.start()

sampler.add_event("init")
A = np.random.rand(2 ** 14, 2 ** 14)
B = np.random.rand(2 ** 14, 2 ** 14)

sampler.add_event("matmul")
C = A @ B

sampler.stop()
sampler.dump_yaml("track.yaml")

# plot the results
time_points = sampler.relative_time_points()

plt.plot(time_points, sampler.clock_samples().get_clock_frequency(), label="average")
plt.plot(time_points, sampler.clock_samples().get_average_non_idle_clock_frequency(), label="average non-idle")

# mark each user-defined event with a vertical line and its name
axes = plt.gcf().axes[0]
x_bounds = axes.get_xlim()
for event in sampler.get_relative_events()[1:-1]:
    axes.axvline(x=event.relative_time_point, color='r')
    axes.annotate(text=event.name,
                  xy=(((event.relative_time_point - x_bounds[0]) / (x_bounds[1] - x_bounds[0])), 1.025),
                  xycoords='axes fraction', rotation=270)

plt.xlabel("runtime [ms]")
plt.ylabel("clock frequency [MHz]")
plt.legend()
plt.show()
```
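Beyond plotting, the events can be used to summarize a sampled quantity per program phase. The following sketch averages a series between consecutive events. It works on plain lists and `(name, time)` tuples, which stand in for the sampler's time points and event objects; the helper name `mean_per_segment` and the data are made up for illustration.

```python
def mean_per_segment(time_points, values, events):
    """Average `values` between consecutive events.

    `events` is a list of (name, time) pairs sorted by time. Each segment
    covers the half-open interval [start, end) up to the next event.
    """
    result = {}
    for (name, start), (_, end) in zip(events, events[1:]):
        segment = [v for t, v in zip(time_points, values) if start <= t < end]
        result[name] = sum(segment) / len(segment) if segment else None
    return result


# Made-up data: time in ms, clock frequency in MHz.
time_points = [0, 100, 200, 300, 400]
clock = [1200.0, 1400.0, 3800.0, 4000.0, 1300.0]
events = [("init", 0), ("matmul", 200), ("end", 400)]
print(mean_per_segment(time_points, clock, events))
```

The last event only closes the preceding segment, so it does not get an entry of its own.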
License
The hws library is distributed under the MIT license.
Owner
- Name: Scientific Computing (SC) and Simulation of Large Systems (SGS) @ University of Stuttgart
- Login: SC-SGS
- Kind: organization
- Repositories: 8
- Profile: https://github.com/SC-SGS
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: hws - Hardware Sampling for GPUs and CPUs
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Marcel
    family-names: Breyer
    email: Marcel.Breyer@ipvs.uni-stuttgart.de
    affiliation: University of Stuttgart
    orcid: 'https://orcid.org/0000-0003-3574-0650'
  - given-names: Alexander
    family-names: Van Craen
    email: Alexander.Van-Craen@ipvs.uni-stuttgart.de
    affiliation: University of Stuttgart
    orcid: 'https://orcid.org/0000-0002-3336-7226'
  - given-names: Dirk
    family-names: Pflüger
    email: Dirk.Pflueger@ipvs.uni-stuttgart.de
    orcid: 'https://orcid.org/0000-0002-4360-0212'
    affiliation: University of Stuttgart
repository-code: 'https://github.com/SC-SGS/hardware_sampling'
license: MIT
version: v1.1.1
date-released: '2025-04-29'
GitHub Events
Total
- Create event: 4
- Release event: 2
- Issues event: 1
- Watch event: 12
- Issue comment event: 3
- Push event: 24
- Pull request event: 1
Last Year
- Create event: 4
- Release event: 2
- Issues event: 1
- Watch event: 12
- Issue comment event: 3
- Push event: 24
- Pull request event: 1
Dependencies
- actions/checkout v4.2.0 composite
- peaceiris/actions-gh-pages v4 composite