Recent Releases of likwid
likwid - likwid-5.4.1
Changelog 5.4.1
- Fixes linking errors due to missing
bstrlib.h - Fix for
likwid-benchkernelstream_mem - Fix builds with CUDA for versions 11.2 to 12.6
- Fix sysfeatures with
ACCESSMODE=perf_event - Add AMD Zen1 to Zen3 to sysfeatures
- Add support for apple-cpufreq and cppc driver to sysfeatures
- Fix sysfeatures on ARM architectures
- C
Published by github-actions[bot] over 1 year ago
likwid - likwid-5.4.0
After a year without a new version, we are happy to release a new version.
- Support for Intel Granite Rapids (core, energy, uncore)
- Support for Intel Sierra Forrest (core, energy, uncore)
- Support for AMD Bergamo (core, energy, uncore)
- Support for Nvidia Grace (core, uncore)
- Fix: AMD Zen4 DataFabric units
- Fix: Multi-socket RAPL measurements on Sapphire Rapids
- Fix: Energy unit for RAPL DRAM domain for SPR, GNR and SRF
- Fix: Discovery mechanism workaround for UPI and M3UPI units of SPR
- Fix: Intel Westmere Uncore with perf_event backend
- Fix: Fujitsu A64FX (more counters, fixed topology, ...)
- Container bridge to use LIKWID inside container
- Sysfeatures interface reworked with support for various architectures and libraries
- Fix build for NVMON for CUDA 12.6+
- likwid-mpirun: SLURM pinning with cpu_mask feature
- Update of internal hwloc (2.11.2) and Lua (5.4.7) version
If you are building for RHEL9 (or its derivatives), the perl-locale package has to be installed.
- C
Published by github-actions[bot] over 1 year ago
likwid - likwid-5.3.0
We are happy to release version 5.3.0 of LIKWID, the tools suite for performance oriented programmers. Thanks to all the contributors, especially HPE for the AMD ROCm backend.
Changelog for 5.3.0: - Support for Intel SapphireRapids (Core, Uncore, RAPL) - Support for AMD Zen4 (Core, Uncore, RAPL) - Support for Apple M1 - Support for AMD GPUs (MarkerAPI, F90 interface) - Support for AWS Graviton3 (ARM Neoverse V1) - Support for HiSilicon TSV110 - Fix of F90 interface installation - Support for extended umasks in ICX and SPR - Units for metrics in performance groups - Library calls to get meta information (version, supported features, etc.) - Some fixes for direct access mode - Some fixes for X86 RDPMC detection - Update of internal hwloc (2.9.3) and Lua (5.4.6) version - New experimental sysfeatures module
Note: For Intel SapphireRapids systems with HBM, LIKWID in perf_event access mode and /sys/devices/uncore_type_14_* devices, apply the attached patch. Thanks @Julius-Plehn
- C
Published by github-actions[bot] over 2 years ago
likwid - likwid-5.2.2
- Fix pin string parsing in pinning library
- Make
SBINpath configurable in build system - Add
PKGBUILDfor ArchLinux package builds - Remove
accessDaemondouble-fork in systemd environments - Group updates for L2/L3 (mainly AMD Zen)
- Fix multi-initialization in MarkerAPI
- Add energy event scaling for Fujitsu A64FX
- Nvmon: Use Cupti error string to get better warning/error messages
- Nvmon: Store events internally to re-use event strings in stopCounters
- AccessLayer: Catch SIGCHLD to stop sending requests to accessDaemon if it was killed
likwid-genTopoCfg: Update writing and reading of topology file- Add
INST_RETIRED_NOPevent for Intel Icelake (desktop & server) - Removed some memory leaks
- Improved checks for
RDPMCavailability - Add
TOPDOWN_SLOTSfor perf_event - Fix for systems with CPU sockets without hwthreads (A64FX FX1000)
- Fix if
HOMEenvironment variable is not set (systemd) - Reader function for
perf_event_paranoidin Lua to get state early likwid-mpirun: Sanitize np and ppn values to avoid crashes
Note: The groups MEM_DP and MEM_SP use only 6 of 8 memory controllers for Intel Icelake SP. The attached patch fixes both groups.
- C
Published by TomTheBear almost 4 years ago
likwid - likwid-5.2.1
We are happy to release a new bugfix version of the LIKWID tool suite.
- Add support for Intel Rocketlake and AMD Zen3 variant (Family 19, Model 0x50)
- Fix for perf_event multiplexing (important!)
- Fix for potential deadlock in MarkerAPI (thx @jenny-cheung)
- Build and runtime fixes for Nvidia GPU backend, updates for CUDA test codes
- peakflops kernel for ARMv8
- Updates for AMD Zen1/2/3 event lists and groups
- Support spaces in MarkerAPI region tags (thx @jrmadsen)
- Use 'online' cpulist instead of 'present'
- Switch CI from Travis-CI to NHR@FAU Cx services
- Document -reset and -ureset for likwid-setFrequencies
- Reset cpuset in unpinned runs
- Remove destructor in frequency module
- Check PID if given through --perfpid
- Intel Icelake: OFFCORE_RESPONSE events
- AccessDaemon: Check PCI init state before using it
- likwid-mpirun: Set mpi type for SLURM automatically
- likwid-mpirun: Fix for skip mask for OpenMPI
- Fix for
triad_sve*benchmarks
Note: The groups MEM_DP and MEM_SP use only 6 of 8 memory controllers for Intel Icelake SP. The attached patch fixes both groups.
- C
Published by TomTheBear over 4 years ago
likwid - likwid-5.2.0
We are happy to release a new major update of the LIKWID tool suite.
- Support for AMD Zen3 (Core + Uncore)
- Support for Intel IcelakeSP (Core + Uncore)
- New affinity code
- Fix for Ivybridge uncore code
- Bypass accessdaemon by using rdpmc instruction on x86_64
- Introduce notion of CPU die in topology module
- Use CPU dies for socket-lock for Intel CascadelakeAP
- Add environment variable LIKWIDIGNORECPUSET to break out of current CPUset
- Fixes for affinity module CPUlist sorting
- Build against system-installed hwloc
- Update for Intel SkylakeX/CascadelakeX L3 group
- Rename DataFabric events for all generations of AMD Zen
- Add static cache configuration for Fujitsu A64FX
- Add multiplexing checks for perf_event backend
- Fix for table width of likwid-topology after adding CPU die column
- Adding RasPi 4 with 32 bit OS as ARMv7
- Add default groups for Intel Icelake desktop
- Fix for likwid-setFrequencies to not apply minFreq when setting governor
- likwid-powermeter: Fix hwthread selection when run with -p
- likwid-setFrequencies: Get measured base frequency if register is not readable
- CLOCK group for all AMD Zen
- Fixes in Nvidia GPU support in NvMarkerAPI and topology module
WARNING: This version has bugs in the perfevent backend. The multiplexing checks cause problems.
WARNING: The benchmarks `triadsve` for ARM8 chips use only 3 instead of 4 streams.
*Note**: The groups MEM_DP and MEM_SP use only 6 of 8 memory controllers for Intel Icelake SP. The attached patch fixes both groups.
- C
Published by TomTheBear almost 5 years ago
likwid - likwid-5.1.1
Changelog for version 5.1.1:
- Support for Intel Cometlake desktop (Core + Uncore)
- Fix for topology module of Fujitsu A64FX
- Fix for Intel Skylake SP in SNC mode
- Fix for likwid-perfscope
- Fix for CLI argument parsing
- Updated group and data file checkers
- Vector sum benchmark in SVE
- FP_PIPE group for Fujitsu A64FX
- Maximal number of CLI arguments configurable in config.mk (currently 16384)
- Fix for cpulist_sort function
- Fix for Intel SkylakeSP/CascadelakeSP CBOX devices in perf_event mode
- Multiplexing-Fix for perf_event (with warning)
- Adjust CUDA function pointer names in topology_gpu to avoid name clashes
- Fix for Lua 5.1
- Fix for likwid-setFrequency when reading CPU base frequency
Note: This version does not contain any updates for AMD Zen3 and Intel IcelakeSP.
Note: Uncore measurements on Intel Cascadelake AP systems require an update of the topology module which will come in 5.2.0
WARNING: The benchmarks triad_sve* for ARM8 chips use only 3 instead of 4 streams.
- C
Published by TomTheBear about 5 years ago
likwid - likwid-5.1.0
Changelog for version 5.1.0:
- Support for Intel Icelake desktop (Core + Uncore)
- Support for Intel Icelake server (Core only)
- Support for Intel Tigerlake desktop (Core only)
- Support for Intel Cannonlake (Core only)
- Support for Nvidia GPUs with compute capability >= 7.0 (CUpti Profiling API)
- Initial support for Fujitsu A64FX (Core) including SVE assembly benchmarks
- Support for ARM Neoverse N1 (AWS Graviton 2)
- Support for AMD Zen3 (Core + Uncore but without any events)
- Check for Intel HWP
- Fix for TID filter of Skylake SP LLC filter0 register
- Fix for Lua 5.1
- Fix for likwid-mpirun skip masks
- Fortran90 interface for NvMarkerAPI (update)
- CPUisonline check to filter non-usable CPU cores
- Fix for freeMemory in NUMA module (with hwloc backend)
- Fix for likwid-setFrequencies
We want to thank Intel, AMD, AWS and the University of Regensburg for their support.
If you want to use this release in a publication, please cite: https://doi.org/10.5281/zenodo.4282696
- C
Published by TomTheBear over 5 years ago
likwid - likwid-5.0.2
Changelog for 5.0.2:
- Fix memory leak in calc_metric()
- New peakflops benchmarks in likwid-bench
- Fix for NUMA domain handling properly
- Improvements for perf_event backend
- Fix for perfctr and powermeter with perf_event backend
- Fix for likwid-mpirun for SLURM with cpusets
- Fix for likwid-setFrequencies in cpusets
- Update for POWER9 event list
- Updates for AMD Zen, Zen+ and Zen2 (events, groups)
- Fix for Intel Uncore events with same name for different devices
- Fix for file descriptor handling
- Fix for compilation with GCC10
- Remove sleep timer warning
- Update examples C-markerAPI and C-internalMarkerAPI
Note: If you want to use LIKWID 5.0.2 with Lua 5.1, please apply this patch
- C
Published by TomTheBear over 5 years ago
likwid - likwid-5.0.1
I'm happy to announce a new bugfix release of LIKWID 5.
- Some fixes for likwid-mpirun
- Fix for hybrid pinning with multiple hosts
- Fix for perf.groups without core-local events (switch to likwid-pin)
- Fix for command line parser
- For for mpiopts parameter
- Add UPMC as Uncore counter to splitUncoreEvents()
- Expand user-given input to abspath if possible
- Check for at least one executable in user-given command
- Add skip mask for SLURM + Intel OpenMP
- Check if user-given MPI type is available
- Fix for perf_event backend when used as root
- Include likwid-marker.h in likwid.h to not break old MarkerAPI code
- Enable build with ARM HPC compiler (ARMCLANG compiler setting)
- Fix creation of likwid-bench benchmarks on POWER platforms
- Fix for build system in NVIDIAINTERFACE=BUILDAPPDAEMON=true
- Update for executable tester
- Update for MPI+X test (X: OpenMP or Pthreads)
Merry Christmas
- C
Published by TomTheBear over 6 years ago
likwid - likwid-5.0.0
New version LIKWID 5.0.0
Changelog: - Support for ARM architectures. Special support for Marvell Thunder X2 - Support for IBM POWER architectures. Support for POWER8 and POWER9. - Support for AMD Zen2 microarchitecture. - Support for data fabric counters of AMD Zen microarchitecture - Support for Nvidia GPU monitoring (with NvMarkerAPI) - New clock frequency backend (with less overhead) - Generation of benchmarks for likwid-bench on-the-fly from ptt files - Switch back to C-based metric calculator (less overhead) - Interface function to performance groups, create your own. - Integration of GOTCHA for hooking into client application at runtime - Thread-local initialization of streams for likwid-bench - Enhanced support for SLURM with likwid-mpirun - New MPI and Hybrid pinning features for likwid-mpirun - Interface to enable the membind kernel memory policy - JSON output filter file (use -o output.json) - Update of internal HWLOC to 2.1.0
Note: The MarkerAPI Macros have been moved to a separate header "likwid-marker.h"
- C
Published by TomTheBear over 6 years ago
likwid - likwid-4.3.4
New bugfix release:
- Fix for detecting PCI devices if system can split up LLC and memory channels (Intel CoD or SNC)
- Don't pin accessDaemon to threads to avoid long access latencies due to busy hardware thread
- Fix for calculations in likwid-bench if streams are used for input and output
- Fix for LIKWIDMARKERREGISTER with perf_event backend
- Support for Intel Atom (Tremont) (nothing new, same as Intel Atom (Goldmont Plus))
- Workaround for topology detection if LLC and memory channels are split up. Kernel does not detect it properly sometimes. (Intel CoD or SNC)
- Minor updates for build system
- Minor updates for documentation
Notice: If you want to compile likwid-4.3.4 with ACCESSMODE=perf_event, please apply the attached patch before compiling.
- C
Published by TomTheBear about 7 years ago
likwid - likwid-4.3.3
- Fixes for likwid-mpirun
- Fixes for events of Intel Skylake SP and Intel Broadwell
- Support for Intel CascadeLake X (only new eventlist, uses code from Intel Skylake SP)
- Fix for bitmask creation in Lua
- Event options for perf_event backend
- New assembly benchmarks in likwid-bench
- MarkerAPI: Function to reset regions
- Some new performance groups (DIVIDE and TMA)
- Fixes for AMD Zen performance groups
- Fix when using topology input file
- Minor bugfixes
- C
Published by TomTheBear over 7 years ago
likwid - likwid-4.3.2
- Fix in internal metric calculator
- Support for Intel Knights Mill (core, rapl, uncore)
- Intel Skylake X: Some fixes for events and perf. groups
- Set KMPINITAT_FORK to bypass bug in Intel OpenMP memory allocator
- AMD Zen: Use RETIRED_INSTRUCTION instead of fixed-purpose counter for metric calculation
- All FLOPS_* groups now have vectorization ratio
- Fix for MarkerAPI with perf_event backend
- Fix for maximal/minimal uncore frequency
- Skip counters that are already in use, don't exit
- likwid-mpirun: minor fix when overloading a host
- Improved detection of PCI devices
- C
Published by TomTheBear about 8 years ago
likwid - likwid-4.3.1
Minor fixes for Intel Skylake and frequency module
- C
Published by TomTheBear over 8 years ago
likwid - likwid-4.3.0
- Support for Intel Skylake SP architecture (core, uncore, energy)
- Support for AMD Zen architecture (core, l2, energy)
- Support for Intel Goldmont Plus architecture
- Pinning strategy 'balanced'
- New Lua based calculator
- Support for Intel PState CPU frequency daemon
Minor: - Fixed MCDRAM measurements on Intel Xeon Phi (KNL) with perf_event back end
Merry Christmas
- C
Published by TomTheBear over 8 years ago
likwid - likwid-4.2.1
- Fix for logical selection strings
- likwid-agent: general update
- likwid-mpirun: Improved SLURM support
- likwid-mpirun: Print metrics sorted as they are listen in perf. group
- likwid-perfctr: Print metrics/events as header in timeline mode Redirect to file when -o switch is used
- likwid-setFrequency: Commandline options to set min, max and current frequency
- Pinning-Library: Automatically detect and skip shepard threads
- Intel Broadwell: Added support for E3 (like Desktop), Fix for L3 group
- Intel IvyBridge: Fix for PCU fixed-purpose counters
- Intel Skylake: Fix for events CYCLEACTIVITY, new event L2LINES_OUT
- Intel Xeon Phi (KNL): Fix for overflow register, Update for ENERGY group OFFCORE_RESPONSE events are now tile-specific
- Intel SandyBridge: Fix for L3CACHE group
- Event/Counter list contains only usable counters and events
- Fix and warning message for static library builds
- C
Published by TomTheBear almost 9 years ago
likwid - likwid-4.2.0
- Support for Intel Xeon Phi (Knights Landing): Core, Uncore, RAPL
- Support for Uncore counters of some desktop chips (SandyBridge, IvyBridge, Haswell, Broadwell and Skylake)
- Basic support for Linux perf_event interface instead of native access. Currently only core-local counters working, Uncore is experimental
- Support to build against a existing Lua installation (5.1 - 5.3 tested)
- Support for CPU frequency manipulation, Lua interface updated
- Access module checks for LLNL's msr_safe kernel module
- Support for counter registers that are only available when HyperThreading is off
- Socket measurements can be used for all cores on the socket in metric formulas.
The LIKWID team wishes Merry Christmas to everyone.
- C
Published by TomTheBear over 9 years ago
likwid - likwid-4.1.2
- Fix for likwid-powermeter: Use proper energy unit
- Fix for performance groups for Intel Broadwell (D/EP): DATA and FALSE_SHARE
- Reduce number of started access daemons
- Clean Uncore unit local control registers (needed for simultaneous use of LIKWID 3 and 4)
- Clean config, filter and counter registers at
*_finalizefunction - Fix for likwid-features and likwid-perfctr
- C
Published by TomTheBear almost 10 years ago
likwid - likwid-4.1.1
- Fix for Uncore handling for EP/EN/EX systems
- Minor fix for Uncore handling on Intel desktop systems
- Fix in generic readCounters function
- Support for Intel Goldmont (untested)
- Fixes for likwid-mpirun
- C
Published by TomTheBear almost 10 years ago
likwid - likwid-4.1.0
- Support for Intel Skylake (Core + Uncore)
- Support for Intel Broadwell (Core + Uncore)
- Support for Intel Broadwell D (Core + Uncore)
- Support for Intel Broadwell EP/EN/EX (Core + Uncore)
- Support for Intel Airmont (Core)
- Uncore support for Intel SandyBridge, IvyBridge and Haswell
- Performance group and event set handling in library
- Internal calculator for derived metrics
- Improvement of Marker API
- Get results/metrics of last measurement cycle
- Fixed most memory leaks
- Respect 'Intel PMU sharing guide'
- Update of internal Lua to 5.3
- More examples (C++11 threads,Cilk+, TBB)
- Test suite for executables and library
- Accuracy checker supports multiple CPUs
- Security checked access daemon
- Likwid-bench supports Integer benchmarks
- Likwid-bench selects interation count automatically
- Likwid-bench has new FMA related benchmarks
- Likwid-mpirun supports SLURM job scheduler
- Reintroduced tool likwid-features
- C
Published by TomTheBear about 10 years ago
likwid - likwid-4.0.1
- likwid-bench: Iteration determination is done serially
- likwid-bench: Manual selection of iterations possible
- likwid-perfctr: Set cpuset to all CPUs not only the first
- likwid-pin: Set cpuset to all CPUs not only the first
- likwid-accuracy.py: Enhanced plotting functions, use only instrumented likwid-bench
- likwid-accessD: Check for allowed register for PCI accesses
- Add models HASWELLM1 (0x45) and HASWELLM2 (0x46) to likwid-powermeter and likwid-accessD
- New test application using Cilk and Marker API
- New test application using C++11 threads and Marker API
- likwid-agent: gmetric version check for --group option and s/\s*/_/ in metric names
- likwid-powermeter: Print RAPL domain name
- Marker API: Initialize access already at likwid_markerInit()
- Marker API: likwid_markerThreadInit() only pins if not already pinned
- C
Published by TomTheBear almost 11 years ago
likwid - LIKWID Version 4.0.0
After about one year of development, We are happy to announce the newest version of LIKWID.
The features of LIKWID 4.0.0: - Support for Intel Broadwell - Uncore support for all Uncore-aware architectures - Nehalem (EX) - Westmere (EX) - SandyBridge EP - IvyBridge EP - Haswell EP - Measure multiple event sets in a round-robin fashion (no multiplexing!) - Event options to filter the counter increments - Whole LIKWID functionality is exposed as API for C/C++ and Lua - New functions in the Marker API to switch event sets and get intermediate results - Topology code relies on hwloc. CPUID is still included but only as fallback - Most LIKWID applications are written in Lua (only exception likwid-bench) - Monitoring daemon likwid-agent with multiple output backends - More performance groups
- C
Published by TomTheBear almost 11 years ago