Recent Releases of umpire

umpire - v2025.03.1

New Features

  • Added DeviceIpcAllocator strategy to allow interprocess device shared memory.

- C++
Published by davidbeckingsale 8 months ago

umpire - v2025.03.0

New Features

  • Added a ResourceAwarePool which allows memory from the same pool to be used (1) across multiple device streams and (2) in a single memory space environment without the potential to cause data races.
  • Added MPI3 Shared Memory Allocators, which use MPI3 capabilities for Shared Memory.
  • Added a NamingShim strategy to allow users to allocate IPC shared memory without providing a name.

Bug Fixes

  • Fixed a minor memory leak when using IPC Shared Memory Allocators.

Improvements

  • A get_total_bytes_allocated function was implemented, which returns the total number of bytes allocated with Umpire allocators.
  • The NamedAllocationStrategy can now be used with IPC Shared Memory Allocators.
  • Additional documentation for Shared Memory Allocators was created and reorganized.
  • Additional documentation on requirements for Windows builds was added to the cmake.

- C++
Published by kab163 11 months ago

umpire - v2024.07.0

Changes Impacting Builds

This release of Umpire contains new build requirements including:

  • CMake version 3.23 or later is required.
  • Camp version v2024.07.0 or later is required.

Bug Fixes

  • Umpire uses Fortran_FORMAT to avoid compilation errors when using LLVM flang.

Improvements

  • SYCL and Intel builds were added to Umpire's Dockerfile, in addition to other builds in the GitHub workflow, for better testing.

- C++
Published by kab163 over 1 year ago

umpire - v2024.02.1

Bug Fixes

  • Make usage of fmt header-only by default

- C++
Published by davidbeckingsale almost 2 years ago

umpire - v2024.02.0

New Features

  • Allow using external fmt

Improvements

  • Change default heuristic to percent_releasable_hwm(100)
  • Making umpire_device a separate library
  • Use GNUInstallDirs to set installation directories

Bug Fixes

  • Make allocate API threadsafe
  • Unroll product loop to fix overflow in Fortran wrappers
  • Fix typo in HostSharedMemoryResourceImpl causing infinite loop

- C++
Published by davidbeckingsale almost 2 years ago

umpire - v2023.06.0

New Features

  • Add _hwm variant of pool heuristic functions. These reallocate to the high watermark value after a coalescing, and can reduce memory overhead.
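
The difference between a plain heuristic and its _hwm variant can be illustrated with a toy model. This is only a sketch, not Umpire's implementation: PoolStats, threshold_reached, and the two *_sketch factories are hypothetical names, and the rule that a heuristic returns 0 for "do not coalesce" or otherwise the size to allocate is taken from the v6.0.0 notes further down.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical snapshot of a pool's bookkeeping; not an Umpire type.
struct PoolStats {
  std::size_t actual_size;      // bytes the pool currently holds
  std::size_t releasable_size;  // bytes held but not backing any allocation
  std::size_t high_watermark;   // largest actual_size observed so far
};

// A heuristic returns 0 for "do not coalesce", otherwise the size to allocate.
using CoalesceHeuristic = std::function<std::size_t(const PoolStats&)>;

// True when at least `percentage` percent of the pool is releasable.
inline bool threshold_reached(const PoolStats& pool, int percentage)
{
  if (pool.actual_size == 0) return false;
  return pool.releasable_size * 100 >=
         pool.actual_size * static_cast<std::size_t>(percentage);
}

// Plain variant (sketch): coalesce back to the current pool size.
inline CoalesceHeuristic percent_releasable_sketch(int percentage)
{
  assert(percentage >= 1 && percentage <= 100);
  return [percentage](const PoolStats& pool) -> std::size_t {
    return threshold_reached(pool, percentage) ? pool.actual_size : 0;
  };
}

// _hwm variant (sketch): coalesce and reallocate to the high watermark.
inline CoalesceHeuristic percent_releasable_hwm_sketch(int percentage)
{
  assert(percentage >= 1 && percentage <= 100);
  return [percentage](const PoolStats& pool) -> std::size_t {
    return threshold_reached(pool, percentage) ? pool.high_watermark : 0;
  };
}
```

For a fully idle pool of 1000 bytes whose high watermark is 4000, the plain sketch asks for 1000 bytes while the _hwm sketch asks for 4000, so later growth up to the previous peak needs no further allocations.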

Improvements

  • Support 2023 OneAPI release.

  • Add HIP support to the DeviceAllocator.

Bug Fixes

  • Prefix fmt macros to avoid conflicts with other libraries including fmt.

  • Replace deprecated random_shuffle usage with shuffle for C++17.

  • Make exported include directories relative to install prefix.
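
The random_shuffle replacement follows a standard pattern: std::random_shuffle was removed in C++17, and std::shuffle takes an explicit random number generator instead. A minimal sketch of that migration (the helper name shuffled_indices and the seed parameter are illustrative, not taken from Umpire's sources):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns the indices 0..n-1 in a pseudo-random order.
// The explicit seed is an illustrative choice that makes runs reproducible.
std::vector<int> shuffled_indices(int n, unsigned seed)
{
  assert(n >= 0);
  std::vector<int> indices(static_cast<std::size_t>(n));
  std::iota(indices.begin(), indices.end(), 0);

  // Before: std::random_shuffle(indices.begin(), indices.end());
  // After: pass a uniform random bit generator explicitly.
  std::mt19937 generator{seed};
  std::shuffle(indices.begin(), indices.end(), generator);
  return indices;
}
```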

- C++
Published by davidbeckingsale over 2 years ago

umpire - v2022.10.0

New Features

  • New HIP Advise operations have been added for setting and unsetting the READ_MOSTLY, PREFERRED_LOCATION, and ACCESSED_BY advice. For HIP versions >= 5, operations to set and unset COARSE_GRAIN have also been added.

  • getCurrentSize and getTotalSize methods were added to the DeviceAllocator API.

  • New event tracking has been added that can stream events to JSON (for replays) and SQLITE.

  • UMAP allocation resource has been added.

Improvements

  • Using try_get for async device operations.

Bug Fixes

  • Fixed build problem for Fortran builds by properly installing the Umpire module files.

  • Fixed build problem on certain configurations where the std::filesystem check was returning incorrect results.

  • Instead of throwing an exception, the is_device_allocator helper function now returns false if the DeviceAllocator object has not yet been instantiated.

  • Fixed FixedMallocPool on Windows, allowing the QuickPool allocator to be safely copied.

Build Configuration Updates

  • Add UMPIRE_DISABLE_ALLOCATIONMAP_DEBUG (default is OFF) option that allows users to disable the AllocationMap from dumping all records when an allocation is not found in a debug build.

  • Using UMPIRE_ENABLE_MPI instead of ENABLE_MPI for Umpire-specific MPI capabilities

  • Using UMPIRE_ENABLE_IPC_SHARED_MEMORY instead of ENABLE_IPC_SHARED_MEMORY to indicate this is an Umpire-specific implementation feature.

- C++
Published by mcfadden8 over 3 years ago

umpire - v2022.03.1

This is a patch release of v2022.03 that fixes reported build errors by setting UMPIRE_ENABLE_DOCS back to OFF by default, since building the documentation requires additional tools.

- C++
Published by mcfadden8 almost 4 years ago

umpire - v2022.03.0

Changes Impacting Builds

This release of Umpire contains new build requirements including:

  • C++14 is now required to build Umpire
  • CMake version 3.14 or later is required
  • The install location for umpire-config.cmake has changed from $(UMPIRE_INSTALL)/share/umpire/cmake to $(UMPIRE_INSTALL)/lib/cmake/umpire.

Changes Impacting C/Fortran

  • The CMake object library for C/FORTRAN interface has been reorganized. (NOTE: This is a breaking change since the include paths are now different.)
  • The C/FORTRAN interface header files have moved from umpire/interface/ to umpire/interface/c_fortran/ so including files will need to be updated in order to find them.

New Interfaces

  • Added a getDeviceAllocator function that allows users to get a DeviceAllocator object from the kernel without explicitly passing the allocator to the kernel first.
  • Added a reset function to the DeviceAllocator so that old data can be rewritten.
  • Expose PREFETCH operations registered with the MemoryOperationRegistry with a new ResourceManager::prefetch method.

Removed Interfaces

The following functions previously marked as deprecated have now been removed:

  • DynamicPoolMap and DynamicPool aliases removed
  • registerAllocator and isAllocatorRegistered removed

Fixes

  • Fixed a cmake install config issue so that now users can find a package of Umpire with a version constraint.
  • Fix ResourceManager::isAllocator to work for resources
  • Fix comparison operators for TypedAllocators
  • Fix host and device Allocator ID overlap
  • Remove null and zero-byte pool from list of valid allocators

New Configuration Options

  • The UMPIRE_ENABLE_DEVICE_ALLOCATOR option was added to control whether or not the DeviceAllocator class is included in the library. The default is "Off".

Build/Deployment Improvements

  • C/FORTRAN API is now auto generated
  • The umpire-config.cmake package is now relocatable
  • Use blt namespace for hip targets
  • Umpire CMakeList options now have UMPIRE_ prefixes and are now dependent upon corresponding BLT options.
  • Removed hardcoded -Xcompiler -mno-float128 for GCC 8+ with CUDA on PowerPC.
  • Build Doxygen documentation on ReadTheDocs.

Continuous Integration Updates

  • Add CI job with interprocess shared memory and CUDA
  • Add CI containers to allow for gcc{7,8,9}, clang{11,12}, and nvcc{10,11}
  • Add CI to check pools work with DEVICE_CONST memory

- C++
Published by mcfadden8 almost 4 years ago

umpire - v6.0.0

Added documentation on allocator (in)accessibility as well as getAllocator usage.

Added a Release function to FixedPool and corresponding gtest in strategy_tests

Installed thirdparty exports in CMake configuration file

Replay will now display high water mark statistics per allocator.

Initial support for IPC Shared Memory via a "SHARED" resource allocator. IPC Shared memory is initially available on the Host resource and will default to the value of ENABLE_MPI.

Added get_communicator_for_allocator to get an MPI Communicator for the scope of a shared allocator.

Added Allocator::getStrategyName() to get name of the strategy used.

Added getActualHighwatermark to all pool strategies, returns the high water value of getActualSize.

Added umpire::mark_event() to mark an event during Umpire lifecycle

Added asynchronous memset and reallocate operations for CUDA and HIP.

Added support for named allocations.

DynamicPoolMap marked deprecated. QuickPool should be used instead.

Refactored pool coalesce heuristic API to return either 0 or the minimum pool size to allocate when a coalesce is to be performed. No functional change yet.

All asynchronous operations now return a camp::resources::EventProxy to avoid the overhead of creating Events when they are unused.

Removed all internal tracking, allocations are only tracked at the Allocator level.

- C++
Published by kab163 over 4 years ago

umpire - v5.0.1 Release

  • Fixed bug where zero-byte allocations from Umpire were sometimes incorrectly reported as not being Umpire allocations

- C++
Published by mcfadden8 almost 5 years ago

umpire - v5.0.0 - Sleepless in Umpire

  • Memory Resource header and source files for HIP.

  • Unified Memory support for HIP, including testing and benchmarking (temp support for Fortran).

  • Added a getParent functionality for retrieving the memory resource of an allocator.

  • Added an allocator accessibility functionality for checking if an allocator is accessible given a certain platform.

  • Changed enumeration names from all upper case to all lower case in order to avoid name collisions.

  • Fixed up broken source links in tutorial documentation.

  • registerAllocator is deprecated, addAlias should be used instead.

  • Moved backend-specific resource code out of ResourceManager and into resource::MemoryResourceRegistry.

  • Fixed accounting for number of releasable bytes in Quickpool that was causing coalesce operations to not work properly.

- C++
Published by kab163 over 5 years ago

umpire - v4.1.2 Release

  • Added workaround for incorrect nvcc compiler warning: "warning: missing return statement at end of non-void function" occurring in one of Umpire's header files.

- C++
Published by mcfadden8 over 5 years ago

umpire - v4.1.1 Release

  • Fixed DynamicPoolMap deallocate to make coalesce check O(1) again.

  • Initialize m_default_allocator to HOST if not set explicitly.

- C++
Published by mcfadden8 over 5 years ago

umpire - v4.1.0

  • QuickPool available via the C & Fortran APIs.

  • Resources are now created on-demand when accessed for the first time.

  • Peer access is no longer automatically enabled for CUDA and HIP.

  • Added CMake check to determine whether the build subsystem is capable of ASAN.

  • Fixed ASAN poisoning to limit it to what user originally requested and not rounded amount.

  • Improved resilience of primary pool destructors: giving back previously allocated blocks to a device that has already been cleaned up no longer throws an error, but is instead logged and ignored.

- C++
Published by mcfadden8 over 5 years ago

umpire - Release v4.0.1

  • Fixed Umpire builds with MPI enabled

  • Added missing wrapUmpire.hpp to installation directory

- C++
Published by mcfadden8 over 5 years ago

umpire - v4.0.0

This release is not ABI compatible with 3.x releases, hence the major version number bump. It includes a number of new features and bug fixes, including:

  • Added a FILE memory resource that allocates memory using mmap'd files. This can be used to allocate memory from the burst buffers on machines like Sierra and Lassen.
  • All pools now have an "alignment" parameter that can be provided to the constructor.
  • MemoryResourceTraits now includes a resource member that can be used to identify the underlying resource for any Allocator.
  • Bundled tpl cxxopts has been replaced by CLI11 (only used when ENABLE_TOOLS=On)
  • Fixed memory leaks in DynamicPoolList, QuickPool.
  • Fixed reallocate operation when called on an allocation from a pool.

Please download the umpire-4.0.0.tar.gz file below, rather than the files generated automatically by GitHub, as these do not include all the necessary submodule code.

- C++
Published by davidbeckingsale over 5 years ago

umpire - v3.0.0

This release is not ABI compatible with 2.x releases, hence the major version number bump. It includes a number of new features and bug fixes, including:

  • Added support for multiple GPU devices, detected and registered as "DEVICE_N" where N is the device number.
  • Added support for capturing function backtraces with allocations.
  • Added AlignedAllocator to provide aligned allocations for host memory.
  • Fixed builds using -stdlib=c++
  • Switched to camp::Platform: Platform::cpu is now Platform::host

Please download the umpire-3.0.0.tar.gz file below, rather than the files generated automatically by GitHub, as these do not include all the necessary submodule code.

- C++
Published by davidbeckingsale over 5 years ago

umpire - v2.1.0

This minor release fixes a bug when calling reallocate with size 0. Additionally, the replay tool now supports replaying reallocate operations.

Please download the umpire-2.1.0.tar.gz file below, rather than the files generated automatically by GitHub, as these do not include all the necessary submodule code.

- C++
Published by davidbeckingsale about 6 years ago

umpire - v2.0.0

This release contains a number of changes and bugfixes, as well as some API changes that have resulted in the major version number bump:

  • ENABLE_DEVICE_CONST CMake option to control whether device constant memory is enabled. It is now disabled by default.
  • DeviceAllocator that provides a pool for allocations inside GPU kernels.
  • Added "unset" operations for removing CUDA memory advice.
  • Extended C/Fortran API with more allocation strategies.
  • NamedAllocator that allows creating a new allocator that passes allocations through to underlying strategy
  • UMPIRE_VERSION_X are now defined as macros, rather than constexpr variables
  • Fixed reallocate to properly handle case where size == 0
  • AllocationStrategy constructor parameters re-ordered for consistency
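
One practical consequence of exposing the version as macros is that the preprocessor can branch on it, which constexpr variables do not allow. The sketch below assumes the macros follow the UMPIRE_VERSION_X pattern as UMPIRE_VERSION_MAJOR, UMPIRE_VERSION_MINOR, and UMPIRE_VERSION_PATCH, and defines stand-in values so it compiles without Umpire installed; has_v2_api and umpire_version_number are hypothetical helpers.

```cpp
#include <cassert>

// Stand-in definitions so this sketch compiles on its own. In a real build
// these would come from Umpire's headers; the exact names are an assumption
// based on the UMPIRE_VERSION_X pattern.
#ifndef UMPIRE_VERSION_MAJOR
#define UMPIRE_VERSION_MAJOR 2
#define UMPIRE_VERSION_MINOR 0
#define UMPIRE_VERSION_PATCH 0
#endif

// Macros can be tested by the preprocessor, so whole code paths can be
// selected (or left out) depending on the Umpire version.
#if UMPIRE_VERSION_MAJOR >= 2
constexpr bool has_v2_api = true;
#else
constexpr bool has_v2_api = false;
#endif

// The macros remain usable as ordinary compile-time constants too.
constexpr int umpire_version_number()
{
  return UMPIRE_VERSION_MAJOR * 10000 + UMPIRE_VERSION_MINOR * 100 +
         UMPIRE_VERSION_PATCH;
}

static_assert(umpire_version_number() >= 0, "version must be non-negative");
```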

Please download the umpire-2.0.0.tar.gz file below, rather than the files generated automatically by GitHub, as these do not include all the necessary submodule code.

- C++
Published by davidbeckingsale about 6 years ago

umpire - v1.1.0

This release contains the following major changes:

  • Added symbol umpire_ver_1_detected to help detect version mismatches when linking multiple libraries that all use Umpire.

  • Re-introduced pool algorithm used in pre-1.0.0 releases as DynamicPoolList, and renamed current strategy to DynamicPoolMap. DynamicPool is now an alias to DynamicPoolMap.

  • Fix signature of C function umpire_resourcemanager_make_allocator_pool to take size_t not int.

Please download the umpire-1.1.0.tar.gz file, rather than the files generated automatically by GitHub, as these do not include all the necessary submodule code.

- C++
Published by davidbeckingsale over 6 years ago

umpire - v1.0.1

This release fixes a bug in the DynamicPool where blocks would be leaked if a minimum size block was allocated for a smaller allocation in certain circumstances.

- C++
Published by davidbeckingsale over 6 years ago

umpire - v1.0.0

The most visible changes in this release are:

  • Umpire is MPI-aware (outputs rank information to logs and replays) when configured with the option ENABLE_MPI=On, and umpire::initialize(MPI_Comm comm) must be called.

  • AllocationStrategies may be wrapped with multiple extra layers. To "unwrap" an Allocator to a specific strategy, the umpire::util::unwrap_allocator method can be used, for example:

auto dynamic_pool = umpire::util::unwrap_allocator<umpire::strategy::DynamicPool>(allocator);

This will impact users who have been using DynamicPool::coalesce. The cookbook recipe has been updated accordingly, and the previous code snippet can be used.

  • Umpire now directs log and replay output to files, one per process. The filenames can be controlled by the environment variable UMPIRE_OUTPUT_BASENAME.

  • ENABLE_CUDA now set to Off by default.

  • Allocations for 0 bytes now always return a valid pointer that cannot be read or written. These pointers can be deallocated.

Please see CHANGELOG.md for the complete set of changes.

- C++
Published by davidbeckingsale over 6 years ago

umpire - v0.3.5

Fixed

  • Off by one regression introduced in 0.3.4 in AllocationRecord::AllocationMap::findRecord causing it to incorrectly report offset of ptr+size_of_allocation as found.
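
The boundary condition behind this fix, and behind the zero-byte fix in v0.3.4, can be sketched with a half-open range lookup. This is a simplified stand-in built on std::map, not Umpire's Judy-array-based AllocationMap; Record and RecordMap are hypothetical names, and addresses are modelled as integers to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical allocation record; addresses are modelled as integers.
struct Record {
  std::uintptr_t base;
  std::size_t size;
};

class RecordMap {
 public:
  void insert(std::uintptr_t base, std::size_t size)
  {
    records_[base] = Record{base, size};
  }

  // Returns the record whose range contains addr, or nullptr if none does.
  const Record* find(std::uintptr_t addr) const
  {
    auto it = records_.upper_bound(addr);  // first record with base > addr
    if (it == records_.begin()) return nullptr;
    --it;                                  // last record with base <= addr
    const Record& record = it->second;
    const std::uintptr_t offset = addr - record.base;
    // Half-open range [base, base + size): base + size itself is NOT inside.
    // A zero-byte record still matches its own base address.
    if (offset < record.size || offset == 0) return &record;
    return nullptr;
  }

 private:
  std::map<std::uintptr_t, Record> records_;
};
```

With a 16-byte record at address 1000, find(1015) succeeds and find(1016) returns nullptr, which is the behaviour the off-by-one regression broke.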

- C++
Published by mcfadden8 over 6 years ago

umpire - v0.3.4

Fixed

  • Bug in AllocationMap::findRecord causing it to miss finding allocations that were zero bytes in length.

- C++
Published by davidbeckingsale over 6 years ago

umpire - v0.3.3

Added

  • NUMA strategy (umpire::strategy::NumaPolicy) that allows allocating memory on specific NUMA nodes.

  • Implemented << for Allocator, so that it can be printed directly.

  • Update getAllocator methods to print list of available Allocators if the requested Allocator cannot be found.

  • Replay now captures coalesce operations from strategy::DynamicPool so that these can be replayed.

  • The replay tool can produce an output file that can be used to verify the replayed events are correct.

  • Cookbook example for creating a pool in pinned memory using FORTRAN.

  • GitHub workflow to check for CHANGELOG updates.

  • Ability to print allocation records that only match a predicate, print_allocator_records() to get all records from a specific allocator, and a cookbook recipe to do that.

  • Dockerfile for multi-stage builds. Supports building Umpire with GCC, Clang, and CUDA

  • GitHub action to run Clang static analysis.

  • Replay now includes unique replay ID of the logging process to help distinguish processes in a multi-process run.

  • Umpire replay now takes a "--help" option and displays usage information.

Changed

  • Umpire now builds as a single library, libumpire.a or libumpire.so, rather than having one library per source subdirectory.

  • Removed shared_ptr usage entirely. Ownership of objects was never "shared" and the smart pointers added unnecessary overhead.

  • Moved CHANGELOG to CHANGELOG.md.

Removed

  • The 'coalesce' method was removed from ResourceManager and now must be accessed directly. examples/cookbook/recipe_coalesce_pool.cpp shows how to do this.

Fixed

  • Bug in ResourceManager::copy/memset when given a pointer offset into an allocation.

  • Memory leak in judyL2Array.

  • While replay already was recording release operations, the tool was not actually replaying them. A fix was implemented so that the replay tool will now also replay any captured release operations.

  • make docs used to fail, because the build was set up for Read the Docs. A fix was implemented so Doxygen and Sphinx can be run locally, for instance to test out new cookbooks.

  • REPLAY previously recorded some operations with multiple print statements causing REPLAY output from MPI runs to become mixed between multiple ranks. REPLAY has been modified to output each operation onto a single line.

- C++
Published by davidbeckingsale almost 7 years ago

umpire - v0.3.2

This release contains bug fixes, an updated C interface, and some improvements to Umpire's replay capability:

  • Fixed bug in Judy where an allocation may not have been correctly found.
  • Add new functions to create Allocators to the C interface, and ensure all C interface files are correctly installed
  • Fixed bugs in replay where some allocations were skipped, and added new tool to only replay AllocationMap operations
  • Fix AllocationMap::find where it would incorrectly return a record when the allocation should not have been found

Please download the umpire-0.3.2.tar.gz file, rather than the automatically generated files. These do not include all the necessary submodule code.

- C++
Published by davidbeckingsale almost 7 years ago

umpire - v0.3.1

This release contains some improvements and fixes:

  • Added a "cookbook" of examples on using Umpire in complex situations
  • Allow users to provide a heuristic to modify pool behavior, determining when unused blocks will be coalesced
  • Modify the CudaAdvice* operations to take a specific device id
  • Improve error message when running a CUDA-enabled version of Umpire on a machine without GPUs.

- C++
Published by davidbeckingsale about 7 years ago

umpire - v0.3.0

This release contains some new features and performance improvements.

  • Umpire now supports AMD GPUs. The option ENABLE_ROCM will add support for AMD GPUs on systems that have the ROCm software stack and hcc compiler installed.
  • The C/Fortran API has been updated and expanded. Further support will be coming in the next release.
  • DynamicPool has a couple of performance and usability improvements: reserved but unused memory can be returned to the system using the release() function; and we have added a ResourceManager::coalesce(Allocator allocator) method that can merge unused blocks into a single allocation that will improve performance and reduce memory overhead. The coalesce method is automatically called when a DynamicPool has no active allocations.

- C++
Published by davidbeckingsale about 7 years ago

umpire - 0.2.4

This release contains several new features, bug fixes, and enhancements!

New features include:

  • Support for allocating "constant" memory on CUDA GPUs
  • A way of toggling introspection, to improve performance for allocators where you don't need tracking

Fixes and enhancements are:

  • Ability to deallocate nullptr (it's a no-op)
  • Fixed a bug when compiling with clang that would cause problems in the AllocationMap
  • Ensure all classes with virtual functions have virtual destructors

- C++
Published by davidbeckingsale over 7 years ago

umpire - v0.2.3

This release adds initial thread-safety to data structures and provides a ThreadSafeAllocator that can be used to make any AllocationStrategy safe when shared across multiple threads.
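
The general idea of wrapping a strategy to make it thread-safe can be sketched as a decorator that takes a lock around every call. The code below is a simplified illustration, not Umpire's ThreadSafeAllocator: CountingStrategy and LockedStrategy are hypothetical classes, and a single std::mutex per wrapper is an assumed design.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>

// Hypothetical strategy whose bookkeeping is not safe to update concurrently.
class CountingStrategy {
 public:
  void* allocate(std::size_t bytes)
  {
    ++live_allocations_;
    return std::malloc(bytes);
  }

  void deallocate(void* ptr)
  {
    assert(live_allocations_ > 0);
    --live_allocations_;
    std::free(ptr);
  }

  std::size_t live_allocations() const { return live_allocations_; }

 private:
  std::size_t live_allocations_ = 0;
};

// Wrapper that serializes every call into the wrapped strategy with one mutex.
template <typename Strategy>
class LockedStrategy {
 public:
  explicit LockedStrategy(Strategy& inner) : inner_(inner) {}

  void* allocate(std::size_t bytes)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return inner_.allocate(bytes);
  }

  void deallocate(void* ptr)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    inner_.deallocate(ptr);
  }

 private:
  Strategy& inner_;
  std::mutex mutex_;
};
```

The wrapper only helps if every thread goes through it; calling the inner strategy directly from another thread would bypass the lock.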

It also updates the AllocationAdvisor so that "HOST"-based Allocators can be used as arguments when applying memory advice (such as setting "HOST" as the preferred location for data allocated in unified memory).

- C++
Published by davidbeckingsale over 7 years ago

umpire - v0.2.2

This release adds the ability to get the actual amount of memory allocated by pools, as well as the total amount that was requested by the user. It also adds support for reallocating null pointers.

- C++
Published by davidbeckingsale over 7 years ago

umpire - v0.2.1

This release contains new functionality for applying cudaMemAdvise to allocations, as well as various performance improvements. It also adds a new AllocationStrategy, FixedPool, which can be used to efficiently allocate fixed-size objects. Finally, this release supports collecting statistics about allocations and operations, and outputting these in JSON format.
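
Why a fixed-size pool can be efficient is easiest to see in a toy version: one slab is carved into equal slots, and allocation and deallocation are just a pop and a push on a free list. The sketch below is not Umpire's FixedPool; FixedSlotPool and its interface are hypothetical, and it uses a single non-growing slab for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy fixed-size pool: one slab split into equal slots, recycled through a
// free list. Slots are only suitably aligned if slot_bytes is a multiple of
// the alignment the stored objects need.
class FixedSlotPool {
 public:
  FixedSlotPool(std::size_t slot_bytes, std::size_t slot_count)
      : storage_(slot_bytes * slot_count)
  {
    assert(slot_bytes > 0 && slot_count > 0);
    free_slots_.reserve(slot_count);
    // Push in reverse so the lowest address is handed out first.
    for (std::size_t i = slot_count; i > 0; --i) {
      free_slots_.push_back(storage_.data() + (i - 1) * slot_bytes);
    }
  }

  // Pointers into storage_ would dangle after a copy, so forbid copying.
  FixedSlotPool(const FixedSlotPool&) = delete;
  FixedSlotPool& operator=(const FixedSlotPool&) = delete;

  // O(1): pop a free slot, or return nullptr when the pool is exhausted.
  void* allocate()
  {
    if (free_slots_.empty()) return nullptr;
    unsigned char* slot = free_slots_.back();
    free_slots_.pop_back();
    return slot;
  }

  // O(1): return a slot previously obtained from allocate().
  void deallocate(void* ptr)
  {
    free_slots_.push_back(static_cast<unsigned char*>(ptr));
  }

  std::size_t available() const { return free_slots_.size(); }

 private:
  std::vector<unsigned char> storage_;
  std::vector<unsigned char*> free_slots_;
};
```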

- C++
Published by davidbeckingsale over 7 years ago

umpire - v0.2.0

This release contains some performance improvements related to tracking information about allocations. It also contains a new emplace-style construction for Allocators.

N.B. This change is not backwards compatible with 0.1.x releases.

- C++
Published by davidbeckingsale over 7 years ago

umpire - v0.1.4

This release contains a bugfix for the POOL allocator, related to some of the allocations being unaligned. It also contains a performance fix to reduce the overhead of logging messages to disabled log levels.

- C++
Published by davidbeckingsale almost 8 years ago

umpire - v0.1.3

Initial release of Umpire, supports CPU and GPU (CUDA-based) allocations.

- C++
Published by davidbeckingsale almost 8 years ago