Recent Releases of gridtools

gridtools - GridTools version 2.3.9

changes since 2.3.8

Dependencies

Removed the dependency on boost by packaging relevant headers (#1816, #1817, #1821)

Tests

Update gtest to v1.16.0 (#1823)

- C++
Published by havogt about 1 year ago

gridtools - GridTools version 2.3.8

changes since 2.3.7

Bug fixes

  • addition to fix copy assignment of tuple containing refs for 1-tuples #1812
  • detection of CUDA/HIP mode with CrayClang compiler #1813

- C++
Published by havogt over 1 year ago

gridtools - GridTools version 2.3.7

changes since 2.3.6

Bug fixes

  • fix copy assignment of tuple containing refs #1811
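
To illustrate what copy assignment of a tuple holding references is expected to do, here is a minimal sketch. It uses `std::tuple` rather than the GridTools tuple, which is an assumption made to keep the example self-contained. The function name `assign_through_refs` is hypothetical.

```cpp
#include <cassert>
#include <tuple>

// Illustration with std::tuple (not the GridTools tuple): copy-assigning a
// tuple of references assigns through to the referenced objects; it does not
// rebind the references.
inline int assign_through_refs() {
    int a = 1, b = 2;
    int c = 10, d = 20;
    std::tuple<int &, int &> dst(a, b);
    std::tuple<int &, int &> src(c, d);
    dst = src;  // copies c into a and d into b; dst still refers to a and b
    c = 99;     // later changes to c do not affect a
    return a * 100 + b;
}
```

After the assignment `a` holds 10 and `b` holds 20, so the function returns 1020.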

- C++
Published by havogt over 1 year ago

gridtools - GridTools version 2.3.6

changes since v2.3.5

Minor feature

  • fn: introduce fn::index as alias to positional #1806

Performance improvements

  • Loop Blocking for fn GPU Backend #1787
  • Use ‘const’ in Neighbor Table Value Types #1796
  • Remove ldg_ptr and Replace Functionality by const_ptr_deref, possible improvement for sid::composite #1810
  • Add Const to SID Neighbor Table Element Type #1808

Bug fixes

  • Respect ReadOnly Property in Nanobind Adapter #1809

- C++
Published by havogt over 1 year ago

gridtools - GridTools version 2.3.5

changes since v2.3.4

GridTools v2.3.5 requires CMake 3.21.0 or later to properly support HIP.

Performance improvements

  • Introduce ldg_ptr to Enable __ldg in Data Stores and simple_ptr_holder #1802
  • Use ‘const’ in Neighbor Table Value Types #1796

Bug fixes

  • Fixes for HIP detection for recent ROCm and CMake #1804
  • Fix GT_ASSUME for NVCC and Enable GT_ASSUME on Recent GCC Versions #1789
  • Fix include #1793
  • Improve tests #1794, #1799, #1798

- C++
Published by havogt over 1 year ago

gridtools - GridTools version 2.3.4

changes since v2.3.2 (v2.3.3 removed because it introduced a breaking change in the nanobind adapter)

Performance improvements

  • Introduce GT_PROMISE for __builtin_assume by @havogt, @iomaganaris in https://github.com/GridTools/gridtools/pull/1785, https://github.com/GridTools/gridtools/pull/1788
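
As a rough sketch of the idea behind such a macro: an "assume" hint tells the optimizer that a condition always holds, and behavior is undefined if it does not. The macro name `SKETCH_ASSUME` and the function `sum_first` are hypothetical. This is not the GridTools definition, only one way the compiler dispatch could look.

```cpp
#include <cassert>

// Hypothetical macro, not the GridTools implementation.
#if defined(__clang__)
// Clang provides __builtin_assume directly.
#define SKETCH_ASSUME(cond) __builtin_assume(cond)
#elif defined(__GNUC__)
// On GCC, an unreachable branch conveys the same information.
#define SKETCH_ASSUME(cond)                   \
    do {                                      \
        if (!(cond)) __builtin_unreachable(); \
    } while (false)
#else
// Fallback: no hint at all.
#define SKETCH_ASSUME(cond) ((void)0)
#endif

// The caller promises n >= 0; violating the promise is undefined behavior.
inline int sum_first(const int *data, int n) {
    SKETCH_ASSUME(n >= 0);
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += data[i];
    return sum;
}
```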

Bug fixes

  • Bug: storage/gpu.h functions within __CUDA_ARCH__ by @havogt in https://github.com/GridTools/gridtools/pull/1778
  • Update nanobind to v2 by @havogt in https://github.com/GridTools/gridtools/pull/1777, https://github.com/GridTools/gridtools/pull/1790
  • Update minimum required boost to 1.73 by @havogt in https://github.com/GridTools/gridtools/pull/1772

Tests

  • Add Missing fn_unstructured_nabla_fused_tuple_of_fields to Regression & Performance Tests by @fthaler in https://github.com/GridTools/gridtools/pull/1783
  • Improved Data Layout for Neighbor Tables by @fthaler in https://github.com/GridTools/gridtools/pull/1782

CI / Deployment

  • test NVHPC 23.9 by @havogt in https://github.com/GridTools/gridtools/pull/1769
  • build: Update deployment action with trusted publisher by @havogt in https://github.com/GridTools/gridtools/pull/1770
  • build: Add download wheel step in deployment action by @havogt in https://github.com/GridTools/gridtools/pull/1771
  • CI: test CUDA 12.4 by @havogt in https://github.com/GridTools/gridtools/pull/1773
  • GitHub actions: Update compiler versions in configure test by @havogt in https://github.com/GridTools/gridtools/pull/1760
  • Move to CUDA 11.2 for daint nvcc gcc by @havogt in https://github.com/GridTools/gridtools/pull/1786

- C++
Published by havogt almost 2 years ago

gridtools - GridTools version 2.3.3

Performance improvements

  • Introduce GT_PROMISE for __builtin_assume by @havogt in https://github.com/GridTools/gridtools/pull/1785

Bug fixes

  • Bug: storage/gpu.h functions within __CUDA_ARCH__ by @havogt in https://github.com/GridTools/gridtools/pull/1778
  • Update nanobind to v2 by @havogt in https://github.com/GridTools/gridtools/pull/1777
  • Update minimum required boost to 1.73 by @havogt in https://github.com/GridTools/gridtools/pull/1772

Tests

  • Add Missing fn_unstructured_nabla_fused_tuple_of_fields to Regression & Performance Tests by @fthaler in https://github.com/GridTools/gridtools/pull/1783
  • Improved Data Layout for Neighbor Tables by @fthaler in https://github.com/GridTools/gridtools/pull/1782

CI / Deployment

  • test NVHPC 23.9 by @havogt in https://github.com/GridTools/gridtools/pull/1769
  • build: Update deployment action with trusted publisher by @havogt in https://github.com/GridTools/gridtools/pull/1770
  • build: Add download wheel step in deployment action by @havogt in https://github.com/GridTools/gridtools/pull/1771
  • CI: test CUDA 12.4 by @havogt in https://github.com/GridTools/gridtools/pull/1773
  • GitHub actions: Update compiler versions in configure test by @havogt in https://github.com/GridTools/gridtools/pull/1760
  • Move to CUDA 11.2 for daint nvcc gcc by @havogt in https://github.com/GridTools/gridtools/pull/1786

- C++
Published by havogt almost 2 years ago

gridtools - GridTools version 2.3.2

Bug fixes

  • Apply workaround to CUDA 12.3, see #1766 (#1768)

- C++
Published by havogt over 2 years ago

gridtools - GridTools version 2.3.1

Python bindings

  • Python SID adapter: Add support for HIP/ROCm buffers (#1759)
  • Python bindings: Nanobind SID adapter (#1762)

Bug fixes

  • Partial workarounds for CUDA 12.1 and 12.2, see #1766 (#1764)
  • Fix GCC 13: add missing include (#1761)

- C++
Published by havogt over 2 years ago

gridtools - GridTools version 2.3.0

Support for NVHPC (#1747)

GridTools now supports NVHPC starting from release 23.3!

Parallel fn::backend::naive (#1746)

Naive (just parallel for, no blocking and other optimizations) OpenMP parallelization of the naive backend.

SID util to transform a dimension to a tuple_like element type (#1750)

Translates a SID with dimension D and element type T into a SID with D removed and a tuple-like element type holding T, with tuple_size N, via `sid::dimension_to_tuple_like(sid)`.

Bug fixes and smaller features

  • fn: allow execution of stencils with 0d domain (#1728)
  • Make pybind11::buffer sid copyable (#1755)

Build fixes

  • Support for Clang 16 (#1751)

and other changes already included in v2.2.3

fn: SID neighbor table wrapper (#1730)

Adds a simple class that wraps a SID and implements the neighbour table concept. (Picked for convenience into 2.2.2.)

Support for Python packaging (#1720)

Starting with this release we will publish GridTools C++ on pypi.org to make it easier to consume GridTools C++ from GT4Py.

Bug fixes

  • Fix CUDA 12.0 compilation (#1741)
  • Improvements to Python packaging (#1742, #1743, #1744)
  • Fix get_keys of empty hymap (#1728)
  • fn: CUDA early exit on empty grid - an empty domain skips execution instead of erroring (#1729)
  • fn: prefer qualified names over ADL for fn builtins (they are not customization points for the user) (#1731, #1732)
  • Enable workarounds for CUDA 11.8 (#1734)
  • Enable workarounds for Clang 15 (#1735)
  • Update pybind11 version to fix wrong C++ standard (#1723)
  • Fix perfect forwarding in sid::composite::make_values (#1722)
  • Workaround for NVCC bug in gcl (present in 11.6, 11.7 and most likely in 11.8) (#1726)

Performance fixes

  • Alternative skip value check in fn, which improves CUDA performance (#1721)

Build fixes

  • Fix perftests CMake target when no tests are added (#1724)

Cleanup

  • Replace boost::variant by std::variant (#1718)

CI

  • Add gcc-11, gcc-12, CUDA 12.0 (#1738, #1739, #1740)

Contributions

This release contains contributions from @DropD, @egparedes, @fthaler, @havogt, @petiaccja

- C++
Published by havogt about 3 years ago

gridtools - GridTools version 2.0.1

Bug fixes

  • Fix: storage_gpu for HIPCC-AMDGPU (#1540)
  • Performance fix for C++17 (#1618)
  • Enable several CUDA workarounds for recent compilers (#1681 and others)
  • Some declarations to definitions
  • Workaround gtest incompatibilities in recent compilers

CI

  • Compile (and run) all tests on GitHub actions (Jenkins doesn't run on v2.0.x anymore)

- C++
Published by havogt about 3 years ago

gridtools - GridTools version 2.2.3

Bug fixes

  • Fix CUDA 12.0 compilation (#1741)
  • Improvements to Python packaging (#1742, #1743, #1744)

CI

  • Add gcc-11, gcc-12, CUDA 12.0 (#1738, #1739, #1740)

- C++
Published by havogt about 3 years ago

gridtools - GridTools version 2.2.2

fn: SID neighbor table wrapper (#1730)

Adds a simple class that wraps a SID and implements the neighbour table concept. (Picked for convenience into 2.2.2.)

Support for Python packaging (#1720)

Starting with this release we will publish GridTools C++ on pypi.org to make it easier to consume GridTools C++ from GT4Py.

Bug fixes

  • Fix get_keys of empty hymap (#1728)
  • fn: CUDA early exit on empty grid - an empty domain skips execution instead of erroring (#1729)
  • fn: prefer qualified names over ADL for fn builtins (they are not customization points for the user) (#1731, #1732)
  • Enable workarounds for CUDA 11.8 (#1734)
  • Enable workarounds for Clang 15 (#1735)

Build fixes

  • Fix perftests CMake target when no tests are added (#1724)

- C++
Published by havogt over 3 years ago

gridtools - GridTools version 2.2.1

Bug fixes

  • Update pybind11 version to fix wrong C++ standard (#1723)
  • Fix perfect forwarding in sid::composite::make_values (#1722)
  • Workaround for NVCC bug in gcl (present in 11.6, 11.7 and most likely in 11.8) (#1726)

Performance fixes

  • Alternative skip value check in fn, which improves CUDA performance (#1721)

Cleanup

  • Replace boost::variant by std::variant (#1718)

- C++
Published by havogt almost 4 years ago

gridtools - GridTools version 2.2.0

C++ standard upgraded to C++17

Starting with this version of GridTools, we require the C++17 standard (#1680) and improved the code base using C++17 features (#1693, #1716, #1697):

  • tuple_util::make is removed
  • GT_CONSTEXPR and GT_CONSTEXPR_TARGET are removed
  • the wstd utilities are removed
  • is_trivially_copy_constructible is now checked consistently instead of is_trivially_copyable wherever data is passed across the host/device boundary, because it is exactly what is needed
  • the make_[smth] pattern is replaced by template argument deduction in several places; the old pattern is deprecated
  • composite is rewritten using C++17
  • overload is rewritten using C++17
  • std::[smth]_v<...> is used instead of std::[smth]<...>::value
  • static_assert(<cond>) is used instead of static_assert(<cond>, "")
  • CTAD for simple_ptr_holder (#1701, #1708)
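
The difference between the two trait checks can be seen in a small standard-library sketch. The type `field_handle` and the two constants are hypothetical and not part of GridTools. The example also uses the `_v` trait form and the message-free `static_assert` mentioned above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type: the copy constructor is defaulted (trivial), but the
// copy assignment operator is user-provided (non-trivial).
struct field_handle {
    double *data;
    explicit field_handle(double *p) : data(p) {}
    field_handle(field_handle const &) = default;
    field_handle &operator=(field_handle const &other) {
        data = other.data;
        return *this;
    }
};

// The relaxed check accepts the type, the stricter one rejects it.
static_assert(std::is_trivially_copy_constructible_v<field_handle>);
static_assert(!std::is_trivially_copyable_v<field_handle>);

constexpr bool passes_relaxed_check = std::is_trivially_copy_constructible_v<field_handle>;
constexpr bool passes_strict_check = std::is_trivially_copyable_v<field_handle>;
```

Passing an object by value only needs its copy constructor, so a type like this can be accepted under the relaxed check even though it is not trivially copyable.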

If you were using functionality from the internal library common, you might have to update your code (all of common is considered internal API, see Release process). The most common change is using CTAD instead of makers where possible. Where that is not possible due to compiler bugs, the maker pattern was updated to be independent of tuple_util::make. For example, replace

  • tuple_util::make<tuple>(...) by tuple(...)
  • tuple_util::make<array>(...) by array(...)
  • sid::composite::make<...>(...) by sid::composite::keys<...>::make_values(...)
  • tuple_util::make<hymap::keys<...>::values>(...) by hymap::keys<...>::make_values(...);
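
For readers unfamiliar with CTAD (class template argument deduction), here is a minimal sketch of the same migration idea using `std::tuple` and `std::array` as stand-ins. The GridTools tuple and array types are assumed to behave analogously, and the function name `ctad_demo` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <tuple>
#include <type_traits>

// With C++17 CTAD the template arguments are deduced from the constructor
// arguments, so no make_* helper is needed.
inline int ctad_demo() {
    std::tuple t(1, 2.5, 'x');  // deduced as std::tuple<int, double, char>
    std::array a{10, 20, 30};   // deduced as std::array<int, 3>
    static_assert(std::is_same_v<decltype(t), std::tuple<int, double, char>>);
    static_assert(std::is_same_v<decltype(a), std::array<int, 3>>);
    return std::get<0>(t) + a[2];
}
```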

New library fn: functional model backend

The fn library provides functionality for the Declarative GT4Py to implement a backend for the functional model. It supports (naive, no-blocking) CPU and (efficient) GPU (CUDA) execution for structured (Cartesian) and unstructured grids. See examples in tests/regression/fn/. The library provides a high-level, human-readable frontend, but is mainly meant as a target for code generators.

  • Introduce functional model backend (#1648, #1666, #1679)
  • Implements fn::extents (#1683)
  • Column Stage (#1685)
  • New Backend Backends (#1695)
  • Fn Frontend (#1698)
  • Performance References for FN Backends (#1711)
  • Add fn::tuple_get and fn::make_tuple (#1713)
  • Allow setting CUDA stream (#1712)

Minor new features

  • int_vector library (#1672)
  • add conversion assign to hymap (#1702)

Minor improvements

  • Extensions to meta and hymap (#1663)
  • Soften sid value type requirements from std::trivially_copyable to std::trivially_copy_constructible (#1663)
  • is_tuple_like (#1676) and is_hymap (#1677)

Bug fixes

  • Propagate CXX_STANDARD to all tests (#1664)
  • Compilation fixes for nvcc 11.5 and clang 12 with std=c++20 (#1665)
  • Workaround for nvcc bug https://godbolt.org/z/orrev1xnM (#1681)
  • c_bindings example: fix typo and split cpu and gpu fortran sources (#1684)
  • Fix unused param warnings (#1706)
  • Fix Compilation with CUDA 11.6 (#1710)
  • Support for Clang 14 (#1707)

Testing

  • Add C++20 with Cray Clang on Piz Daint to Jenkins CI (#1675)
  • Perftest Updates (#1690)
  • CI dom: Downgrade to gcc 10.3 for CUDA toolkit support (#1699)

Contributions

This release contains contributions from @anstaf, @fthaler, @havogt.

- C++
Published by havogt almost 4 years ago

gridtools - GridTools version 2.1.0

New features

  • Dump backend: outputs a json representation of the stencil specification (#1456)
  • Reduction library with naive, CPU and GPU backends (#1590, #1594, #1619)
  • SID: Python cuda array interface support (#1596)

Extended features

  • Support for compile time length in data stores (#1545)
  • Several SID improvements (#1548)
  • Structured bindings support for gridtools tuple-like (#1556)
  • Improvements for Hugepage Allocation (#1562)
  • Add protection against misuse of device namespace (#1581)
  • fortran_array_view: allow to disable openacc (#1603)
  • Introduce sid::unknown_kind (#1605)

Non-functional changes

  • Hold the sids within sid::composite as tuple (#1564)
  • Various cleanups and c++17 related changes (#1579)
  • C++17 versions of meta::fold (#1549)
  • Sid as a proper C++20 concept (#1580, #1582)

Performance

  • More Inlining in cpu_kfirst Backend (#1634)
  • Support for Compile-Time Unit Stride Dimension for Python SID Adapter (#1635, #1651)

Bug fixes

  • K-cache fixes (#1530)
  • CMake: Fix storage_gpu for HIPCC-AMDGPU (#1540)
  • Remove a warning in hugepage_alloc which warns about a problem which only affects testing code (#1560)
  • Improve HIP + OpenMP Compilation (#1578)
  • Fix empty composite and add composite::make helper (#1583)
  • Fix as_const to work with any SID and be compatible with std::as_const (#1601, #1611)
  • SID composite: add static_assert against incorrect kinds (#1604)
  • Workaround a CUDA problem: tuple_util::concat remove constexpr var (#1606)
  • Improve Compliance with Parallel Model: Limit fusion of k-parallel execution with k-offsets (#1612)
  • GCC 9.x: Optimize multishift (#1630)
  • Python SID adapter: fix integer format check (#1632)
  • GCC 11.x: Compilation fixes (#1641, #1646)
  • Fixes for CUDA 11.4 (#1644)

Testing

  • Update to GTest v1.11 and minor changes to adapt for changed gtest interface (#1655)

Documentation

  • Clarifications to the execution model (#1541)

Contributions

This release contains contributions from @anstaf, @fthaler, @havogt, @lukasm91.

- C++
Published by havogt over 4 years ago

gridtools - GridTools version 1.1.4

Bug fixes

  • speedup compile time (#1608)
  • Support for GPU backend with custom block sizes in boundary conditions (#1438)
  • Fix sid shift origin (#1517)

Compatibility with new compilers

  • Added support for GCC 11.x (#1652, #1654)
  • Fix for CUDA 11 (#1520)

- C++
Published by havogt over 4 years ago

gridtools - GridTools version 2.0.0

GridTools v2.0.0

GridTools v2.0.0 comes with an improved API for stencil composition and storage construction. These changes and a few others (see below) are breaking changes.

Changes since v1.1.0

New API: Stencil Composition

The make_computation API for composing stencils is replaced by a new stencil specification API, e.g.

```cpp
auto horizontal_diffusion_spec = [](auto coeff, auto in, auto out) {
    GT_DECLARE_TMP(double, lap, flx, fly);
    return st::execute_parallel()
        .ij_cached(lap, flx, fly)
        .stage(lap_function(), lap, in)
        .stage(flx_function(), flx, in, lap)
        .stage(fly_function(), fly, in, lap)
        .stage(out_function(), out, in, flx, fly, coeff);
};

st::run(horizontal_diffusion_spec, stencil_backend_t(), grid, coeff, in, out);
```

instead of

```cpp
auto horizontal_diffusion = gt::make_computation(grid,
    p_coeff{} = coeff,
    gt::make_multistage(gt::enumtype::execute{},
        define_caches(gt::cache<gt::IJ, gt::cache_io_policy::local>(p_lap{}, p_flx{}, p_fly{})),
        gt::make_stage<lap_function>(p_lap{}, p_in{}),
        gt::make_independent(
            gt::make_stage(p_flx{}, p_in{}, p_lap{}),
            gt::make_stage(p_fly{}, p_in{}, p_lap{})),
        gt::make_stage(p_out{}, p_in{}, p_flx{}, p_fly{}, p_coeff{})));

horizontal_diffusion.run(p_in{} = in, p_out{} = out);
```

See the documentation and examples for details about the new API.

Related PRs: #1388

New API: Storage Builder

Datastores are now created using a builder API, e.g.

```cpp
auto storage_builder = gt::storage::builder<storage_traits_t>.dimensions(d1, d2, d3).halos(halo, halo, 0);

auto in = storage_builder.type().value(42).build();
auto coeff = storage_builder.type().value(42).build();
auto out = storage_builder.type().build();
```

The type returned by the builder is a shared_ptr of a data_store (previously the `shared_ptr` was inside the `data_store`).

Other storage related changes:

  • Memory alignment is applied in bytes (instead of in elements).
  • Host/device buffers are automatically synchronized on creation of views or on access of the underlying pointer (the sync method is removed).

See the documentation and examples for details about the new API.

Related PRs #1388, #1534

API break: New Backend names

Our backend names (cuda, mc, x86) were a source of confusion, as users had a certain (but wrong) idea of, e.g., when to use x86.

The new names are (#1490):

  • gpu instead of cuda, as the same backend works for HIP.
  • cpu_kfirst instead of x86: the innermost dimension is k, suitable for vertical stencils and architectures that emphasize caches over vector instructions.
  • cpu_ifirst instead of mc: the innermost dimension is i, suitable for modern CPUs where vector instructions are key for performance.
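
To make "innermost dimension" concrete, the following sketch computes linear indices for two simple row-major-style layouts, one with unit stride in k and one with unit stride in i. The struct `extents` and both functions are hypothetical. The actual GridTools backends may order the remaining dimensions differently and add padding or alignment.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sizes of a 3D field.
struct extents {
    std::size_t ni, nj, nk;
};

// k-first: neighboring k-levels are adjacent in memory (unit stride in k).
constexpr std::size_t index_kfirst(extents e, std::size_t i, std::size_t j, std::size_t k) {
    return k + e.nk * (j + e.nj * i);
}

// i-first: neighboring i-points are adjacent in memory (unit stride in i),
// which suits vectorization along i.
constexpr std::size_t index_ifirst(extents e, std::size_t i, std::size_t j, std::size_t k) {
    return i + e.ni * (j + e.nj * k);
}

static_assert(index_kfirst({4, 5, 6}, 0, 0, 1) - index_kfirst({4, 5, 6}, 0, 0, 0) == 1);
static_assert(index_ifirst({4, 5, 6}, 1, 0, 0) - index_ifirst({4, 5, 6}, 0, 0, 0) == 1);
```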

Additionally, we introduced a new backend gpu_horizontal (#1445), which works only for purely horizontal (parallel) stencils. Performance of gpu_horizontal is improved over gpu for most stencils; however, we recommend benchmarking both backends.

Other API breaking changes

  • Backend declarations (traits) are removed from common/defs.hpp and are now provided in component specific headers for stencil, timer, gcl and storage (#1388).
  • We improved the code structure by introducing finer-grained namespaces (#1388)
  • The storage repository was removed (#1456)

New functionality

  • New sid::rename_dimensions (#1533)
  • New regression test illustrating c-arrays as SIDs (#1525)
  • A Python SID adapter including regression test for calling computations from Python (#1523)
  • Introduced the threadpool concept (#1484, #1498, #1504) and added an HPX threadpool (#1437)
  • Added an example for calling CUDA GridTools computations from Fortran with OpenACC (#1454)

Improved functionality

  • GCL is now header-only (-> all GridTools is now header-only)
  • The CMake build scripts are rewritten, see the documentation and examples for how to use GridTools CMake targets (#1421, #1441, #1442, #1450, #1509)

Bug Fixes / Cleanup

  • Fixes to SID concept helpers (#1524, #1527, #1531)
  • Fixes for CUDA 11 (#1529), thanks @lukasm91
  • Fixes for HIP compilation (#1488)
  • Better error diagnostics at the frontend (#1495)
  • Performance tests are now included in a single binary (#1453)
  • Layout transformations are refactored (#1388)
  • and many other small fixes

Infrastructure/Development

  • Environments are renamed to describe more precisely what they are (#1507)
  • Added testing on the new MeteoSwiss machine Tsa to Jenkins (#1452)
  • Moved tests from Travis to GitHub actions (#1446), added tests for different CMake setups (#1443).
  • Added a Gitpod configuration (#1423)
  • Added testing with Clang-based Cray compiler on Daint (#1382)

Contributions

This release contains contributions from @anstaf, @fthaler, @havogt, @jdahm, @lukasm91, @mbianco, @tehrengruber, @wdeconinck.

- C++
Published by havogt almost 6 years ago

gridtools - GridTools version 2.0.0rc2

see final release

- C++
Published by havogt almost 6 years ago

gridtools - GridTools version 2.0.0rc1

see final release

- C++
Published by havogt almost 6 years ago

gridtools - GridTools version 1.1.3

Performance fixes

  • Revert a #pragma unroll to be optimal for the COSMO dycore on V100 (#1400)

Other

  • CMake: Add a missing policy workaround_mpi.cmake (#1398)

- C++
Published by havogt over 6 years ago

gridtools - GridTools version 1.1.2

Support for new targets

  • Support for clang-CUDA and HIP (#1361)

Fixes

  • Support custom block size in storage traits (#1392)
  • Add GT_FUNCTION to storage_info
  • CMake: export compilation type (#1387)

Infrastructure

  • Update testing environment after Piz Daint upgrade (squash of #1369, #1371, #1373, #1382)

- C++
Published by havogt over 6 years ago

gridtools - GridTools version 1.0.4

Fixes

  • CMake: support for superbuilds (nesting gridtools with add_subdirectory/FetchContent) #1383

- C++
Published by havogt over 6 years ago

gridtools - GridTools version 1.1.1

Fixes

  • Make computation API thread compatible by making the allocator thread_local (#1380).
  • CMake: fix to make GridTools work as nested project in a "superbuild" setup.

- C++
Published by havogt over 6 years ago

gridtools - GridTools version 1.0.3

Fixes

  • Fix a module in communication #1356
  • CMake: fix storage module #1353

- C++
Published by havogt over 6 years ago

gridtools - GridTools version 1.1.0

GridTools

In GridTools v1.1.0 we set the default C++ standard to C++14 and drop compatibility for C++11. This requires at least CUDA 9.0.

Changes since v1.0.0

Full introduction of the SID concept

The backend is completely restructured based on the SID (stencil iterable data) concept. There should be no user-facing changes as long as user code was only using documented public API (*). The changes separate backend implementation from the core library to allow non-intrusive extension of the library with new backends. Additionally, maintainability of the gridtools infrastructure is significantly improved. Performance should be improved in general, but might be worse for specific computations. A common pattern for performance improvement/degradation is not observed.

(*) There is one change which might trigger different behavior (though the old behavior was not documented): temporary fields are now implicitly 3 dimensional. Prior to this version the user could have abused a 2D temporary field for accumulating values between k-levels.

New

  • New example illustrating the type-erasure pattern for computations. #1318

Deprecation (support will be removed in GridTools v2.0.0)

  • Using the gridtools::c_bindings is deprecated. Switch to the standalone https://github.com/GridTools/cpp_bindgen.
  • global_accessor is deprecated, use in_accessor (without extents) instead.
  • make_global_parameter with backend as template parameter is deprecated. The backend is not needed anymore.

Fixes / Cleanup

  • Fix performance for CUDA 9.2 / 10.0 #1281 #1327 #1339
  • Use c++14 features. #1307
  • Use multiple threads in storage Initialization. #1300
  • Remove dependency on boost::mpl and boost::fusion
  • Fixes required to compile gridtools with HIP-Clang. Full support for AMD GPUs via HIP-Clang will come in a next release. #1363
  • Fix a bug in communication #1355.
  • The global_parameter doesn't require pre-allocated storage (as it is now passed via constant memory in case of CUDA), therefore global_parameter is a lightweight wrapper around the value type, which can be created without overhead, e.g. when passing it to computation.run().

Infrastructure/Development

  • The bash build script is replaced by a python driven build process, see wiki for how to get the environment. #1273 #1298 #1341
  • Improved Jenkins performance plots. #1301 #1338
  • Googletest is now pulled in with CMake's FetchContent instead of having it as part of the repository. #1310

- C++
Published by havogt over 6 years ago

gridtools - GridTools version 1.0.2

Fixes

  • The workaround implemented in v1.0.1 did not fully recover CUDA 8.0 performance for CUDA >= 9.2. A further workaround now recovers performance. See #1326.
  • Make GT_DEFAULT_VERTICAL_BLOCK_SIZE macro modifiable for the user. See #1350.

- C++
Published by havogt almost 7 years ago

gridtools - GridTools version 1.0.1

Fixes

  • Workaround for performance regression in CUDA 9.2 and newer, see #1223.

- C++
Published by havogt almost 7 years ago

gridtools - GridTools version 1.0.0

GridTools

An introduction to GridTools can be found in the documentation. Functionality as described in the documentation is considered public API, other functionality is considered internal and might change without notice.

Upgrading from pre-release versions

In the process of finalizing GridTools v1.0, the API was changed in many places across the pre-release versions. See the descriptions of those releases for information on how to update to the latest API.

Changes since v0.21.0

API breaking changes

The backend strategy (naive/block) was removed and replaced by a separate naive backend. (#1238, #1240, #1244)

In the process, the target tags became obsolete as they were just referring to a backend. Therefore target was renamed to backend. To update, apply the following changes:

  • backend_t::make_global_parameter(…) is now make_global_parameter<backend_t>(). Same for update_global_parameter.
  • backend_t::storage_traits_t was removed, use storage_traits<backend_t> instead.
  • target::X is now called backend::X.
  • CMake variables GT_ENABLE_TARGET_X are renamed to GT_ENABLE_BACKEND_X.

Other

  • Already deprecated functions were removed (#1232)
  • Removed 2D and packing version from gcl (#1233)
  • The last 2 parameters of axis are encapsulated in types, and the order of these parameters is reversed, e.g. use axis<2, axis_config::offset_limit<4>> instead of axis<2,0,4> (#1257)
  • The call operator is removed from the global parameter (#1256)

- C++
Published by havogt about 7 years ago

gridtools - Preparation for public release

Changes since 0.20.0

API breaking changes

  • Conditionals (if_, switch_) are removed.
  • Rename all files and folders with - (dash) to _ (underscore).
  • Rename reactivate_device_write_views() to reactivate_target_write_views()
  • Removed the multiple-kernel implementation for boundary conditions.

Examples

  • Examples are now provided with standalone CMakeLists.txt. The examples are used as a test for the GridTools CMake installation in our regression tests.
  • C-bindings example was added.

Performance improvements

  • mc: changed loop order and added omp statement for boundary conditions

Bug fixes

  • Restores x86 performance, which was broken in 0.20.0.
  • Restores cuda performance for layout transformations, which was broken in 0.20.0.
  • Enable a workaround for CUDA 10.1 which already existed for CUDA < 10.1.
  • CMake: export the mpi workaround
  • CMake: fix a path for gt_bindings.cmake

- C++
Published by havogt about 7 years ago

gridtools - API changes in preparation for the public release

Changes since 0.19.0

API breaking changes

Naming changes

A lot of public GridTools functions, types and macros were renamed to consistently use lower-case

  • arg_list -> param_list as the elements are the parameters of the stencil operator (not the arguments).
  • Do-method -> apply-method
  • enumtype::in and enumtype::inout -> intent::in, intent::inout
  • execute<enumtype::forward> etc. -> execute::forward
  • access_mode::ReadOnly, ReadWrite -> access_mode::read_only, read_write
  • cache_type::IJ, K -> cache_type::ij, k
  • direction::I, J, K -> direction::i, j, k
  • ownership::ExternalCPU, ExternalGPU -> ownership::external_cpu, external_gpu
  • STRUCTURED_GRIDS -> GT_STRUCTURED_GRIDS
  • FLOAT_PRECISION -> GT_FLOAT_PRECISION
  • BACKEND_* -> GT_BACKEND_*
  • ENABLE_METERS -> GT_ENABLE_METERS
  • storage_info_interface -> storage_info

Removed

  • Removed axis<...>::with_offset_limit, axis<...>::with_extra_offsets as they were confusing. These options have to be set directly as template arguments to the axis.

Internal API changes

  • GRIDTOOLS_STATIC_ASSERT -> GT_STATIC_ASSERT
  • ASSERT_OR_THROW -> GT_ASSERT_OR_THROW
  • DISALLOW_COPY_AND_ASSIGN -> GT_DISALLOW_COPY_AND_ASSIGN
  • _USE_GPU_ -> GT_USE_GPU
  • GTREPO_* -> GT_REPO_*
  • GRIDTOOLS_PP_* -> GT_PP_*
  • PEDANTIC -> GT_PEDANTIC
  • VERBOSE -> GT_VERBOSE
  • RESTRICT -> GT_RESTRICT
  • __DISABLE_CACHING__ -> GT_DISABLE_CACHING
  • META_STORAGE_INDEX_LIMIT -> GT_META_STORAGE_INDEX_LIMIT
  • Removed ALLOW_EMPTY_EXTENTS, _USE_DATATYPES_
  • Added GT_-prefix to some file-local macros to minimize conflict probability.
  • _GCL_GPU_ -> GCL_GPU
  • _GCL_MPI_ -> GCL_MPI
  • CUDAMSG -> GCL_CUDAMSG
  • _GCL_CHECK_DESTRUCTOR -> GCL_CHECK_DESTRUCTOR
  • HOSTWORKAROUND -> GCL_HOSTWORKAROUND
  • NULL -> nullptr
  • Added GCL_-prefix to GCL macros.
  • Replaced GCL header guards by #pragma once

Other API changes

  • Structured grids is now the default
  • Users should use make_param_list to create the param_list instead of explicitly using boost::mpl::vector. In the future using boost::mpl::vector might not work anymore, the underlying type is implementation detail, not public API
  • cache_type is now an enum class. Update code by prefixing all ij and k with cache_type::
  • Introduces make_expandable_computation(expand_factor<N>, ...) and removes the respective overload of make_computation; and make_expandable_positional_computation(expand_factor<N>, ...) and removes the respective overload of make_positional_computation

New functionality

  • Distributed boundaries: timers for pack/unpack, exchange, and boundary condition.

New example

  • Tridiagonal solver

Bug fixes

  • Fix CUDA type unsigned long long char, which was a copy and paste bug from the CUDA programming guide where they are missing a comma.
  • Add != to halo_descriptors (== already existed).
  • fortran_array_adapter: Throw if data_store was not allocated.
  • c_bindings: wrap line for procedures.
  • repository: bindings support to add a prefix.
  • In CUDA temporaries are only allocated if they are not cached.
  • User-friendly error on missing backend in make_computation.
  • User-friendly error argument type check of make_multistage.
  • Added Back check_grid_against_extents
  • communication: only exchange the part of the buffer which is actually used by the exchange (not the full allocated buffer)
  • Workaround nvcc which has problems in unrolling a loop in hypercube_iterator.
  • Fix to the pointer sharing constructor of storage_info.

Other changes

  • Documentation was updated

Internal changes

  • Added hymap which is a boost::fusion-like map.
  • Updates to sid

- C++
Published by havogt about 7 years ago

gridtools - New versioning scheme

Starting with this release we introduce a new versioning scheme.

Changes since 1.08.02 (which would have been 0.18.2 in the new versioning scheme).

New versioning scheme

Version number: X.Y.Z

  • X: Major version will be 0 until the public release, then it will be 1, probably until a new major feature, e.g. complete icgrid.
  • Y: Minor version will be increased after every API change and new smaller features, probably very often.
  • Z: Patch version will be increased for bug fixes.

The CMake version matching is changed in this release to COMPATIBILITY SameMinorVersion, which means the following: Let's say the user requires find_package(GridTools 0.18.2). Then 0.18.3 (a newer patch release) will be compatible; 0.18.1 (an older than requested release) and 0.19.0 (a newer minor release) will be rejected.
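
The matching rule described above can be written as a small predicate. This is only a sketch of the rule as stated in the text, not CMake's implementation. The names `version` and `same_minor_version_compatible` are hypothetical, and the fields follow the X.Y.Z naming used here.

```cpp
#include <cassert>

// Hypothetical version triple: x = major, y = minor, z = patch.
struct version {
    int x, y, z;
};

// A found version is accepted if major and minor match the request exactly
// and the patch level is at least the requested one.
constexpr bool same_minor_version_compatible(version found, version requested) {
    return found.x == requested.x && found.y == requested.y && found.z >= requested.z;
}

// The three cases from the text, for a request of 0.18.2:
static_assert(same_minor_version_compatible({0, 18, 3}, {0, 18, 2}));   // newer patch: accepted
static_assert(!same_minor_version_compatible({0, 18, 1}, {0, 18, 2}));  // older patch: rejected
static_assert(!same_minor_version_compatible({0, 19, 0}, {0, 18, 2}));  // newer minor: rejected
```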

API breaking changes

Removes reduction support from the stencil-composition API:

  • make_reduction is removed
  • computation type erasure doesn't have ReturnType as a first template argument, i.e. computation<void, args...> needs to be replaced by computation<args...>
  • the run method of computation now returns void

New functionality

Possibility to query intent and extent for placeholders from computation:

  • computation.get_arg_intent(my_arg()) returns enumtype::intent
  • computation.get_arg_extent(my_arg()) returns rt_extent, which contains extents in i, j, k directions

Performance improvements

  • several unneeded cudaDeviceSynchronize() in boundary_conditions are removed

Bug fixes

  • c_bindings: support for multiple template arguments in generic bindings macro

Internal changes

  • added convenience library for integral constants with __host__ __device__ conversion and construction with custom literal _c
  • SID utilities

- C++
Published by havogt over 7 years ago