Recent Releases of dace

dace - v1.0.2

This release contains backports of further minor fixes.

What's Changed

  • Fix typos (backport) by @romanc in https://github.com/spcl/dace/pull/1918
  • Fix typo by @romanc in https://github.com/spcl/dace/pull/1945
  • Fix: DDE removing read from access_set in read/write nodes by @romanc in https://github.com/spcl/dace/pull/1955
  • StateFusion misses read-write conflict due to early return by @FlorianDeconinck in https://github.com/spcl/dace/pull/1954

Full Changelog: https://github.com/spcl/dace/compare/v1.0.1...v1.0.2

- Python
Published by tbennun 10 months ago

dace - v1.0.1

This release contains backports of minor fixes following the release of v1.0.0.

Full Changelog: https://github.com/spcl/dace/compare/v1.0.0...v1.0.1

- Python
Published by tbennun 12 months ago

dace - v1.0.0

We are happy to announce DaCe version 1.0!

It is a major release milestone, and we went over many of the known issues over the years to ensure that this is the most stable version we can release without making fundamental changes to the framework. The Stateful DataFlow multiGraph (SDFG) intermediate representation used in this version is faithful to the original paper, which was published in 2019.

On a fundamental level, this release is no different from a minor version release (this version could have been DaCe 0.17), so there are no breaking changes from v0.x.

We would like to thank everyone who contributed to DaCe over the years and helped reach this milestone! It would not have been possible without you.

Release Notes

In addition to many issues and bugfixes courtesy of @acalotoiu, @tim0s, @htorst, @tbennun, @phschaad, @BenWeber42, @philip-paul-mueller, @luigifusco, @ThrudPrimrose, @FlorianDeconinck, @pratyai, @edopao, @kotsaloscv, and @iBug, several new features for quality of life and future development were added.

New features introduced into the SDFG IR and builder API:

  • Add GUIDs to SDFG elements and SDFG diff support (by @phschaad)
  • Added can_be_applied_to() to Transformation API (by @philip-paul-mueller)
  • SDFG.auto_optimize, SDFG.regenerate_code, and SDFG.as_schedule_tree are now easily accessible as API methods and fields

New Python frontend features

  • You can now specify the storage location of expressions inline using the @ operator or type hints. Examples:
    • a = np.ones(M) @ dace.StorageType.CPU_ThreadLocal
    • b: dace.float64[M, N] @ dace.StorageType.GPU_Global = np.zeros(...)

New transformations

  • WCRToAugAssign transformation (by @alexnick83)

New code generation features

  • clang-format can now be configured to be called on generated code (by @ThrudPrimrose)

Experimental features

  • Control flow (loop, conditional, named) regions (by @phschaad and @luca-patrignani). Stay tuned for more updates in the next development releases!

Other changes and bugfix highlights

  • Support for SymPy 1.13 (by @BenWeber42)
  • Rename misleading topological_sort to bfs_nodes by @BenWeber42 in https://github.com/spcl/dace/pull/1590
  • Add multidimensional maps to GPU docs by @tbennun in https://github.com/spcl/dace/pull/1608
  • Improve SDFG work-depth analysis and add SDFG simulated operational intensity analysis by @phschaad in https://github.com/spcl/dace/pull/1607
  • Scalar return values are now disallowed by @philip-paul-mueller in https://github.com/spcl/dace/pull/1609
  • Fixed RedundantArray's handling of "reshaping" Memlets by @philip-paul-mueller in https://github.com/spcl/dace/pull/1603
  • Loop Region Code Generation by @phschaad in https://github.com/spcl/dace/pull/1597
  • Bump certifi from 2023.7.22 to 2024.7.4 by @dependabot in https://github.com/spcl/dace/pull/1614
  • Fix incorrect input/output of nested dace programs by @phschaad in https://github.com/spcl/dace/pull/1615
  • Return correct state in nest_sdfg_subgraph by @tbennun in https://github.com/spcl/dace/pull/1627
  • Made TransientReuse Less Verbose by @philip-paul-mueller in https://github.com/spcl/dace/pull/1622
  • Improving the Usage of #pragma unroll by @philip-paul-mueller in https://github.com/spcl/dace/pull/1621
  • Added PatternNode to dace.transformation imports. by @philip-paul-mueller in https://github.com/spcl/dace/pull/1618
  • Implement user regions and function call regions by @luca-patrignani in https://github.com/spcl/dace/pull/1623
  • Add UUIDs to SDFG elements by @phschaad in https://github.com/spcl/dace/pull/1631
  • framecode: Fix missing BasicCFBlock argument by @iBug in https://github.com/spcl/dace/pull/1630
  • Specified behaviour of Subset.covers() for different dimensionality by @philip-paul-mueller in https://github.com/spcl/dace/pull/1637
  • More robust loop detection by @tbennun in https://github.com/spcl/dace/pull/1646
  • Fix missed exploration of edges in constant propagation by @luigifusco in https://github.com/spcl/dace/pull/1635
  • Fix infinite loop with control flow blocks by @tbennun in https://github.com/spcl/dace/pull/1634
  • Print out exception on parsing fail early by @FlorianDeconinck in https://github.com/spcl/dace/pull/1651
  • Reworked Optional Serializing by @philip-paul-mueller in https://github.com/spcl/dace/pull/1647
  • Modified SetProperty by @philip-paul-mueller in https://github.com/spcl/dace/pull/1653
  • Made CompiledSDFG in the main namespace available. by @philip-paul-mueller in https://github.com/spcl/dace/pull/1567
  • SDFG Diff Tool by @phschaad in https://github.com/spcl/dace/pull/1632
  • Made the SDFGState.add_mapped_tasklet() more convenient by @philip-paul-mueller in https://github.com/spcl/dace/pull/1655
  • Maps With Zero Parameters by @philip-paul-mueller in https://github.com/spcl/dace/pull/1649
  • Bug in constant propagation with multiple constants by @tbennun in https://github.com/spcl/dace/pull/1658
  • Fixed PruneConnectors by @philip-paul-mueller in https://github.com/spcl/dace/pull/1660
  • Fix array indirection to memlet subset promotion by @BenWeber42 in https://github.com/spcl/dace/pull/1406
  • Renamed graph.bfs_edges to edge_bfs by @BenWeber42 in https://github.com/spcl/dace/pull/1604
  • Inter-state edge assignment race test by @tbennun in https://github.com/spcl/dace/pull/1672
  • Fix race conditions in Constant Propagation and Reference-To-View by @tbennun in https://github.com/spcl/dace/pull/1679
  • Improve memlet label and string initialization by @tbennun in https://github.com/spcl/dace/pull/1680
  • Control Flow Raising by @phschaad in https://github.com/spcl/dace/pull/1657
  • Updated InlineMultistateSDFG by @philip-paul-mueller in https://github.com/spcl/dace/pull/1689
  • Extend TrivialTaskletElimination for map scope by @edopao in https://github.com/spcl/dace/pull/1650
  • Fix to Read and Write Sets by @philip-paul-mueller in https://github.com/spcl/dace/pull/1678
  • Make is_empty() and propagate_subset() not unnecessarily rely on the src and dst by @pratyai in https://github.com/spcl/dace/pull/1699
  • fix(codegen/prettycode): Use base_indentation as intended by @iBug in https://github.com/spcl/dace/pull/1697
  • Warn on potential data races by @phschaad in https://github.com/spcl/dace/pull/1712
  • Python frontend stability and inline storage specification by @tbennun in https://github.com/spcl/dace/pull/1711
  • infer_symbols_from_datadescriptor: modification to infer offset by @kotsaloscv in https://github.com/spcl/dace/pull/1525
  • Add CFG to generate_scope in tutorials by @ThrudPrimrose in https://github.com/spcl/dace/pull/1706
  • Better CopyToMap by @philip-paul-mueller in https://github.com/spcl/dace/pull/1675
  • More NumPy operation implementations by @tbennun in https://github.com/spcl/dace/pull/1498
  • Fix jupyter's version of SDFV by @phschaad in https://github.com/spcl/dace/pull/1714
  • Fix broken codegen tutorial by @romanc in https://github.com/spcl/dace/pull/1720
  • CI: Update checkout and setup-python actions by @romanc in https://github.com/spcl/dace/pull/1718
  • Bump version and update dependencies by @tbennun in https://github.com/spcl/dace/pull/1722
  • Various Cutout Fixes by @phschaad in https://github.com/spcl/dace/pull/1662
  • Various stability improvements and convenience APIs by @tbennun in https://github.com/spcl/dace/pull/1724
  • Rename FORTRAN frontend tests by @pratyai in https://github.com/spcl/dace/pull/1729
  • Add back clang-format support by @ThrudPrimrose in https://github.com/spcl/dace/pull/1732
  • Fix problem with struct reads on interstate edges by @phschaad in https://github.com/spcl/dace/pull/1512
  • Quality of life: Improved error messages by @romanc in https://github.com/spcl/dace/pull/1731
  • Cherry-picked a handful of intrinsic related commits out of multi_sdfg branch. by @pratyai in https://github.com/spcl/dace/pull/1728
  • Used valid FORTRAN test program for a couple frontend tests + Made floatlit2string() convert the FORTRAN real literal strings into python floats. by @pratyai in https://github.com/spcl/dace/pull/1733
  • Fix pure reduce expansion for squeezed output memlets. by @pratyai in https://github.com/spcl/dace/pull/1709
  • Make the import of typing.Literal portable between python versions 3.7 and 3.12 by @pratyai in https://github.com/spcl/dace/pull/1700
  • Fix type inference and code generation for typeclasses and numpy types by @tbennun in https://github.com/spcl/dace/pull/1725
  • SDFG API additions for version 1.0 by @tbennun in https://github.com/spcl/dace/pull/1740
  • Replace another FORTRAN test program with gfortran -Wall certified test program. by @pratyai in https://github.com/spcl/dace/pull/1736
  • Unskip unit tests and provide reasons for skipped tests by @tbennun in https://github.com/spcl/dace/pull/1742
  • Fix OpenMP dynamic loop bounds that use persistent memory by @tbennun in https://github.com/spcl/dace/pull/1746
  • Fixes for SDFGState._read_and_write_sets() by @philip-paul-mueller in https://github.com/spcl/dace/pull/1747
  • Fix temporary transient counter during Python parsing of nested calls by @tbennun in https://github.com/spcl/dace/pull/1745
  • Fix pystr_to_symbolic not correctly interpreting constants as boolean values in boolean comparisons by @phschaad in https://github.com/spcl/dace/pull/1756
  • Fixed dace::math::pi and dace::math::nan on GPU by @philip-paul-mueller in https://github.com/spcl/dace/pull/1759
  • Make scalar to symbol promotion robust to node order in state by @tbennun in https://github.com/spcl/dace/pull/1766

Full Changelog: https://github.com/spcl/dace/compare/v0.16.1...v1.0.0

- Python
Published by tbennun about 1 year ago

dace - v1.0.0rc1

We are happy to announce the first release candidate of DaCe version 1.0!

This version uses the SDFG intermediate representation as published in the original Stateful Dataflow Multigraphs paper, which has been stable for quite some time.

On a fundamental level, this release is no different from a minor version release (this version could have been DaCe 0.17). However, with this release we would like to emphasize stability rather than new features.

If you are using DaCe and have a critical or blocking issue that makes it unstable, please create an issue and refer to it in the release discussion, so that we can add it to our release plan. Thank you for using DaCe!

Release Notes

New features:

  • Add GUIDs to SDFG elements and SDFG diff support (by @phschaad)
  • Added can_be_applied_to() to Transformation API (by @philip-paul-mueller)
  • Support SymPy 1.13 (by @BenWeber42)
  • New WCRToAugAssign transformation (by @alexnick83)
  • (Experimental) Control flow (loop, conditional, named) regions (by @phschaad and @luca-patrignani). Stay tuned for more updates in the next development releases!

Bugfixes:

  • Inter-state edge assignment race condition test in validation (by @tbennun)
  • Improve memlet label and string initialization (by @tbennun, @philip-paul-mueller)
  • Minor updates to documentation and internal APIs (by @tbennun, @phschaad, @philip-paul-mueller, @BenWeber42)
  • Minor fixes to the following transformations and passes: RedundantArray, TransientReuse, DetectLoop, ConstantPropagation, PruneConnectors (by @philip-paul-mueller, @tbennun, @luigifusco)
  • Minor frontend improvements (by @FlorianDeconinck, @BenWeber42)
  • Minor improvements to the code generator (by @iBug, @philip-paul-mueller)

See Full Changelog: https://github.com/spcl/dace/compare/v0.16.1...v1.0.0rc1

New Contributors

  • @iBug made their first contribution in https://github.com/spcl/dace/pull/1630
  • @luigifusco made their first contribution in https://github.com/spcl/dace/pull/1635

- Python
Published by tbennun about 1 year ago

dace - v0.16.1

What's Changed

The main purpose of this release is to require NumPy < 2 for DaCe, since NumPy 2.0.0 contains breaking changes which aren't compatible with DaCe currently.

Recently, NumPy 2.0.0 has been released: https://numpy.org/news/#numpy-200-released

The release comes with documented breaking changes. Unfortunately, DaCe is currently not compatible with these changes. This also affects the recent 0.16 release of DaCe. Hence, we adjust our dependency requirements to use NumPy < 2 as a temporary work-around in this PR:

Fix numpy version to < 2.0 by @phschaad in https://github.com/spcl/dace/pull/1601

Long term, we are tracking adding support for NumPy 2 in DaCe in this issue: https://github.com/spcl/dace/issues/1602
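To check whether a given NumPy version falls inside this pin, a minimal sketch follows. The helper name `satisfies_numpy_pin` is hypothetical and not part of DaCe; it only compares the major version component against the `< 2` bound described above.

```python
def satisfies_numpy_pin(version: str) -> bool:
    """Return True if `version` satisfies the `numpy < 2` requirement.

    Hypothetical helper for illustration; only the major component matters.
    """
    major = int(version.split(".")[0])
    return major < 2


print(satisfies_numpy_pin("1.26.4"))  # True
print(satisfies_numpy_pin("2.0.0"))   # False
```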

Fix constant propagation failing due to invalid topological sort by @phschaad in https://github.com/spcl/dace/pull/1589

This changeset landed earlier in DaCe's development branch. It fixes an issue where the ConstantPropagation pass can fail for certain graph structures.

Full Changelog: https://github.com/spcl/dace/compare/v0.16...v0.16.1

- Python
Published by BenWeber42 over 1 year ago

dace - v0.16

What's Changed

CI/CD pipeline for NOAA & NASA weather and climate model by @FlorianDeconinck & @BenWeber42 in https://github.com/spcl/dace/pull/1460, https://github.com/spcl/dace/pull/1478 & https://github.com/spcl/dace/pull/1575

Our collaborators NOAA & NASA have successfully used DaCe as an optimization framework and back-end for some of the components of their climate and weather model. Particularly, the FV3 dycore and GFS physics parametrization have been ported to a combination of GT4Py Python DSL and DaCe. DaCe is used within their stack as a stencil backend and as a full-program optimizer integrating stencils and glue-code together.

With this CI/CD pipeline, we run various checks for those components on every change to DaCe. This is an important step for DaCe to ensure stability for real-world applications that utilize DaCe. We are very grateful for this contribution and the collaboration with NOAA & NASA.

Changed default of serialize_all_fields to False by @BenWeber42 in https://github.com/spcl/dace/pull/1564

This feature was already implemented in the previous 0.15.1 release in https://github.com/spcl/dace/pull/1452, but not enabled by default. In this release, we are changing the default so that only fields with non-default values are serialized. This generally leads to a reduction in file size for SDFGs.

Since each DaCe version stores the default values of each field, it is still possible to recover these missing values. Default values should rarely change across different DaCe versions. Nevertheless, we want to caution users & developers when using SDFG files with different DaCe versions.
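The idea can be sketched in plain Python. This is not DaCe's serializer: the `DEFAULTS` table, the field names, and both functions are hypothetical stand-ins that only illustrate dropping default-valued fields on save and restoring them on load.

```python
import json

# Hypothetical defaults table; in DaCe the defaults come from the property
# definitions of the installed version.
DEFAULTS = {"schedule": "Default", "unroll": False, "debuginfo": None}


def serialize(fields: dict, all_fields: bool = False) -> str:
    """Serialize `fields`, omitting default-valued entries unless requested."""
    if not all_fields:
        fields = {k: v for k, v in fields.items() if DEFAULTS.get(k, object()) != v}
    return json.dumps(fields, sort_keys=True)


def deserialize(text: str) -> dict:
    """Restore omitted fields from the defaults known to this version."""
    return {**DEFAULTS, **json.loads(text)}


node = {"schedule": "Default", "unroll": True, "debuginfo": None}
compact = serialize(node)
print(compact)                       # {"unroll": true}
print(deserialize(compact) == node)  # True
```

In this sketch, loading the compact form with a different `DEFAULTS` table would silently restore different values, which is the cross-version caveat mentioned above.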

Analysis passes for access range analysis by @tbennun in https://github.com/spcl/dace/pull/1484

Adds two analysis passes to help with analyzing data access sets: access ranges and Reference sources. To enable constructing sets of memlets, this PR also reintroduces data descriptor names to memlet hashes.

Reference-to-View pass and comprehensive reference test suite by @tbennun in https://github.com/spcl/dace/pull/1485

Implements a reference-to-view pass (converting references to views if they are only set to one particular subset). Also improves the simplify pipeline in the presence of Reference data descriptors and adds multiple tests that use references.

Ndarray strides by @alexnick83 in https://github.com/spcl/dace/pull/1506

The PR adds support for custom strides to dace.ndarray. Furthermore, the stride unit is number of elements, in contrast to NumPy/CuPy, where it is number of bytes. Custom strides are not supported for numpy.ndarray and cupy.ndarray.
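The difference in stride units can be illustrated with a small, library-free sketch. Both helper functions are hypothetical and not part of DaCe or NumPy; they only show how element-based strides relate to byte-based strides for a given item size.

```python
def element_strides_to_bytes(strides, itemsize):
    """Convert element-based strides to NumPy-style byte strides."""
    return tuple(s * itemsize for s in strides)


def row_major_element_strides(shape):
    """Default C-order strides, counted in elements."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


shape = (3, 4)
elem = row_major_element_strides(shape)
print(elem)                               # (4, 1)
print(element_strides_to_bytes(elem, 8))  # (32, 8) for 8-byte float64
```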

Structure Support to NestedSDFGs and Python Frontend by @alexnick83 in https://github.com/spcl/dace/pull/1366

Adds basic support for nested data (Structures) to the Python frontend. It also resolves issues with the use of Structures in nested SDFG scopes (mostly code generation).

Generalize StructArrays to ContainerArrays and refactor View class structure by @tbennun in https://github.com/spcl/dace/pull/1504

This PR enables the use of an array data descriptor that contains a nested data descriptor (e.g., ContainerArray of Arrays). Its contents can then be viewed normally with View or StructureView. With this, concepts such as jagged arrays are natively supported in DaCe (see test for example). Also adds support for using ctypes pointers and arrays as arguments to SDFGs.

This PR also refactors the notion of views to a View interface, and provides views to arrays, structures, and container arrays. It also adds a syntactic-sugar/helper API to define a view of an existing data descriptor.

Add support for distributed compilation in DaceProgram by @kotsaloscv in https://github.com/spcl/dace/pull/1551 & https://github.com/spcl/dace/pull/1555

Adds configurable support for distributed compilation (MPI) to the Python front-end (via mpi4py). Distributed compilation can be enabled with the distributed_compilation parameter in the dace.program decorator.

Fixes and other improvements:

  • Remove unused deps by @jack-mcivor in https://github.com/spcl/dace/pull/1459
  • Small fix for debuginfo that can be None by @kotsaloscv in https://github.com/spcl/dace/pull/1469
  • Make dynamic map range docs more explicit by @tbennun in https://github.com/spcl/dace/pull/1474
  • Added nan to the DaCe math namespace by @philip-paul-mueller in https://github.com/spcl/dace/pull/1437
  • Fix for floordiv on GPU target by @edopao in https://github.com/spcl/dace/pull/1471
  • Add merge_group to CI for merge queues by @tbennun in https://github.com/spcl/dace/pull/1482
  • Fix SymPy dependency (again) by @tbennun in https://github.com/spcl/dace/pull/1483
  • Fix for CUDA codegen by @edopao in https://github.com/spcl/dace/pull/1442
  • Complete coverage for reference-to-view pass by @tbennun in https://github.com/spcl/dace/pull/1488
  • CMakeLists.txt Improvements for CUDA by @kylosus in https://github.com/spcl/dace/pull/1337
  • Faster Call for CompiledSDFG by @philip-paul-mueller in https://github.com/spcl/dace/pull/1467
  • Evaluate dtype_to_typeclass at use time by @tbennun in https://github.com/spcl/dace/pull/1494
  • Fix redefinition of interstate edge type in code generator by @tbennun in https://github.com/spcl/dace/pull/1490
  • CuPy fixes and special cases for HIP by @tbennun in https://github.com/spcl/dace/pull/1492
  • CI Update by @tim0s in https://github.com/spcl/dace/pull/1502
  • FPGA CI Update by @tim0s in https://github.com/spcl/dace/pull/1508
  • Bump jinja2 from 3.1.2 to 3.1.3 by @dependabot in https://github.com/spcl/dace/pull/1503
  • Jupyter fix by @phschaad in https://github.com/spcl/dace/pull/1489
  • Modernize HIP CMake commands, fix corner cases by @tbennun in https://github.com/spcl/dace/pull/1518
  • Remove the long-deprecated symbol.get/set methods by @tbennun in https://github.com/spcl/dace/pull/1523
  • Support output indirection in numpy frontend by @tbennun in https://github.com/spcl/dace/pull/1509
  • Fix for const references by @alexnick83 in https://github.com/spcl/dace/pull/1522
  • DeadDataFlowElimination will add type hint when removing a connector by @luca-patrignani in https://github.com/spcl/dace/pull/1499
  • Fixed an issue in the Memlet duplication verification. by @philip-paul-mueller in https://github.com/spcl/dace/pull/1526
  • Refactor SDFG List to CFG List by @phschaad in https://github.com/spcl/dace/pull/1511
  • Dependency Edge Hotfix by @Berke-Ates in https://github.com/spcl/dace/pull/1513
  • Remove Property.fromstring and Property.tostring by @luca-patrignani in https://github.com/spcl/dace/pull/1529
  • Fixed the {in,out}_edges() function of the DiGraph class. by @philip-paul-mueller in https://github.com/spcl/dace/pull/1527
  • Fixes for structures nested in (nested) struct-arrays by @alexnick83 in https://github.com/spcl/dace/pull/1534
  • Updated and fixed the MapExpansion transformation. by @philip-paul-mueller in https://github.com/spcl/dace/pull/1532
  • Updated and fixed the MapDimShuffle transformation. by @philip-paul-mueller in https://github.com/spcl/dace/pull/1531
  • Use State Fissioning to Generalize Transformations by @lukastruemper in https://github.com/spcl/dace/pull/1462
  • Fixed edge consolidation by @philip-paul-mueller in https://github.com/spcl/dace/pull/1546
  • Fix Profiler + Minor improvements by @JanKleine in https://github.com/spcl/dace/pull/1548
  • Add dtype for numpy.uintp which is compatible with C uintptr_t by @kotsaloscv in https://github.com/spcl/dace/pull/1544
  • Fix bug in map_fusion transformation by @edopao in https://github.com/spcl/dace/pull/1553
  • Updated the add_state_{after, before}() function. by @philip-paul-mueller in https://github.com/spcl/dace/pull/1556
  • Bump idna from 3.4 to 3.7 by @dependabot in https://github.com/spcl/dace/pull/1557
  • Fix infinite loops in memlet path when a scope cycle is added by @tbennun in https://github.com/spcl/dace/pull/1559
  • Adds support for ArrayView to the Python Frontend by @alexnick83 in https://github.com/spcl/dace/pull/1565
  • It is now possible to suppress output in view() by @philip-paul-mueller in https://github.com/spcl/dace/pull/1566
  • Bump jinja2 from 3.1.3 to 3.1.4 by @dependabot in https://github.com/spcl/dace/pull/1569
  • Correction in the docstring of the SDFG class's init method by @alexnick83 in https://github.com/spcl/dace/pull/1571
  • Fix Subscript literal evaluation for List by @FlorianDeconinck in https://github.com/spcl/dace/pull/1570
  • SDFG.save() now performs tilde expansion. by @philip-paul-mueller in https://github.com/spcl/dace/pull/1578
  • Control Flow Block Constraints by @phschaad in https://github.com/spcl/dace/pull/1476
  • Updated SDFV and Corresponding HTML Template by @phschaad in https://github.com/spcl/dace/pull/1580
  • Changed Xilinx C++11 flag to C++14 by @BenWeber42 in https://github.com/spcl/dace/pull/1585
  • Made dace::math::pow forward to std::pow more generic by @Berke-Ates @philip-paul-mueller @phschaad @BenWeber42 in https://github.com/spcl/dace/pull/1580

New Contributors

  • @jack-mcivor made their first contribution in https://github.com/spcl/dace/pull/1459
  • @kylosus made their first contribution in https://github.com/spcl/dace/pull/1337
  • @luca-patrignani made their first contribution in https://github.com/spcl/dace/pull/1499

Full Changelog: https://github.com/spcl/dace/compare/v0.15.1...v0.16

- Python
Published by BenWeber42 over 1 year ago

dace - v0.15.1

What's Changed

Highlights

  • Option for utilizing GPU global memory by @alexnick83 in https://github.com/spcl/dace/pull/1405
  • Add tensor storage format abstraction by @JanKleine in https://github.com/spcl/dace/pull/1392
  • Hierarchical Control Flow / Control Flow Regions by @phschaad in https://github.com/spcl/dace/pull/1404
  • GPU code generation: User-specified block/thread/warp location by @tbennun in https://github.com/spcl/dace/pull/1358
  • Implement loop-based Fortran intrinsics by @mcopik in https://github.com/spcl/dace/pull/1394
  • Change strides move assignment outside if by @Sajohn-CH in https://github.com/spcl/dace/pull/1402
  • Numpy fill accepts also variables by @philip-paul-mueller in https://github.com/spcl/dace/pull/1420
  • Implement writeset underapproximation by @matteonu in https://github.com/spcl/dace/pull/1425
  • Loop Regions by @phschaad in https://github.com/spcl/dace/pull/1407
  • Compress the SDFG generated when failing/invalid for larger codebase by @FlorianDeconinck in https://github.com/spcl/dace/pull/1456
  • Do not serialize non-default fields by default by @tbennun in https://github.com/spcl/dace/pull/1452

Fixes and other improvements:

  • replace |& which is not widely supported by @tim0s in https://github.com/spcl/dace/pull/1399
  • RTL codegen "line" error by @carljohnsen in https://github.com/spcl/dace/pull/1403
  • Bump urllib3 from 2.0.6 to 2.0.7 by @dependabot in https://github.com/spcl/dace/pull/1400
  • Bugfixes and extended testing for Fortran SUM by @mcopik in https://github.com/spcl/dace/pull/1390
  • Remove erroneous file creation in test by @JanKleine in https://github.com/spcl/dace/pull/1411
  • Fix for VS Code debug console: view opens sdfg in VS Code and not in browser by @kotsaloscv in https://github.com/spcl/dace/pull/1419
  • Bump werkzeug from 2.3.5 to 3.0.1 by @dependabot in https://github.com/spcl/dace/pull/1409
  • AugAssignToWCR: Support for more cases and increased test coverage by @lukastruemper in https://github.com/spcl/dace/pull/1359
  • Implement Subsetlist and covers_precise by @matteonu in https://github.com/spcl/dace/pull/1412
  • OTFMapFusion: Bugfix for tasklets with None connectors by @lukastruemper in https://github.com/spcl/dace/pull/1415
  • Better mangling of the state struct in the code generator by @philip-paul-mueller in https://github.com/spcl/dace/pull/1413
  • Trivial map elimination init by @Sajohn-CH in https://github.com/spcl/dace/pull/1353
  • Fixed Improper Method Call: Replaced mktemp by @fazledyn-or in https://github.com/spcl/dace/pull/1428
  • Symbol specialization in auto_optimizer() never took effect. by @philip-paul-mueller in https://github.com/spcl/dace/pull/1410
  • Issue a warning when to_sdfg() ignores the auto_optimize flag (Issue #1380). by @philip-paul-mueller in https://github.com/spcl/dace/pull/1395
  • Fix schedule tree conversion for use of arrays in conditions by @tbennun in https://github.com/spcl/dace/pull/1440
  • Fixes for TaskletFusion, AugAssignToWCR and MapExpansion by @lukastruemper in https://github.com/spcl/dace/pull/1432
  • AugAssignToWCR: Minor fix for node not found error by @lukastruemper in https://github.com/spcl/dace/pull/1447
  • OTFMapFusion: Minor bug fixes by @lukastruemper in https://github.com/spcl/dace/pull/1448
  • Fix three issues related to deepcopying elements by @tbennun in https://github.com/spcl/dace/pull/1446
  • Fix CUDA high-dimensional test by @tbennun in https://github.com/spcl/dace/pull/1441
  • SDFG.arg_names was not a member but a class variable. by @philip-paul-mueller in https://github.com/spcl/dace/pull/1457
  • PruneConnectors: Fission into separate states before pruning by @lukastruemper in https://github.com/spcl/dace/pull/1451
  • In-out connector's global source when connector becomes out-only at outer SDFG scopes. by @alexnick83 in https://github.com/spcl/dace/pull/1463
  • Fix two regressions in v0.15 by @tbennun in https://github.com/spcl/dace/pull/1465
  • Fix codegen with data access on inter-state edge by @edopao in https://github.com/spcl/dace/pull/1434

New Contributors

  • @kotsaloscv made their first contribution in https://github.com/spcl/dace/pull/1419
  • @matteonu made their first contribution in https://github.com/spcl/dace/pull/1412
  • @philip-paul-mueller made their first contribution in https://github.com/spcl/dace/pull/1413
  • @fazledyn-or made their first contribution in https://github.com/spcl/dace/pull/1428

Full Changelog: https://github.com/spcl/dace/compare/v0.15...v0.15.1rc1

- Python
Published by BenWeber42 about 2 years ago

dace - v0.15

What's Changed

Work-Depth / Average Parallelism Analysis by @hodelcl in #1363 and #1327

A new analysis engine allows SDFGs to be statically analyzed for work and depth / average parallelism. The analysis allows specifying a series of assumptions about symbolic program parameters that can help simplify and improve the analysis results. The following example shows how to use the analysis:

```Python
from dace.sdfg.work_depth_analysis import work_depth

# A dictionary mapping each SDFG element to a tuple (work, depth)
work_depth_map = {}
# Assumptions about symbolic parameters
assumptions = ['N>5', 'M<200', 'K>N']
work_depth.analyze_sdfg(mysdfg, work_depth_map, work_depth.get_tasklet_work_depth, assumptions)

# A dictionary mapping each SDFG element to its average parallelism
average_parallelism_map = {}
work_depth.analyze_sdfg(mysdfg, average_parallelism_map, work_depth.get_tasklet_avg_par, assumptions)
```
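For intuition about the quantities involved, here is a toy sketch that does not use DaCe. It assumes the usual work-depth model definitions: work is the total cost of all operations, depth is the cost of the longest dependency chain, and average parallelism is their ratio. The task graph and its costs are made up, and unlike DaCe's analysis they are concrete numbers rather than symbolic expressions.

```python
from functools import lru_cache

# Hypothetical task graph: node -> (cost, successors).
GRAPH = {
    "load": (1, ["mul_a", "mul_b"]),
    "mul_a": (2, ["add"]),
    "mul_b": (2, ["add"]),
    "add": (1, []),
}


def work(graph):
    """Total cost over all nodes."""
    return sum(cost for cost, _ in graph.values())


def depth(graph):
    """Cost of the most expensive dependency chain (critical path)."""
    @lru_cache(maxsize=None)
    def longest_from(node):
        cost, succs = graph[node]
        return cost + max((longest_from(s) for s in succs), default=0)

    return max(longest_from(n) for n in graph)


w, d = work(GRAPH), depth(GRAPH)
print(w, d, w / d)  # 6 4 1.5
```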

Symbol parameter reduction in generated code (#1338, #1344)

To improve our integration with external codes, we limit the symbolic parameters generated by DaCe to only the used symbols. Take the following code for example:

```python
@dace
def addone(a: dace.float64[N]):
    for i in dace.map[0:10]:
        a[i] += 1
```

Since the internal code does not actually need N to process the array, it will not appear in the generated code. Before this release the signature of the generated code would be:

```cpp
DACE_EXPORTED void __program_addone(addone_t *__state, double * __restrict__ a, int N);
```

After this release it is:

```cpp
DACE_EXPORTED void __program_addone(addone_t *__state, double * __restrict__ a);
```

Note that this is a major, breaking change: users who manually interact with the generated .so files must adapt to the new signatures.

Externally-allocated memory (workspace) support (#1294)

A new allocation lifetime, dace.AllocationLifetime.External, has been introduced into DaCe. Now you can use your DaCe code with external memory allocators (such as PyTorch) and ask DaCe for: (a) how much transient memory it will need; and (b) to use a specific pre-allocated pointer. Example:

```python
@dace
def some_workspace(a: dace.float64[N]):
    workspace = dace.ndarray([N], dace.float64, lifetime=dace.AllocationLifetime.External)
    workspace[:] = a
    workspace += 1
    a[:] = workspace

csdfg = some_workspace.to_sdfg().compile()

sizes = csdfg.get_workspace_sizes()  # Returns {dace.StorageType.CPU_Heap: N*8}
wsp = ...  # Allocate externally
csdfg.set_workspace(dace.StorageType.CPU_Heap, wsp)
```

The same interface is available in the generated code:

```cpp
size_t __dace_get_external_memory_size_CPU_Heap(programname_t *__state, int N);
void __dace_set_external_memory_CPU_Heap(programname_t *__state, char *ptr, int N);
// or GPU_Global...
```

Schedule Trees (EXPERIMENTAL, #1145)

An experimental feature that allows you to analyze your SDFGs in a schedule-oriented format. It takes in SDFGs (even after applying transformations) and outputs a tree of elements that can be printed out in a Python-like syntax. For example:

```python
@dace.program
def matmul(A: dace.float32[10, 10], B: dace.float32[10, 10], C: dace.float32[10, 10]):
    for i in range(10):
        for j in dace.map[0:10]:
            atile = dace.define_local([10], dace.float32)
            atile[:] = A[i]
            for k in range(10):
                with dace.tasklet:
                    # ...
sdfg = matmul.to_sdfg()

from dace.sdfg.analysis.schedule_tree.sdfg_to_tree import as_schedule_tree
stree = as_schedule_tree(sdfg)
print(stree.as_string())
```

will print:

```python
for i = 0; (i < 10); i = i + 1:
  map j in [0:10]:
    atile = copy A[i, 0:10]
    for k = 0; (k < 10); k = (k + 1):
      C[i, j] = tasklet(atile[k], B(10) [k, j], C[i, j])
```

There are some new transformation classes and passes in dace.sdfg.analysis.schedule_tree.passes, for example, to remove empty control flow scopes:

```python
class RemoveEmptyScopes(tn.ScheduleNodeTransformer):
    def visit_scope(self, node: tn.ScheduleTreeScope):
        if len(node.children) == 0:
            return None
        return self.generic_visit(node)
```

We hope you find new ways to analyze and optimize DaCe programs with this feature!
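To show how a transformer of this shape removes nodes, the following self-contained sketch mimics the pattern with stand-in classes. `Node` and `Transformer` are hypothetical and are not DaCe's schedule tree classes; the assumption, borrowed from the common node-transformer pattern, is that a visit method returning None drops the node from its parent.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Stand-in for a schedule tree node (hypothetical, not DaCe's class)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    is_scope: bool = False


class Transformer:
    """Minimal visitor: a visit method returning None deletes the node."""

    def visit(self, node: Node) -> Optional[Node]:
        if node.is_scope:
            return self.visit_scope(node)
        return self.generic_visit(node)

    def visit_scope(self, node: Node) -> Optional[Node]:
        return self.generic_visit(node)

    def generic_visit(self, node: Node) -> Optional[Node]:
        visited = (self.visit(child) for child in node.children)
        node.children = [child for child in visited if child is not None]
        return node


class RemoveEmptyScopes(Transformer):
    def visit_scope(self, node: Node) -> Optional[Node]:
        if len(node.children) == 0:
            return None
        return self.generic_visit(node)


tree = Node("root", [
    Node("for i", [Node("tasklet")], is_scope=True),
    Node("map j", [], is_scope=True),
])
result = RemoveEmptyScopes().visit(tree)
print([child.label for child in result.children])  # ['for i']
```

In this sketch the emptiness check runs before the children are visited, so a scope that becomes empty only after its own children are removed survives a single pass.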

Other Major Changes

  • Support for tensor linear algebra (transpose, dot products) by @alexnick83 in #1309
  • (Experimental) support for nested data containers and structures by @alexnick83 in #1324
  • (Experimental) basic support for mpi4py syntax by @alexnick83 and @Com1t in #1070 and #1288
  • (Experimental) Added support for a subset of F77 and F90 language features by @acalotoiu and @mcopik in #1275, #1293, #1349 and #1367

Minor Changes

  • Support for Python 3.12 by @alexnick83 in #1386
  • Support attributes in symbolic expressions by @tbennun in #1369
  • GPU User Experience Improvements by @tbennun in #1283
  • State Fusion Extension with happens before dependency edge by @acalotoiu in #1268
  • Add CPU_Persistent map schedule (OpenMP parallel regions) by @tbennun in #1330

Fixes and Smaller Changes:

  • Fix transient bug in test with array_equal of empty arrays by @tbennun in #1374
  • Fixes GPUTransform bug when data are already in GPU memory by @alexnick83 in #1291
  • Fixed erroneous parsing of data slices when the data are defined inside a nested scope by @alexnick83 in #1287
  • Disable OpenMP sections by default by @tbennun in #1282
  • Make SDFG.name a proper property by @phschaad in #1289
  • Refactor and fix performance regression with GPU runtime checks by @tbennun in #1292
  • Fixed RW dependency violation when accessing data attributes by @alexnick83 in #1296
  • Externally-managed memory lifetime by @tbennun in #1294
  • External interaction fixes by @tbennun in #1301
  • Improvements to RefineNestedAccess by @alexnick83 and @Sajohn-CH in #1310
  • Fixed erroneous parsing of while-loop conditions by @alexnick83 in #1313
  • Improvements to MapFusion when the Map bodies contain NestedSDFGs by @alexnick83 in #1312
  • Fixed erroneous code generation of indirected accesses by @alexnick83 in #1302
  • RefineNestedAccess take indices into account when checking for missing free symbols by @Sajohn-CH in #1317
  • Fixed SubgraphFusion erroneously removing/merging intermediate data nodes by @alexnick83 in #1307
  • Fixed SDFG DFS traversal missing InterstateEdges by @alexnick83 in #1320
  • Frontend now uses the AST nodes' context to infer read/write accesses by @alexnick83 in #1297
  • Added capability for non-strict shape validation by @alexnick83 in #1321
  • Fixes for persistent schedule and GPUPersistentFusion transformation by @tbennun in #1322
  • Relax test for inter-state edges in default schedules by @tbennun in #1326
  • Improvements to inference of an SDFGState's read and write sets by @Sajohn-CH in #1325 and #1329
  • Fixed ArrayElimination pass trying to eliminate data that were already removed in #1314
  • Bump certifi from 2023.5.7 to 2023.7.22 by @dependabot in #1332
  • Fix some underlying issues with tensor core sample by @computablee in #1336
  • Updated hlslib to support Xilinx Vitis >=2022.2 by @carljohnsen in #1340
  • Docs: mention FPGA backend tested with Intel Quartus PRO by @TizianoDeMatteis in #1335
  • Improved validation of NestedSDFG connectors by @alexnick83 in #1333
  • Remove unused global data descriptor shapes from arguments by @tbennun in #1338
  • Fixed Scalar data validation in NestedSDFGs by @alexnick83 in #1341
  • Fix for None set properties by @tbennun in #1345
  • Add Object to defined types in code generation and some documentation by @tbennun in #1343
  • Fix symbolic parsing for ternary operators by @tbennun in #1346
  • Fortran fix memlet indices by @Sajohn-CH in #1342
  • Have memory type as argument for fpga auto interleave by @TizianoDeMatteis in #1352
  • Eliminate extraneous branch-end gotos in code generation by @tbennun in #1355
  • TaskletFusion: Fix additional edges in case of none-connectors by @lukastruemper in #1360
  • Fix dynamic memlet propagation condition by @tbennun in #1364
  • Configurable GPU thread/block index types, minor fixes to integer code generation and GPU runtimes by @tbennun in #1357

New Contributors

  • @computablee made their first contribution in #1290
  • @Com1t made their first contribution in #1288
  • @mcopik made their first contribution in #1349

Full Changelog: https://github.com/spcl/dace/compare/v0.14.4...v0.15

- Python
Published by tbennun about 2 years ago

dace - DaCe 0.14.4

Minor release; adds support for Python 3.11.

- Python
Published by tbennun over 2 years ago

dace - DaCe 0.14.3

What's Changed

Scope Schedules

The schedule type of a scope (e.g., a Map) is now also determined by the surrounding storage. If the surrounding storage is ambiguous, dace will fail with a nice exception. This means that codes such as the one below:

```Python
@dace.program
def add(a: dace.float32[10, 10] @ dace.StorageType.GPU_Global,
        b: dace.float32[10, 10] @ dace.StorageType.GPU_Global):
    return a + b @ b
```

will now automatically run the + and @ operators on the GPU.

(#1262 by @tbennun)
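The rule described in this section (the schedule follows the surrounding storage, and ambiguity is an error) can be pictured with a toy model. This is only a sketch under simplifying assumptions: the `Storage` and `Schedule` enums and `infer_schedule` are hypothetical stand-ins, not DaCe's types or its actual inference logic.

```python
from enum import Enum, auto
from typing import Iterable


class Storage(Enum):
    """Hypothetical stand-in for storage types."""
    CPU_Heap = auto()
    GPU_Global = auto()


class Schedule(Enum):
    """Hypothetical stand-in for schedule types."""
    CPU_Multicore = auto()
    GPU_Device = auto()


def infer_schedule(surrounding: Iterable[Storage]) -> Schedule:
    """Pick a schedule from the storage of the surrounding data, or fail."""
    kinds = set(surrounding)
    if kinds == {Storage.GPU_Global}:
        return Schedule.GPU_Device
    if kinds == {Storage.CPU_Heap}:
        return Schedule.CPU_Multicore
    # Mixed (or missing) storage: refuse to guess
    raise ValueError(f'Ambiguous surrounding storage: {sorted(k.name for k in kinds)}')


gpu_schedule = infer_schedule([Storage.GPU_Global, Storage.GPU_Global])
```

With both operands in GPU global memory, as in the `add` example above, the toy rule selects the GPU schedule; mixing CPU and GPU storage raises an error instead.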

DaCe Profiler

Easier interface for profiling applications: dace.profile and dace.instrument can now be used within Python with a simple API:

```Python
with dace.profile(repetitions=100) as profiler:
    some_program(...)
    # ...
    other_program(...)

# Print all execution times of the last called program (other_program)
print(profiler.times[-1])
```

Where instrumentation is applied can be controlled with filters in the form of strings and wildcards, or with a function:

```Python
with dace.instrument(dace.InstrumentationType.GPU_Events, filter='*add??') as profiler:
    some_program(...)
    # ...
    other_program(...)

# Print instrumentation report for last call
print(profiler.reports[-1])
```

With dace.builtin_hooks.instrument_data, the same technique can be applied to instrument data containers.

(#1197 by @tbennun)
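To get a feel for a wildcard filter such as `'*add??'`, the sketch below uses Python's standard `fnmatch` module. It assumes shell-style semantics (`*` matches any sequence, `?` matches exactly one character); the notes above only say "strings and wildcards", so both the exact matching rules and the element names used here are assumptions for illustration.

```python
from fnmatch import fnmatchcase
from typing import List


def select_by_filter(names: List[str], pattern: str) -> List[str]:
    """Return the names matching a shell-style wildcard pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical element names, chosen only to exercise the pattern
names = ['vector_add_0', 'add', 'my_add42', 'multiply_7']
selected = select_by_filter(names, '*add??')
```

Under these assumptions, only names that end in `add` followed by exactly two more characters are selected.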

Improved Data Instrumentation

Data container instrumentation can now also be used conditionally, so that data container contents are saved and restored only if certain conditions are met. In addition, data instrumentation now saves the SDFG's symbol values at the time of dumping data, allowing an entire SDFG's state / context to be restored from data reports.

(#1202, #1208 by @phschaad)

Restricted SSA for Scalars and Symbols

Two new passes (ScalarFission and StrictSymbolSSA) allow fissioning of scalar data containers (or arrays of size 1) and symbols into separate containers and symbols respectively, based on the scope or reach of writes to them. This is a form of restricted SSA, which performs SSA wherever possible without introducing Phi-nodes. This change is made possible by a set of new analysis passes that provide the scope or reach of each write to scalars or symbols.

(#1198, #1214 by @phschaad)
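The idea of fissioning a reused scalar into separate containers can be illustrated on straight-line code, where no Phi-nodes are ever needed. The following is a toy sketch, not the ScalarFission or StrictSymbolSSA implementation: the statement encoding and the `fission_straight_line` helper are hypothetical, and control flow (where a restricted SSA would leave names untouched if a Phi-node were required) is deliberately not handled.

```python
from typing import Dict, List, Tuple

# A statement is (name written, names read)
Statement = Tuple[str, List[str]]


def fission_straight_line(stmts: List[Statement]) -> List[Statement]:
    """Give every write a fresh versioned name; reads use the latest version."""
    version: Dict[str, int] = {}

    def current(name: str) -> str:
        # Names that were never written (program inputs) keep their name
        return f'{name}_{version[name]}' if name in version else name

    renamed: List[Statement] = []
    for target, reads in stmts:
        new_reads = [current(r) for r in reads]  # Resolve reads before the write
        version[target] = version.get(target, -1) + 1
        renamed.append((current(target), new_reads))
    return renamed


# 'tmp' is written twice with unrelated values, so it can be split in two
program = [('tmp', ['a']), ('b', ['tmp']), ('tmp', ['c']), ('d', ['tmp'])]
result = fission_straight_line(program)
```

After the rewrite, the two uses of `tmp` no longer share a container, which removes the false dependency between the first and second half of the program.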

Extending Cutout Capabilities

SDFG Cutouts can now be taken from more than one state.

Additionally, taking cutouts that only access a subset of a data container (e.g., A[2:5] from a data container A of size N) results in the cutout receiving an "Alibi Node" to represent only that subset of the data (A_cutout[0:3] -> A[2:5], where A_cutout is of size 4). This allows cutouts to be significantly smaller and have a smaller memory footprint, simplifying debugging and localized optimization.

Finally, cutouts now contain an exact description of their input and output configuration. The input configuration is anything that may influence a cutout's behavior and may contain data before the cutout is executed in the context of the original SDFG. Similarly, the output configuration is anything that a cutout writes to, that may be read externally or may influence the behavior of the remaining SDFG. This allows isolating all side effects of changes to a particular cutout, allowing transformations to be tested and verified in isolation and simplifying debugging.

(#1201 by @phschaad)
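The index arithmetic behind the "Alibi Node" example can be written out in a few lines. This sketch assumes inclusive range endpoints, which is what makes A[2:5] correspond to a container of size 4 in the text above; the helper names are hypothetical and not part of the Cutout API.

```python
from typing import Tuple


def alibi_subset(begin: int, end: int) -> Tuple[int, int]:
    """Return (size, offset) of a container covering the inclusive range begin:end."""
    size = end - begin + 1
    return size, begin


def to_cutout_index(index: int, offset: int) -> int:
    """Translate an index into the original container to the smaller container."""
    return index - offset


# A[2:5] in the original SDFG
size, offset = alibi_subset(2, 5)
cutout_range = (to_cutout_index(2, offset), to_cutout_index(5, offset))
```

Here the smaller container has 4 elements and the accessed range becomes 0:3, matching the A_cutout[0:3] -> A[2:5] mapping described above.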

Bug Fixes, Compatibility Improvements, and Other Changes

  • SymPy 1.12 Compatibility by @alexnick83 in https://github.com/spcl/dace/pull/1256
  • GPU Grid-Strided Tiling by @C-TC in https://github.com/spcl/dace/pull/1249
  • Fix MapInterchange for Maps with dynamic inputs by @alexnick83 in https://github.com/spcl/dace/pull/1244
  • Assortment of fixes for dynamic Maps on GPU (dynamic thread blocks) by @alexnick83 in https://github.com/spcl/dace/pull/1246
  • Tuning Compatibility Fixes by @lukastruemper in https://github.com/spcl/dace/pull/1234
  • Inline preprocessor command by @tbennun in https://github.com/spcl/dace/pull/1242
  • unsqueeze_memlet fixes by @alexnick83 in https://github.com/spcl/dace/pull/1203
  • Fix-intermediate-nodes by @alexnick83 in https://github.com/spcl/dace/pull/1212
  • Fix for LoopToMap when applied on multi-nested loops by @alexnick83 in https://github.com/spcl/dace/pull/1207
  • Fix-nested-sdfg-deepcopy by @alexnick83 in https://github.com/spcl/dace/pull/1221
  • Fix integer division in Python frontend by @tbennun in https://github.com/spcl/dace/pull/1196
  • Fix augmented assignment on scalar in condition by @tbennun in https://github.com/spcl/dace/pull/1225
  • Fix internal subscript access if already existed by @tbennun in https://github.com/spcl/dace/pull/1228
  • Fix atomic operation detection for exactly-overlapping ranges by @tbennun in https://github.com/spcl/dace/pull/1230
  • Fix-gpu-transform-copy-out by @alexnick83 in https://github.com/spcl/dace/pull/1231
  • Fix-interstate-free-symbols by @alexnick83 in https://github.com/spcl/dace/pull/1238
  • Fix nested access with nested symbol dependency by @alexnick83 in https://github.com/spcl/dace/pull/1239
  • Fix import in the transformations tutorial. by @lamyiowce in https://github.com/spcl/dace/pull/1210
  • LoopToMap detects shared transients by @alexnick83 in https://github.com/spcl/dace/pull/1200
  • Faster CI and reachability checks for codecov.io by @tbennun in https://github.com/spcl/dace/pull/1213
  • Map-fission-single-data-multi-connectors by @alexnick83 in https://github.com/spcl/dace/pull/1216
  • Add library path to HIP CMake by @tbennun in https://github.com/spcl/dace/pull/1219
  • BatchedMatMul: MKL gemm_batch support by @lukastruemper in https://github.com/spcl/dace/pull/1181

Full Changelog: https://github.com/spcl/dace/compare/v0.14.2...v0.14.3

Please let us know if there are any regressions with this new release.

- Python
Published by phschaad over 2 years ago

dace - DaCe 0.14.2

What's Changed

  • GPU instrumentation support with LIKWID by @lukastruemper
  • New GPU expansion for the Reduce Library Node by @hodelcl
  • CSRMM and CSRMV Library Nodes by @alexnick83, @lukastruemper, and @C-TC
  • New transformations (Temporal Vectorization, HBM Transform) and other FPGA improvements by @carljohnsen, @jnice-81, @sarahtr, and @TizianoDeMatteis
  • AMD GPU-related fixes and rocBLAS GEMM by @tbennun

Full Changelog: https://github.com/spcl/dace/compare/v0.14.1...v0.14.2

- Python
Published by phschaad almost 3 years ago

dace - DaCe 0.14.1

This release of DaCe offers mostly stability fixes for the Python frontend, transformations, and callbacks.

Full Changelog: https://github.com/spcl/dace/compare/v0.14...v0.14.1

- Python
Published by tbennun about 3 years ago

dace - DaCe 0.14

What's Changed

This release brings forth a major change to how SDFGs are simplified in DaCe, using the Simplify pass pipeline. This both improves the performance of DaCe's transformations and introduces new types of simplification, such as dead dataflow elimination.

Please let us know if there are any regressions with this new release.

Features

  • Breaking change: The experimental dace.constant type hint has now achieved stable status and was renamed to dace.compiletime
  • Major change: Only modified configuration entries are now stored in ~/.dace.conf. The SDFG build folders still include the full configuration file. Old .dace.conf files are detected and migrated automatically.
  • Detailed, multi-platform performance counters are now available via native LIKWID instrumentation (by @lukastruemper in https://github.com/spcl/dace/pull/1063). To use, set .instrument to dace.InstrumentationType.LIKWID_Counters
  • GPU Memory Pools are now supported through CUDA's mallocAsync API. To enable, set desc.pool = True on any GPU data descriptor.
  • Map schedule and array storage types can now be annotated directly in Python code (by @orausch in https://github.com/spcl/dace/pull/1088). For example:

```python
import dace
from dace.dtypes import StorageType, ScheduleType

N = dace.symbol('N')

@dace
def add_on_gpu(a: dace.float64[N] @ StorageType.GPU_Global,
               b: dace.float64[N] @ StorageType.GPU_Global):
    # This map will become a GPU kernel
    for i in dace.map[0:N] @ ScheduleType.GPU_Device:
        b[i] = a[i] + 1.0
```

  • Customizing GPU block dimension and OpenMP threading properties per map is now supported
  • Optional arrays (i.e., arrays that can be None) can now be annotated in the code. The simplification pipeline also infers non-optional arrays from their use and can optimize code by eliminating branches. For example:

```python
from typing import Optional

@dace
def optional(maybe: Optional[dace.float64[20]], always: dace.float64[20]):
    always += 1  # "always" is always used, so it will not be optional
    if maybe is None:  # This condition will stay in the code
        return 1
    if always is None:  # This condition will be eliminated in simplify
        return 2
    return 3
```
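The configuration change in the list above (only modified entries are written to ~/.dace.conf) amounts to storing a diff against the defaults. The sketch below shows that idea on nested dictionaries; it is not DaCe's configuration code, and the `modified_entries` helper and the keys and values used are purely illustrative.

```python
from typing import Any, Dict


def modified_entries(config: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the entries of config that differ from defaults (recursively)."""
    diff: Dict[str, Any] = {}
    for key, value in config.items():
        default = defaults.get(key)
        if isinstance(value, dict) and isinstance(default, dict):
            nested = modified_entries(value, default)
            if nested:  # Keep a section only if something inside it changed
                diff[key] = nested
        elif key not in defaults or value != default:
            diff[key] = value
    return diff


# Illustrative defaults and a user configuration with one changed entry
defaults = {'compiler': {'build_type': 'Release', 'use_cache': False}, 'debugprint': False}
config = {'compiler': {'build_type': 'Debug', 'use_cache': False}, 'debugprint': False}
to_store = modified_entries(config, defaults)
```

Storing only the difference keeps the user file small and lets new defaults take effect for every entry the user never touched.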

Minor changes

  • Miscellaneous fixes to transformations and passes
  • Fixes for string literal ("string") use in the Python frontend
  • einsum is now a library node
  • If CMake is already installed, it is now detected and will not be installed through pip
  • Add kernel detection flag by @TizianoDeMatteis in https://github.com/spcl/dace/pull/1061
  • Better support for __array_interface__ objects by @gronerl in https://github.com/spcl/dace/pull/1071
  • Replacements look up base classes by @tbennun in https://github.com/spcl/dace/pull/1080

Full Changelog: https://github.com/spcl/dace/compare/v0.13.3...v0.14

- Python
Published by tbennun over 3 years ago

dace - DaCe 0.13.3

What's Changed

  • Better integration with Visual Studio Code: Calling sdfg.view() inside a VSCode console or debug session will open the file directly in the editor!
  • Code generator for the Snitch RISC-V architecture (by @noah95 and @am-ivanov)
  • Minor hotfixes to Python frontend, transformations, and code generation (with @orausch)

Full Changelog: https://github.com/spcl/dace/compare/v0.13.2...v0.13.3

- Python
Published by tbennun over 3 years ago

dace - DaCe 0.13.2

What's Changed

  • New API for SDFG manipulation: Passes and Pipelines. More about that in the next major release!
  • Various fixes to frontend, type inference, and code generation.
  • Support for more numpy and Python functions: arange, round, etc.
  • Better callback support:
    • Support callbacks with keyword arguments
    • Support literal lists, tuples, sets, and dictionaries in callbacks
  • New transformations: move loop into map, on-the-fly-recomputation map fusion
  • Performance improvements to frontend
  • Better Docker container compatibility via fixes for config files without a home directory
  • Add interface to check whether in a DaCe parsing context in https://github.com/spcl/dace/pull/998:

```python
def potentially_parsed_by_dace():
    if not dace.in_program():
        print('Called by Python interpreter!')
    else:
        print('Compiled with DaCe!')
```

  • Support compressed (gzipped) SDFGs. Loads normally, saves with:

```python
sdfg.save('myprogram.sdfgz', compress=True)  # or just run gzip on your old SDFGs
```
  • SDFV: Add web serving capability by @orausch in https://github.com/spcl/dace/pull/1013. Use for interactively debugging SDFGs on remote nodes with: sdfg.view(8080) (or any other port)
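The remark "or just run gzip on your old SDFGs" in the list above suggests that a compressed SDFG is an ordinary gzip stream of the saved file. Under that assumption, existing files could be converted with the standard library alone; the `compress_sdfg_file` helper is hypothetical, and the file written below is a stand-in JSON document rather than a real SDFG.

```python
import gzip
import json
import os
import shutil
import tempfile


def compress_sdfg_file(src_path: str) -> str:
    """Gzip src_path into a sibling file with a .sdfgz extension; return its path."""
    dst_path = os.path.splitext(src_path)[0] + '.sdfgz'
    with open(src_path, 'rb') as src, gzip.open(dst_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


# Demonstration on a stand-in JSON file (not a real SDFG)
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, 'myprogram.sdfg')
with open(original, 'w') as fp:
    json.dump({'type': 'SDFG', 'nodes': []}, fp)

compressed = compress_sdfg_file(original)

# Reading the compressed file back yields the same content
with gzip.open(compressed, 'rt') as fp:
    restored = json.load(fp)
```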

Full Changelog: https://github.com/spcl/dace/compare/v0.13.1...v0.13.2

- Python
Published by tbennun over 3 years ago

dace - DaCe 0.13.1

What's Changed

  • Python frontend: Bug fixes for closures and callbacks in nested scopes
  • Bug fixes for several transformations (StateFusion, RedundantSecondArray)
  • Fixes for issues with FORTRAN ordering of numpy arrays
  • Python object duplicate reference checks in SDFG validation

Full Changelog: https://github.com/spcl/dace/compare/v0.13...v0.13.1

- Python
Published by tbennun over 3 years ago

dace - DaCe 0.13

New Features

Cutout:

Cutout allows developers to take large DaCe programs and cut out subgraphs reliably to create a runnable sub-program. This sub-program can then be used to check for correctness, benchmark, and transform a part of a program without having to run the full application.

  • Example usage from Python:

```python
def my_method(sdfg: dace.SDFG, state: dace.SDFGState):
    nodes = [n for n in state if isinstance(n, dace.nodes.LibraryNode)]  # Cut every library node
    cut_sdfg: dace.SDFG = cutout.cutout_state(state, *nodes)
    # The cut SDFG now includes each library node and all the necessary arrays to call it with
```

Also available in the SDFG editor.

Data Instrumentation:

Just like node instrumentation for performance analysis, data instrumentation allows users to set access nodes to be saved to an instrumented data report, and loaded later for exact reproducible runs.

  • Data instrumentation natively works with CPU and GPU global memory, so there is no need to copy data back
  • Combined with Cutout, this is a powerful interface to perform local optimizations in large applications with ease!
  • Example use:

```python
import dace
import numpy as np
from dace import nodes

@dace.program
def tester(A: dace.float64[20, 20]):
    tmp = A + 1
    return tmp + 5

sdfg = tester.to_sdfg()
for node, _ in sdfg.all_nodes_recursive():  # Instrument every access node
    if isinstance(node, nodes.AccessNode):
        node.instrument = dace.DataInstrumentationType.Save

A = np.random.rand(20, 20)
result = sdfg(A)

# Get instrumented data from report
dreport = sdfg.get_instrumented_data()
assert np.allclose(dreport['A'], A)
assert np.allclose(dreport['tmp'], A + 1)
assert np.allclose(dreport['__return'], A + 6)
```

Logical Groups:

SDFG elements can now be grouped by any criteria, and they will be colored during visualization by default (by @phschaad).

Changes and Bug Fixes

  • Samples and tutorials have now been updated to reflect the latest API
  • Constants (added with sdfg.add_constant) can now be used as access nodes in SDFGs. The constants are hard-coded into the generated program, so you can run code with the best performance possible.
  • View nodes can now use the views connector to disambiguate which access node is being viewed
  • Python frontend: else clause is now handled in for and while loops
  • Scalars have been removed from the __dace_init generated function signature (by @orausch)
  • Multiple clock signals in the RTL codegen (by @carljohnsen)
  • Various fixes to frontends, transformations, and code generators

Full Changelog available at https://github.com/spcl/dace/compare/v0.12...v0.13

- Python
Published by tbennun almost 4 years ago

dace - DaCe 0.12

API Changes

Important: Pattern-matching transformation API has been significantly simplified. Transformations using the old API must be ported! Summary of changes:

  • Transformations now extend either the SingleStateTransformation or MultiStateTransformation classes instead of using decorators
  • Patterns must be registered as class variables called PatternNodes
  • Nodes in matched patterns can then be accessed in can_be_applied and apply directly using self.nodename
  • The name strict is now replaced with permissive (False by default). Permissive mode allows transformations to match in more cases, but may be dangerous to apply (e.g., create race conditions).
  • can_be_applied is now a method of the transformation
  • The apply method accepts a graph and the SDFG.

Example of using the new API:

```python
import dace
from dace import nodes
from dace.sdfg import utils as sdutil
from dace.transformation import transformation as xf

class ExampleTransformation(xf.SingleStateTransformation):
    # Define pattern nodes
    map_entry = xf.PatternNode(nodes.MapEntry)
    access = xf.PatternNode(nodes.AccessNode)

    # Define matching subgraphs
    @classmethod
    def expressions(cls):
        # MapEntry -> Access
        return [sdutil.node_path_graph(cls.map_entry, cls.access)]

    def can_be_applied(self, graph: dace.SDFGState, expr_index: int, sdfg: dace.SDFG, permissive: bool = False) -> bool:
        # Returns True if the transformation can be applied on a subgraph
        if permissive:  # In permissive mode, we will always apply this transformation
            return True
        return self.map_entry.schedule == dace.ScheduleType.CPU_Multicore

    def apply(self, graph: dace.SDFGState, sdfg: dace.SDFG):
        # Apply the transformation using the SDFG API
        pass
```

Simplifying SDFGs is renamed from sdfg.apply_strict_transformations() to sdfg.simplify()

AccessNodes no longer have an AccessType field.

Other changes

  • More nested SDFG inlining opportunities by default with the multi-state inline transformation
  • Performance optimizations of the DaCe framework (parsing, transformations, code generation) for large graphs
  • Support for Xilinx Vitis 2021.2
  • Minor fixes to transformations and deserialization

Full Changelog: https://github.com/spcl/dace/compare/v0.11.4...v0.12

- Python
Published by tbennun almost 4 years ago

dace - DaCe 0.11.4

What's Changed

  • If a Python call cannot be parsed into a data-centric program, DaCe will automatically generate a callback into Python. Supports CPU arrays and GPU arrays (via CuPy) without copying!
  • Python 3.10 support
  • CuPy arrays are supported when calling @dace.programs in JIT mode
  • Fix various issues in Python frontend and code generation

Full Changelog: https://github.com/spcl/dace/compare/v0.11.3...v0.11.4

- Python
Published by tbennun about 4 years ago

dace - DaCe 0.11.3

What's Changed

  • Minor fixes to exceptions in Python parsing.

Full Changelog: https://github.com/spcl/dace/compare/v0.11.2...v0.11.3

- Python
Published by tbennun about 4 years ago

dace - DaCe 0.11.2

What's Changed

  • Various bug fixes to the Python frontend

Full Changelog: https://github.com/spcl/dace/compare/v0.11.1...v0.11.2

- Python
Published by tbennun about 4 years ago

dace - DaCe 0.11.1

What's Changed

  • More flexible Python frontend: you can now call functions and object methods, use fields and globals in @dace programs! Some examples:
    • There is no need to annotate called functions
    • @dataclass and general object field support
    • Loop unrolling: implicit and explicit (with the dace.unroll generator)
    • Constant folding and explicit constant arguments (with dace.constant as a type hint)
    • Debuggability: all functions (e.g. dace.map, dace.tasklet) work in pure Python as well
    • and many more features
  • NumPy semantics are followed more closely, e.g., subscripts create array views
  • Direct CuPy and torch.tensor integration in @dace program arguments
  • Auto-optimization (preview): use @dace.program(auto_optimize=True, device=dace.DeviceType.CPU) to automatically run some transformations, such as turning loops into parallel maps.
  • ARM SVE code generation support by @sscholbe (#705)
  • Support for MLIR tasklets by @Berke-Ates in (#747)
  • Source Mapping by @benibenj in https://github.com/spcl/dace/pull/756
  • Support for HBM on Xilinx FPGAs by @jnice-81 (#762)

Miscellaneous:

  • Various performance optimizations to calling @dace programs
  • Various bug fixes to transformations, code generator, and frontends

Full Changelog: https://github.com/spcl/dace/compare/v0.10.8...v0.11.1

- Python
Published by tbennun about 4 years ago

dace - DaCe 0.10.8

What's New?

  • Various bug fixes and more stable Python/NumPy frontend
  • Support for running DaCe programs within the Python interpreter
  • (experimental) Support for automatic optimization passes (more coming soon!)

- Python
Published by tbennun over 4 years ago

dace - DaCe 0.10

What's New?

  • Python frontend improvements: More Python features are supported, such as return values, tuples, and numpy broadcasting. @dace.programs can now call other programs or SDFGs.
  • AMD GPU (HIP) Support: AMD GPUs are now fully supported with HIP code generation.
  • Easy-to-use transformation APIs: Apply transformation compositions with one call, enumerate subgraph matches manually, and many more functions now available as part of the dace API. See the new tutorial for examples.
  • Faster code generation: Backends now generate lower-level code that is more compiler-friendly.
  • Instrumentation interface: Setting the instrument property for SDFG nodes and states enables easy-to-use, localized performance reporting with timers, GPU events, and PAPI performance counters.
  • DaCe VSCode plugin: Interactive SDFG viewer and optimizer as part of Visual Studio Code. Download the plugin here.
  • Type inference and connector types: In addition to automatic type inference, connectors on nodes can now be defined with explicit types, giving more fine-grained control over type reinterpreting and vector types.
  • Subgraph transformations: New transformation type that can work on arbitrary subgraphs. For example, fuse any computation within a state with SubgraphFusion.
  • Persistent GPU kernel schedule: Launch persistent kernels with a change of a property! The proportion of GPU multiprocessors used is configurable.
  • More transformations: Loop manipulation and other new transformations now available with DaCe. Some transformations (such as Vectorization) made more robust to corner cases.
  • More tools: Use sdfgcc to quickly compile and optimize .sdfg files from the command line, generating header and library files. Great for interoperability and Makefiles.
  • Short DaCe annotation: Data-centric functions can now be annotated with @dace.
  • Many minor fixes and additions: More library nodes (such as einsum) and new properties added, enabling faster performance and more productive high-performance coding than ever.

- Python
Published by tbennun over 5 years ago

dace - DaCe 0.9.5

What's New?

  • Intel FPGA backend: Generates and compiles Intel FPGA OpenCL code from SDFGs.
  • Renderer: Many improvements to the scalability of drawing large SDFGs, touch/mobile support, and code view upon zooming into Tasklets.
  • SDFV: Now includes a sidebar with information about clicked nodes/edges/states.
  • GPU reduction: Now supports Reduce nodes where the output array contains multiple dimensions (if contiguous). In other cases, use the ReduceExpansion transformation.
  • Faster compilation: Improved CMake usage to speed up compilation time if files were not changed.
  • Stability: Various fixes to the Python frontend, transformations, code generation, and DIODE (on Linux and Windows).
  • Generated programs now include a header (.h) file and an example C program that invokes the compiled SDFG.

- Python
Published by tbennun almost 6 years ago

dace - DaCe 0.9

What's New

  • NumPy syntax for Python: Wrap Python functions that work on numpy arrays with @dace.program and create SDFGs from implicit dataflow.
  • DIODE 2.0: DIODE has been reworked to operate in the browser, and works natively on Windows. Note that it is currently experimental, and some features may cause errors. We are happy to fix bugs if you find and report issues!
  • Standalone SDFG renderer (SDFV) and improved Jupyter support: Contextual, optimized SDFG drawing with collapsible scopes (double-click a map, a state, or a nested SDFG). Fully integrated into Jupyter notebooks.
  • Transformations: Improvements to scalability of subgraph pattern matching and memlet propagation.
  • Improvements to the TensorFlow frontend.
  • Many minor bug fixes and several API improvements.

- Python
Published by tbennun about 6 years ago

dace - DaCe 0.8.1

Initial release of DaCe.

- Python
Published by tbennun over 6 years ago