Recent Releases of heat

heat - Heat v1.5.1 - Support for torch 2.6 and bug fixes

Changes

Compatibility

  • #1775 Support PyTorch 2.6.0 (by @mrfh92)

Bug Fixes

Contributors

@ClaudiaComito, @JuanPedroGHM, @joernhees, @mrfh92, and @mtar

- Python
Published by github-actions[bot] over 1 year ago

heat - Heat 1.5 Release: distributed matrix factorization and more

Heat 1.5 Release Notes


Overview

With Heat 1.5 we release the first set of features developed within the ESAPCA project funded by the European Space Agency (ESA).

The main focus of this release is on distributed linear algebra operations, such as tall-skinny SVD, batch matrix multiplication, and triangular solver. We also introduce vectorization via vmap across MPI processes, and batch-parallel random number generation as default for distributed operations.

This release also includes a new class for distributed Compressed Sparse Column matrices, paving the way for future implementation of distributed sparse matrix multiplication.

On the performance side, our new array redistribution via MPI Custom Datatypes provides significant speed-up in operations that require it, such as FFTs (see Dalcin et al., 2018).

We are grateful to our community of users, students, open-source contributors, the European Space Agency and the Helmholtz Association for their support and feedback.

Highlights

  • [ESAPCA] Distributed tall-skinny SVD: ht.linalg.svd (by @mrfh92)
  • Distributed batch matrix multiplication: ht.linalg.matmul (by @FOsterfeld)
  • Distributed solver for triangular systems: ht.linalg.solve_triangular (by @FOsterfeld)
  • Vectorization across MPI processes: ht.vmap (by @mrfh92)

Other Changes

Performance Improvements

  • #1493 Redistribution speed-up via MPI Custom Datatypes available by default in ht.resplit (by @JuanPedroGHM)

Sparse

  • #1377 New class: Distributed Compressed Sparse Column Matrix ht.sparse.DCSC_matrix() (by @Mystic-Slice)

Signal Processing

  • #1515 Support batch 1-d convolution in ht.signal.convolve (by @ClaudiaComito)

RNG

  • #1508 Introduce batch-parallel RNG as default for distributed operations (by @mrfh92)

Statistics

  • #1420 Support sketched percentile/median for large datasets with ht.percentile(sketched=True) (and ht.median) (by @mrfh92)
  • #1510 Support multiple axes for distributed ht.percentile and ht.median (by @ClaudiaComito)
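
The general idea behind a sketched percentile or median can be illustrated without Heat: estimate the statistic from a random subsample instead of the full data. The following is a minimal single-process sketch in plain Python. The helper `sketched_median` and its `sketch_fraction` parameter are hypothetical names chosen for illustration; they are not Heat's API, and Heat's actual implementation may differ.

```python
import random
import statistics


def sketched_median(data, sketch_fraction=0.1, seed=0):
    """Estimate the median from a random subsample (hypothetical helper, not Heat's API)."""
    rng = random.Random(seed)
    # Use at least one element and at most the full data set.
    k = max(1, min(len(data), int(len(data) * sketch_fraction)))
    sample = rng.sample(data, k)
    return statistics.median(sample)


# Synthetic data: 100,000 draws from a standard normal distribution.
data_rng = random.Random(42)
data = [data_rng.gauss(0.0, 1.0) for _ in range(100_000)]

exact = statistics.median(data)
approx = sketched_median(data, sketch_fraction=0.05)
print(f"exact={exact:.4f} sketched={approx:.4f}")
```

The subsample trades a small statistical error for far less data to sort, which is why this kind of approach is attractive for large datasets.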

Manipulations

  • #1419 Implement distributed unfold operation (by @FOsterfeld)

I/O

  • #1602 Improve load balancing when loading .npy files from path (by @Reisii)
  • #1551 Improve load balancing when loading .csv files from path (by @Reisii)
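
What "load balanced" means here can be sketched as follows: when n rows are divided among p processes, the per-process row counts should differ by at most one. This is a plain-Python illustration under that assumption, not the code of Heat's loaders; `balanced_chunks` is a hypothetical helper name.

```python
def balanced_chunks(n_rows, n_procs):
    """Return one (offset, count) pair per process; counts differ by at most one."""
    base, remainder = divmod(n_rows, n_procs)
    chunks = []
    offset = 0
    for rank in range(n_procs):
        # The first `remainder` ranks each take one extra row.
        count = base + (1 if rank < remainder else 0)
        chunks.append((offset, count))
        offset += count
    return chunks


chunks = balanced_chunks(10, 4)
print(chunks)  # [(0, 3), (3, 3), (6, 2), (8, 2)]
```

Each process can then read only the rows in its own (offset, count) range.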

Machine Learning

  • #1593 Improved batch-parallel clustering ht.cluster.BatchParallelKMeans and ht.cluster.BatchParallelKMedians (by @mrfh92)

Deep Learning

  • #1529 Make dataset.ishuffle optional. (by @krajsek)

Other Updates

  • #1618 Support mpi4py 4.x.x (by @JuanPedroGHM)

Contributors

@mrfh92, @FOsterfeld, @JuanPedroGHM, @Mystic-Slice, @ClaudiaComito, @Reisii, @mtar and @krajsek

- Python
Published by ClaudiaComito over 1 year ago

heat - Heat 1.5.0-rc1: Pre-Release

Changes

Cluster

  • #1593 Improved Batch Parallelization. (by @mrfh92)

Data

  • #1529 Make dataset.ishuffle optional. (by @krajsek)

IO

  • #1602 Improved load balancing when loading .npy files from path. (by @Reisii)
  • #1551 Improved load balancing when loading .csv files from path. (by @Reisii)

Linear Algebra

  • #1261 Batched matrix multiplication. (by @FOsterfeld)
  • #1504 Add solver for triangular systems. (by @FOsterfeld)

Manipulations

  • #1419 Implement distributed unfold operation. (by @FOsterfeld)

Random

  • #1508 Introduce Batchparallel for RNG as default. (by @mrfh92)

Signal

  • #1515 Support batch 1-d convolution in ht.signal.convolve. (by @ClaudiaComito)

Statistics

  • #1510 Support multiple axes for ht.percentile. (by @ClaudiaComito)

Sparse

  • #1377 Distributed Compressed Sparse Column Matrix. (by @Mystic-Slice)

Other

  • #1618 Support mpi4py 4.x.x (by @JuanPedroGHM)

Contributors

@ClaudiaComito, @FOsterfeld, @JuanPedroGHM, @Reisii, @mrfh92, @mtar and @krajsek

- Python
Published by github-actions[bot] over 1 year ago

heat - Heat 1.4.2 - Maintenance Release

Changes

Interoperability

  • #1467, #1525 Support PyTorch 2.3.1 (by @mtar)
  • #1535 Address test failures after netCDF4 1.7.1, numpy 2 releases (by @ClaudiaComito)

Contributors

@ClaudiaComito, @mrfh92 and @mtar

- Python
Published by github-actions[bot] almost 2 years ago

heat - Heat 1.4.1: Bug fix release

Changes

Bug fixes

  • #1472 DNDarrays returned by _like functions default to same device as input DNDarray (by @mrfh92, @ClaudiaComito)

Maintenance

  • #1441 added names of non-core members in citation file (by @mrfh92)

Contributors

@ClaudiaComito and @mrfh92

- Python
Published by github-actions[bot] about 2 years ago

heat - Interactive HPC tutorials, distributed FFT, batch-parallel clustering, support PyTorch 2.2.2

Changes

Documentation

  • #1406 New tutorials for interactive parallel mode for both HPC and local usage (by @ClaudiaComito)

Features

  • #1288 Batch-parallel K-means and K-medians (by @mrfh92)
  • #1228 Introduce in-place-operators for arithmetics.py (by @LScheib)
  • #1218 Distributed Fast Fourier Transforms (by @ClaudiaComito)

Bug fixes

  • #1363 ht.array constructor respects implicit torch device when copy is set to false (by @JuanPedroGHM)
  • #1216 Avoid unnecessary gathering of distributed operand (by @samadpls)
  • #1329 Refactoring of QR: stabilized Gram-Schmidt for split=1 and TS-QR for split=0 (by @mrfh92)

Interoperability

  • #1418 and #1290: Support PyTorch 2.2.2 (by @mtar)
  • #1315 and #1337: Fix some NumPy deprecations in the core and statistics tests (by @FOsterfeld)

Contributors

@ClaudiaComito, @FOsterfeld, @JuanPedroGHM, @LScheib, @mrfh92, @mtar, @samadpls

- Python
Published by github-actions[bot] about 2 years ago

heat - Bug fixes, Docker documentation update

Bug fixes

  • #1259 Bug-fix for ht.regression.Lasso() on GPU (by @mrfh92)
  • #1201 Fix ht.diff for 1-element-axis edge case (by @mtar)

Changes

Interoperability

  • #1257 Docker release 1.3.x update (by @JuanPedroGHM)

Maintenance

  • #1274 Update version before release (by @ClaudiaComito)
  • #1267 Unit tests: Increase tolerance for ht.allclose on ht.inv operations for all torch versions (by @ClaudiaComito)
  • #1266 Sync pre-commit configuration with main branch (by @ClaudiaComito)
  • #1264 Fix PyTorch release tracking workflows (by @mtar)
  • #1234 Update sphinx package requirements (by @mtar)
  • #1187 Create configuration file for Read the Docs (by @mtar)

Contributors

@ClaudiaComito, @JuanPedroGHM, @bhagemeier, @mrfh92 and @mtar

- Python
Published by github-actions[bot] over 2 years ago

heat - Scalable SVD, GSoC'22 contributions, Docker image, PyTorch 2 support, AMD GPUs acceleration

This release includes many important updates (see below). We would particularly like to thank our enthusiastic GSoC2022 / tentative GSoC2023 contributors @Mystic-Slice @neosunhan @Sai-Suraj-27 @shahpratham @AsRaNi1 @Ishaan-Chandak. Thank you so much!

Highlights

  • #1155 Support PyTorch 2.0.1 (by @ClaudiaComito)
  • #1152 Support AMD GPUs (by @mtar)
  • #1126 Distributed hierarchical SVD (by @mrfh92)
  • #1028 Introducing the sparse module: Distributed Compressed Sparse Row Matrix (by @Mystic-Slice)
  • Performance improvements:
    • #1125 distributed heat.reshape() speed-up (by @ClaudiaComito)
    • #1141 heat.pow() speed-up when exponent is int (by @ClaudiaComito @coquelin77)
    • #1119 heat.array() default to copy=None (i.e., only if necessary) (by @ClaudiaComito @neosunhan)
  • #970 Dockerfile and accompanying documentation (by @bhagemeier)

Changelog

Array-API compliance / Interoperability

  • #1154 Introduce DNDarray.__array__() method for interoperability with numpy, xarray (by @ClaudiaComito)
  • #1147 Adopt NEP29, drop support for PyTorch 1.7, Python 3.6 (by @mtar)
  • #1119 ht.array() default to copy=None (i.e., only if necessary) (by @ClaudiaComito)
  • #1020 Implement broadcast_arrays, broadcast_to (by @neosunhan)
  • #1008 API: Rename keepdim kwarg to keepdims (by @neosunhan)
  • #788 Interface for DPPY interoperability (by @coquelin77 @fschlimb)

New Features

  • #1126 Distributed hierarchical SVD (by @mrfh92)
  • #1020 Implement broadcast_arrays, broadcast_to (by @neosunhan)
  • #983 Signal processing: fully distributed 1D convolution (by @shahpratham)
  • #1063 add eq to Device (by @mtar)

Bug Fixes

  • #1141 heat.pow() speed-up when exponent is int (by @ClaudiaComito)
  • #1136 Fixed PyTorch version check in sparse module (by @Mystic-Slice)
  • #1098 Validates number of dimensions in input to ht.sparse.sparse_csr_matrix (by @Ishaan-Chandak)
  • #1095 Convolve with distributed kernel on multiple GPUs (by @shahpratham)
  • #1094 Fix division precision error in random module (by @Mystic-Slice)
  • #1075 Fixed initialization of DNDarrays communicator in some routines (by @AsRaNi1)
  • #1066 Verify input object type and layout + Supporting tests (by @Mystic-Slice)
  • #1037 Distributed weighted average() along tuple of axes: shape of weights to match shape of input (by @Mystic-Slice)

Benchmarking

  • #1137 Continuous Benchmarking of runtime (by @JuanPedroGHM)

Documentation

  • #1150 Refactoring for efficiency and readability (by @Sai-Suraj-27)
  • #1130 Reintroduce Quick Start (by @ClaudiaComito)
  • #1079 A better README file (by @Sai-Suraj-27)

Linear Algebra

Contributors

@AsRaNi1, @ClaudiaComito, @Ishaan-Chandak, @JuanPedroGHM, @Mystic-Slice, @Sai-Suraj-27, @bhagemeier, @coquelin77, @mrfh92, @mtar, @neosunhan, @shahpratham

- Python
Published by github-actions[bot] almost 3 years ago

heat - Bug fixes, support OpenMPI>=4.1.2, support PyTorch 1.13.1

Changes

Communication

  • #1058 Fix edge-case contiguity mismatch for Allgatherv (by @ClaudiaComito)

Contributors

@ClaudiaComito, @JuanPedroGHM

- Python
Published by github-actions[bot] over 3 years ago

heat - Support PyTorch 1.13, Lanczos decomposition fix, bug fixes

Changes

  • #1048 Support PyTorch 1.13.0 on branch release/1.2.x (by @github-actions)

๐Ÿ› Bug Fixes

  • #1038 Lanczos decomposition linalg.solver.lanczos: Support double precision, complex data types (by @ClaudiaComito)
  • #1034 ht.array, closed loophole allowing DNDarray construction with incompatible shapes of local arrays (by @Mystic-Slice)

Linear Algebra

  • #1038 Lanczos decomposition linalg.solver.lanczos: Support double precision, complex data types (by @ClaudiaComito)

Testing

  • #1025 mirror repository on gitlab + ci (by @mtar)
  • #1014 fix: set cuda rng state on gpu tests for test_random.py (by @JuanPedroGHM)

Contributors

@ClaudiaComito, @JuanPedroGHM, @Mystic-Slice, @coquelin77, @mtar, @github-actions, @github-actions[bot]

- Python
Published by github-actions[bot] over 3 years ago

heat - v1.2.0: GSoC22, introducing `signal` module, parallel I/O and more

Highlights

  • We have been selected as a mentoring organization for Google Summer of Code, and we already have many new contributors (see below). Thank you!
  • Heat now supports PyTorch 1.11
  • Gearing up to support data-intensive signal processing: introduced signal module and memory-distributed 1-D convolution with ht.convolve()
  • Parallel I/O: you can now parallelize writing out to CSV file with ht.save_csv().
  • Introduced more flexibility in memory-distributed binary operations.
  • Expanded functionalities in linalg, manipulations modules.

What's Changed

  • Bug/825 setitem slice dndarrays by @coquelin77 in https://github.com/helmholtz-analytics/heat/pull/826
  • Features/807 roll by @mtar in https://github.com/helmholtz-analytics/heat/pull/829
  • implement vecdot by @mtar in https://github.com/helmholtz-analytics/heat/pull/840
  • Enhancement/798 logical dndarrray by @mtar in https://github.com/helmholtz-analytics/heat/pull/851
  • add moveaxis by @mtar in https://github.com/helmholtz-analytics/heat/pull/854
  • add swapaxes by @mtar in https://github.com/helmholtz-analytics/heat/pull/853
  • norm implementation by @mtar in https://github.com/helmholtz-analytics/heat/pull/846
  • Features/178 tile by @ClaudiaComito in https://github.com/helmholtz-analytics/heat/pull/673
  • Features/torch proxy by @ClaudiaComito in https://github.com/helmholtz-analytics/heat/pull/856
  • add normal, standard_normal by @mtar in https://github.com/helmholtz-analytics/heat/pull/858
  • add signbit by @mtar in https://github.com/helmholtz-analytics/heat/pull/862
  • add sign, sgn by @mtar in https://github.com/helmholtz-analytics/heat/pull/827
  • vdot implementation by @mtar in https://github.com/helmholtz-analytics/heat/pull/842
  • Bugfix/529 lasso example by @bhagemeier in https://github.com/helmholtz-analytics/heat/pull/876
  • fix binary_op on operands with single element by @mtar in https://github.com/helmholtz-analytics/heat/pull/868
  • conjugate method in DNDarray by @mtar in https://github.com/helmholtz-analytics/heat/pull/885
  • add cross by @mtar in https://github.com/helmholtz-analytics/heat/pull/850
  • Feature/337 determinant by @mtar in https://github.com/helmholtz-analytics/heat/pull/877
  • Features/746 print0 print toggle by @coquelin77 in https://github.com/helmholtz-analytics/heat/pull/816
  • Feature/338 matrix inverse by @mtar in https://github.com/helmholtz-analytics/heat/pull/875
  • randint accept ints for 'size' by @mtar in https://github.com/helmholtz-analytics/heat/pull/916
  • Support PyTorch 1.11.0 by @github-actions in https://github.com/helmholtz-analytics/heat/pull/932
  • 750 save csv v2 by @bhagemeier in https://github.com/helmholtz-analytics/heat/pull/941
  • added duplicate comm by @Dhruv454000 in https://github.com/helmholtz-analytics/heat/pull/940
  • changed documentation small fix by @Dhruv454000 in https://github.com/helmholtz-analytics/heat/pull/956
  • add digitize/bucketize by @mtar in https://github.com/helmholtz-analytics/heat/pull/928
  • Improve save_csv string formatting by @bhagemeier in https://github.com/helmholtz-analytics/heat/pull/948
  • Random: Replaced factories.array with DNDarray by @shahpratham in https://github.com/helmholtz-analytics/heat/pull/960
  • Features/30 convolve by @lucaspataro in https://github.com/helmholtz-analytics/heat/pull/595
  • Add out and where args for ht.div by @neosunhan in https://github.com/helmholtz-analytics/heat/pull/945

New Contributors

  • @Dhruv454000 made their first contribution in https://github.com/helmholtz-analytics/heat/pull/940
  • @shahpratham made their first contribution in https://github.com/helmholtz-analytics/heat/pull/960
  • @neosunhan made their first contribution in https://github.com/helmholtz-analytics/heat/pull/945

Full Changelog: https://github.com/helmholtz-analytics/heat/compare/v1.1.0...v1.2.0

- Python
Published by ClaudiaComito about 4 years ago

heat - Tuning dependencies, minor documentation edits

v1.1.1

  • #864 Dependencies: constrain torchvision version range to match supported pytorch version range.

- Python
Published by ClaudiaComito over 4 years ago

heat - Heat 1.1: distributed slicing/indexing overhaul, dealing with load imbalance, and more

Highlights

  • Slicing/indexing overhaul for a more NumPy-like user experience. Special thanks to Ben Bourgart @ben-bou and the TerrSysMP group for this one. Warning for distributed arrays: breaking change! Indexing one element along the distribution axis now implies the indexed element is communicated to all processes.
  • More flexibility in handling non-load-balanced distributed arrays.
  • More distributed operations, incl. meshgrid.

For other details, see the CHANGELOG.

- Python
Published by ClaudiaComito almost 5 years ago

heat - Heat 1.0: Data Parallel Neural Networks, and more

Release Notes

Heat v1.0 comes with some major updates:

  • new module nn for data-parallel neural networks
  • Distributed Asynchronous and Selective Optimization (DASO) to accelerate network training on multi-GPU architectures
  • support for complex numbers
  • major documentation overhaul
  • support channel on StackOverflow
  • support PyTorch 1.8
  • stop supporting Python 3.6
  • many more updates and bug fixes, check out the CHANGELOG

- Python
Published by ClaudiaComito about 5 years ago

heat - Pinning PyTorch version 1.6 for now, plus bug fixes

We're pinning PyTorch to version 1.6 after having run into problems with the recently released 1.7. This is a temporary solution!

Also, bug fixes:

  • #678 Bug fix: Internal functions now use explicit device parameters for DNDarray and torch.Tensor initializations.
  • #684 Bug fix: distributed reshape now works on booleans as well.

- Python
Published by ClaudiaComito over 5 years ago

heat - v0.5.0 - Distributed kmedian, kmedoids and knn, statistical functions, DNDarray manipulations, random sampling

HeAT 0.5.0 Release Notes

New features

  • Parallel high-level algorithms: more clustering methods with cluster.KMedian, cluster.KMedoids, and one classification method (K-Nearest Neighbors, classification.knn). Also new: Manhattan distance metric (spatial.manhattan).
  • Parallel statistical functions: percentile and median, skew, kurtosis.
  • Parallel DNDarray manipulations: pad, fliplr, rot90, stack, column_stack, row_stack.
  • Parallel linear algebra: outer.
  • Parallel random sampling: random.permutation, random.random_sample, random.random, random.sample, random.ranf, random.random_integer

Performance

- Python
Published by ClaudiaComito over 5 years ago

heat - QR solver, Tensor manipulations, Halos, Spectral clustering, and more

The HeAT v0.4.0 release is now available.

We are striving to be as NumPy-API-compatible as possible while providing MPI-parallelized implementations of all features.

Highlights

  • #429 Submodule for Linear Algebra: Implemented QR, sped-up matrix multiplication
  • #511 New: reshape
  • #518 New: Spectral Clustering
  • #522 Added CUDA-aware MPI detection for MVAPICH, MPICH and ParaStation
  • #535 Introduction of BaseEstimator and clustering, classification and regression mixins
  • #541 Introduction of basic halo scheme for inter-rank operations
  • #558 Added support for PyTorch 1.5.0
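
The halo concept named in #541 can be sketched in a single process: each rank holds a chunk of a distributed array and additionally receives copies of the boundary elements of its neighbours, so that stencil-like operations can run locally. The code below only simulates this with Python lists. `add_halos` is a hypothetical helper, and no MPI communication or Heat API is involved.

```python
def add_halos(chunks, halo_size):
    """Simulate a halo exchange: pad each chunk with its neighbours' boundary elements."""
    if halo_size < 1:
        raise ValueError("halo_size must be at least 1")
    padded = []
    for rank, chunk in enumerate(chunks):
        # Last `halo_size` elements of the previous rank, if there is one.
        prev_halo = chunks[rank - 1][-halo_size:] if rank > 0 else []
        # First `halo_size` elements of the next rank, if there is one.
        next_halo = chunks[rank + 1][:halo_size] if rank < len(chunks) - 1 else []
        padded.append(prev_halo + chunk + next_halo)
    return padded


# A 1-D array of 9 elements split across 3 simulated ranks.
chunks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
padded = add_halos(chunks, halo_size=1)
print(padded)  # [[0, 1, 2, 3], [2, 3, 4, 5, 6], [5, 6, 7, 8]]
```

In a real distributed setting the neighbour slices would be sent and received between processes rather than read from a shared list.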

Other new features

  • Updated documentation theme to "Read the Docs"
  • #429 Implemented a tiling class to create Square tiles along the diagonal of a 2D matrix
  • #496 flipud()
  • #498 flip()
  • #501 flatten()
  • #520 SplitTiles class, computes tiles with theoretical and actual split axes
  • #524 cumsum() & cumprod()
  • #534 eye() now supports all 2D split combinations and matrix configurations.

Bug fixes

  • #483 Underlying torch tensor moves to the correct device on heat.array initialization
  • #483 DNDarray.cpu() changes heat device to cpu
  • #499 MPI datatype mapping: torch.int16 now maps to MPI.SHORT instead of MPI.SHORT_INT
  • #506 setup.py has correct version parsing
  • #507 sanitize_axis changes axis of scalars to None
  • #515 Numpy-API compliance: ht.var() now returns the unadjusted sample variance by default, Bessel's correction can be applied by setting ddof=1.
  • #519 parallel slicing with empty list or scalar as input; nonzero() of empty (process-local) tensor.
  • #520 resplit returns correct values for all split configurations.
  • #521 Added documentation for the generic reduce_op in Heat's core
  • #526 float32 is now consistent default dtype for factories.
  • #531 Tiling objects are not separate from the DNDarray
  • #558 sanitize_memory_layout assumes default memory layout of the input tensor
  • #562 split semantics of ht.squeeze()
  • #567 setitem to ignore split axis differences, exception will come from torch if shapes mismatch
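
The distinction drawn in #515 between the unadjusted variance (ddof=0) and the Bessel-corrected variance (ddof=1) can be checked with Python's standard library, independently of Heat. This is only an illustration of the two definitions: `statistics.pvariance` divides by N, and `statistics.variance` divides by N - 1.

```python
import statistics

samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Unadjusted variance: divide the summed squared deviations by N (ddof=0).
var_ddof0 = statistics.pvariance(samples)

# Bessel-corrected variance: divide by N - 1 instead (ddof=1).
var_ddof1 = statistics.variance(samples)

print(var_ddof0)  # 4.0
print(var_ddof1)  # 32/7, roughly 4.571
```

The mean of the samples is 5 and the squared deviations sum to 32, which gives 32/8 and 32/7 respectively.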

- Python
Published by ClaudiaComito about 6 years ago

heat - v0.3.0

  • #454 Update lasso example
  • #473 Matmul no longer splits any of the input matrices if both have split=None. To toggle splitting of one input for increased speed, use the allow_resplit flag.
  • #473 dot now correctly handles two vectors with split=None
  • #470 Enhancement: Accelerate distance calculations in kmeans clustering by introduction of new module spatial.distance
  • #478 ht.array now typecasts the local torch tensors if the torch tensors given are not the torch version of the specified dtype + unit test updates
  • #479 Completion of spatial.distance module to support 2D input arrays of different splittings (None or 0) and different datatypes, also if second input argument is None

- Python
Published by bhagemeier over 6 years ago

heat - v0.2.2

v0.2.2

This version adds support for PyTorch 1.4.0. There are also several minor feature improvements and bug fixes listed below.

  • #443 added option for neutral elements to be used in the place of empty tensors in reduction operations (operations.__reduce_op) (cf. #369 and #444)
  • #445 var and std both now support iterable axis arguments
  • #452 updated pull request template
  • #465 bug fix: x.unique() returns a DNDarray both in distributed and non-distributed mode (cf. #464)
  • #463 Bugfix: Lasso tests now run with both GPUs and CPUs
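
Why a neutral element is useful for reductions over empty process-local data (#443) can be sketched in plain Python: a process with no local data contributes the identity of the operation, so the combined result is unaffected. The names `NEUTRAL`, `OPS`, `local_reduce` and `global_reduce` are hypothetical and only simulate several processes with lists. They do not reflect Heat's internal operations.__reduce_op.

```python
from functools import reduce
import math
import operator

# Neutral (identity) element and binary operation for each reduction.
NEUTRAL = {"sum": 0, "prod": 1, "min": math.inf, "max": -math.inf}
OPS = {"sum": operator.add, "prod": operator.mul, "min": min, "max": max}


def local_reduce(name, local_chunk):
    """Reduce a process-local chunk; an empty chunk yields the neutral element."""
    return reduce(OPS[name], local_chunk, NEUTRAL[name])


def global_reduce(name, chunks):
    """Combine the per-process partial results with the same operation."""
    partials = [local_reduce(name, chunk) for chunk in chunks]
    return reduce(OPS[name], partials, NEUTRAL[name])


# Three simulated processes; the last one holds no data.
chunks = [[3, 1], [4, 1, 5], []]
print(global_reduce("sum", chunks))   # 14
print(global_reduce("prod", chunks))  # 60
print(global_reduce("min", chunks))   # 1
```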

- Python
Published by bhagemeier over 6 years ago

heat - 0.2.1

v0.2.1

This version fixes the packaging, such that installed versions of HeAT contain all required Python packages.

v0.2.0

This version differs greatly from the previous one (0.1.0): it adds a large amount of functionality and includes many changes. Many functions that already worked now behave more like their NumPy counterparts. Although a large amount of progress has been made, work is still ongoing. We appreciate everyone who uses this package and we work hard to solve the issues which you report to us. Thank you!

Package Requirements

  • python >= 3.5
  • mpi4py >= 3.0.0
  • numpy >= 1.13.0
  • torch >= 1.3.0

Optional Packages

  • h5py >= 2.8.0
  • netCDF4 >= 1.4.0, <= 1.5.2
  • pre-commit >= 1.18.3 (development requirement)

Additions

GPU Support

#415 GPU support was added for this release. To set the default device, use ht.use_device(dev), where dev can be either "gpu" or "cpu". Make sure to specify the device when creating DNDarrays if the desired device differs from the default. If no device is specified, the device is assumed to be "cpu".

Basic Operations

Basic Multi-DNDarray Operations

Developmental

  • Code of conduct
  • Contribution guidelines
    • pre-commit and black checks added to Pull Requests to ensure proper formatting
  • Issue templates
  • #357 Logspace factory
  • #428 lshape map creation
  • Pull Request Template
  • Removal of the ml folder in favor of regression and clustering folders
  • #365 Test suite

Linear Algebra and Statistics

Regression, Clustering, and Misc.

  • #307 lasso regression example
  • #308 kmeans scikit feature completeness
  • #435 Parter matrix

Bug Fixes

  • KMeans bug fixes
    • Working in distributed mode
    • Fixed shape cluster centers for init='kmeans++'
  • _localop now returns proper gshape
  • allgatherv fix -> elements now sorted in the correct order
  • getitem fixes and improvements
  • unique now returns a distributed result if the input was distributed
  • AllToAll on single process now functioning properly
  • optional packages are truly optional for running the unit tests
  • the output of mean and var (and std) now set the correct split axis for the returned DNDarray

- Python
Published by mtar over 6 years ago

heat - 0.0.5-citation

- Python
Published by Markus-Goetz over 7 years ago