Recent Releases of https://github.com/numbagg/numbagg

https://github.com/numbagg/numbagg - 0.9.0

0.9.0 implements our own "dynamic" compilation for grouped functions, in lieu of numba's (which currently doesn't work with parallel functions). This allows us to compile a function only for the types of the current function call's arguments, rather than for all possible types allowed by the function. This speeds up the JIT compilation of a single function call by ~4x for the grouped functions.
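The idea of compiling only for the argument types actually seen can be sketched in plain Python. This is an illustration of the caching pattern, not numbagg's implementation: `fake_compile` and `compile_on_call` are hypothetical names, and the "compile" step here only records the signature.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an expensive JIT compile step; it only records
# which type signatures were "compiled".
compiled_signatures: List[Tuple[type, ...]] = []


def fake_compile(func: Callable, signature: Tuple[type, ...]) -> Callable:
    compiled_signatures.append(signature)
    return func


def compile_on_call(func: Callable) -> Callable:
    """Compile lazily, once per distinct tuple of argument types."""
    cache: Dict[Tuple[type, ...], Callable] = {}

    def wrapper(*args):
        signature = tuple(type(arg) for arg in args)
        if signature not in cache:
            cache[signature] = fake_compile(func, signature)
        return cache[signature](*args)

    return wrapper


@compile_on_call
def add(x, y):
    return x + y


add(1, 2)      # compiles for (int, int)
add(3, 4)      # reuses the cached (int, int) version
add(1.0, 2.0)  # compiles for (float, float)
print(compiled_signatures)
```

Only the two signatures that were actually called get compiled, which is where the saving over compiling every allowed signature up front would come from.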

- Python
Published by max-sixty about 1 year ago

https://github.com/numbagg/numbagg - 0.8.2

0.8.2 reduces numerical instability in moving (aka rolling) functions with very short windows, as well as slightly improving their performance.

- Python
Published by max-sixty over 1 year ago

https://github.com/numbagg/numbagg - 0.8.1

0.8.1 adds an experimental NUMBAGG_FASTMATH env var option (thanks @frazane) which increases performance in some routines at the cost of minor inaccuracy. Feel free to provide feedback in an issue if you find this helpful (or unhelpful!). There's also a change for numpy 2.0 compatibility (thanks @mathause), and some internal improvements.
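As a rough illustration of where the "minor inaccuracy" can come from: fast-math modes generally allow the compiler to reorder floating-point operations, and floating-point addition is not associative. The snippet below shows that effect in plain Python; it does not use numbagg, and the accepted values of NUMBAGG_FASTMATH are not shown here because the note above doesn't state them.

```python
# Floating-point addition is not associative, so a compiler that is allowed
# to regroup operations can change the last bits of a result.
left_to_right = (0.1 + 0.2) + 0.3
regrouped = 0.1 + (0.2 + 0.3)

print(left_to_right == regrouped)      # False
print(abs(left_to_right - regrouped))  # a difference near machine precision
```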

- Python
Published by max-sixty almost 2 years ago

https://github.com/numbagg/numbagg - 0.8.0

0.8.0 includes nanmedian, a wrapper of nanquantile with one quantile of 0.5.
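The relationship can be sketched in pure Python. This is a one-dimensional illustration rather than numbagg's code: it assumes linear interpolation between neighbouring values, and it includes the [0, 1] range check described for 0.7.2.

```python
import math
from typing import Sequence


def nanquantile(values: Sequence[float], q: float) -> float:
    """Quantile of the non-NaN values, using linear interpolation."""
    if not 0.0 <= q <= 1.0:
        raise ValueError(f"quantile must be in [0, 1], got {q}")
    valid = sorted(v for v in values if not math.isnan(v))
    if not valid:
        return math.nan
    position = q * (len(valid) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return valid[lower] + (valid[upper] - valid[lower]) * fraction


def nanmedian(values: Sequence[float]) -> float:
    """The median is the 0.5 quantile."""
    return nanquantile(values, 0.5)


data = [3.0, math.nan, 1.0, 4.0, 2.0]
print(nanmedian(data))  # 2.5
```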

- Python
Published by max-sixty about 2 years ago

https://github.com/numbagg/numbagg - 0.7.2

0.7.2 raises an error if quantile values outside [0, 1] are passed to nanquantile.

- Python
Published by max-sixty about 2 years ago

https://github.com/numbagg/numbagg - 0.7.1

0.7.1 removes a stray print statement from the code. Thanks to @mathause for raising and fixing the issue.

- Python
Published by max-sixty about 2 years ago

https://github.com/numbagg/numbagg - v0.7.0

0.7.0 adds a ddof argument to the std & var aggregation & grouping functions. Internally, there are lots of new benchmarks, which are more clearly presented in the Readme, and we've added some initial property tests.
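For readers unfamiliar with ddof ("delta degrees of freedom"), here is a minimal pure-Python sketch of what the argument controls. The function is illustrative only; the default of 1 follows the 0.5.0 note further down, whereas numpy defaults to 0.

```python
from typing import Sequence


def variance(values: Sequence[float], ddof: int = 1) -> float:
    """Sum of squared deviations from the mean, divided by (n - ddof)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return squared_deviations / (n - ddof)


data = [1.0, 2.0, 3.0, 4.0]
print(variance(data, ddof=0))  # 1.25 (population variance)
print(variance(data, ddof=1))  # sample variance, 5/3
```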

- Python
Published by max-sixty about 2 years ago

https://github.com/numbagg/numbagg - 0.6.8

0.6.8 contains mostly internal changes — the initial benchmarking approach is expanded to all functions and displayed in the new Readme. The same framework is now used to test all functions. We also ensure the functions don't emit warnings when handling expected inputs in our tests.

- Python
Published by max-sixty about 2 years ago

https://github.com/numbagg/numbagg - 0.6.7

0.6.7 removes the temporary patch for the int8 issues we experienced previously in grouping functions, replacing it with something more robust. Specifically, when there are a very large number of items in a group and labels has a very small dtype, labels is cast to a higher dtype.
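To show why a very small integer dtype can be fragile, and what "cast to a higher dtype" means, here is a stdlib-only sketch. It is not a diagnosis of the numbagg bug, and `wrap_int8` and `widen_labels` are hypothetical helpers.

```python
from array import array


def wrap_int8(value: int) -> int:
    """Reduce an integer to the signed 8-bit range, as two's-complement storage would."""
    return (value + 128) % 256 - 128


# An int8 holds -128..127, so a quantity that passes 127 wraps around.
print(wrap_int8(127 + 1))  # -128


def widen_labels(labels: array) -> array:
    """Hypothetical helper: copy int8 labels ('b') into int16 storage ('h')."""
    if labels.typecode == "b":
        return array("h", list(labels))
    return labels


labels = array("b", [0, 1, 1, 2])
wide = widen_labels(labels)
print(wide.typecode, list(wide))  # h [0, 1, 1, 2]
```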

- Python
Published by max-sixty about 2 years ago

https://github.com/numbagg/numbagg - 0.6.6

Following closely on the heels of 0.6.5, 0.6.6 works around another rare but serious bug with int8 types. We now coerce all int8 label arrays to int16.

Many thanks to @dcherian for the report.

- Python
Published by max-sixty about 2 years ago

https://github.com/numbagg/numbagg - 0.6.5

0.6.5 works around a rare but serious bug — when a labels array with int8 type is used in a group function, numbagg can return an incorrect result. The bug requires the array to be a specific size. The currently implemented solution is a workaround rather than an understanding of the underlying issue. Check out https://github.com/numbagg/numbagg/issues/211 for more details.

- Python
Published by max-sixty about 2 years ago

https://github.com/numbagg/numbagg - 0.6.4

0.6.4 fixes a small bug — the value for the window argument for rolling methods couldn't be equal to the axis length.
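The boundary case is easy to see in a small pure-Python moving mean. This sketch assumes a trailing window that needs to be full before it emits a value, and it ignores NaN handling; it is not numbagg's implementation.

```python
import math
from typing import List, Sequence


def move_mean(values: Sequence[float], window: int) -> List[float]:
    """Trailing-window mean; NaN until a full window is available."""
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and the axis length")
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(math.nan)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


data = [1.0, 2.0, 3.0, 4.0]
# A window equal to the axis length is valid: only the last position is filled.
print(move_mean(data, window=4))  # [nan, nan, nan, 2.5]
```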

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.6.3

Numbagg will now compile with mode="cpu" if it detects that it's being run in a ThreadPoolExecutor. Previously, the default mode="parallel" could cause numba to abort the Python program within that context.

Note that running in a multi-process context retains mode="parallel", so the new behavior should only be slower in infrequent cases, such as a local dask multi-threaded executor.

I'm not completely confident this is the globally optimal solution, so this may evolve. https://github.com/numba/numba/issues/9288 has more context.
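One simple way such a check could work is to ask whether the current thread is the main thread. This is a guess at the general shape of the heuristic, not numbagg's actual detection logic, and `choose_mode` is a hypothetical name.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def choose_mode() -> str:
    """Hypothetical heuristic: fall back to "cpu" when called off the main thread."""
    if threading.current_thread() is threading.main_thread():
        return "parallel"
    return "cpu"


# A pool worker is never the main thread, so the fallback is selected.
with ThreadPoolExecutor(max_workers=1) as pool:
    print(pool.submit(choose_mode).result())  # cpu
```

Separate worker processes each have their own main thread, which is consistent with the note above that multi-process contexts keep mode="parallel".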

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.6.2

0.6.2 allows grouping functions to take a wider range of int types as labels. Thanks to @dcherian for the contribution.

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.6.1

0.6.1:

  • Enables parallel mode in most functions. This radically improves performance in multi-core systems on multi-dimensional arrays (see benchmarks for details)
  • Allows passing an array of alphas in the moving_exp functions, which lets us decay values by different amounts
  • Improves nanquantile's compatibility with various axis values
  • Extends benchmarks to different shapes, adds bottleneck as a comparison
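To illustrate what an array of alphas does, here is a pure-Python sketch of an exponentially weighted mean in which each step has its own alpha. It assumes that step i decays all earlier weights by 1 - alphas[i] and it ignores NaNs; the function name is illustrative, not numbagg's API.

```python
from typing import List, Sequence


def move_exp_mean(values: Sequence[float], alphas: Sequence[float]) -> List[float]:
    """Exponentially weighted mean where step i decays history by 1 - alphas[i]."""
    out = []
    numer = denom = 0.0
    for value, alpha in zip(values, alphas):
        decay = 1.0 - alpha
        numer = numer * decay + value
        denom = denom * decay + 1.0
        out.append(numer / denom)
    return out


values = [1.0, 2.0, 3.0]
print(move_exp_mean(values, [0.5, 0.5, 0.5]))
print(move_exp_mean(values, [0.5, 0.5, 1.0]))  # alpha=1.0 at the last step forgets all history
```

A per-step alpha is useful, for example, when observations are unevenly spaced in time and a longer gap should decay the history more.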

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.6.0

  • Add ffill & bfill, at ~2.7x pandas' performance
  • Add standard moving window functions — move_corr, move_cov, move_std, move_sum, move_var, in addition to the existing move_mean. These have 3.5-20x pandas' performance.
  • New benchmarks using pytest-benchmark. This includes a script which makes a nice output which we've added to the readme. It currently only covers the moving and moving_exp functions.
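For reference, the semantics of ffill and bfill from the first bullet can be sketched in a few lines of pure Python. This one-dimensional version has no fill limit and is only an illustration of the behaviour, not of how numbagg achieves its speed.

```python
import math
from typing import List, Sequence


def ffill(values: Sequence[float]) -> List[float]:
    """Replace each NaN with the most recent non-NaN value, if there is one."""
    out = []
    last = math.nan
    for v in values:
        if not math.isnan(v):
            last = v
        out.append(last)
    return out


def bfill(values: Sequence[float]) -> List[float]:
    """Backward fill is a forward fill over the reversed sequence."""
    return ffill(values[::-1])[::-1]


data = [math.nan, 1.0, math.nan, math.nan, 4.0, math.nan]
print(ffill(data))  # [nan, 1.0, 1.0, 1.0, 4.0, 4.0]
print(bfill(data))  # [1.0, 1.0, 4.0, 4.0, 4.0, nan]
```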

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.5.1

  • Add a nanquantile function; approximately 4x faster than np.nanquantile when over 2 dimensions. It's slightly slower than np.quantile and pandas' .quantile
  • Ensure we don't produce inf values for some exponential moving functions. Numerical values remain unchanged.

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.5.0

  • Sets ddof=1 for std & var functions, mirroring the grouped & move_exp functions (but notably different from numpy)
  • Adds a move_exp_nancount function, for exponentially weighted moving counts
  • Adds nancount as an alias for count

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.4.5

0.4.5 fixes an issue with our new PyPI release workflow. 0.4.1-4 were not published to PyPI

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.4.4

0.4.4 fixes an issue with our new PyPI release workflow. 0.4.1-3 were not published to PyPI

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.4.3

0.4.3 fixes an issue with our new PyPI release workflow. 0.4.1 & 0.4.2 were not published to PyPI

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.4.2

0.4.2 fixes an issue with our new PyPI release workflow. 0.4.1 was not published to PyPI

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.4.1

0.4.1 fixes an issue with move_exp_nanstd not accepting an axis kwarg.

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.4.0

0.4.0 adds some more exponentially weighted functions:

  • move_exp_nanstd
  • move_exp_nanvar
  • move_exp_nancorr
  • move_exp_nancov

Because functions can now take more than one array, the signature of the moving exponential functions has changed slightly to require alpha to be a keyword argument. This is technically a breaking change, though most consumers will be passing alpha as a kwarg already (xarray included).
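The reason alpha has to become keyword-only once a function accepts several arrays can be shown with a toy signature. `move_exp_stat` is a hypothetical function used only to demonstrate the calling convention.

```python
def move_exp_stat(*arrays, alpha):
    """Hypothetical signature: any number of arrays, then keyword-only alpha."""
    return len(arrays), alpha


print(move_exp_stat([1.0, 2.0], [3.0, 4.0], alpha=0.5))  # (2, 0.5)

try:
    # Passed positionally, 0.5 is collected as another array and alpha is missing.
    move_exp_stat([1.0, 2.0], 0.5)
except TypeError as error:
    print(error)
```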

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.3.1

This release adds a min_weight parameter to the exponential moving functions, so that values are output only when there's a sufficient weight of recent valid values, similar to the min_count parameter of the simple moving functions.
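A pure-Python sketch of how such a threshold could behave is below. The definition of "weight" here is an assumption: each valid value contributes alpha, and the running total decays by 1 - alpha at every step. It is not taken from numbagg's source.

```python
import math
from typing import List, Sequence


def move_exp_nanmean(
    values: Sequence[float], *, alpha: float, min_weight: float = 0.0
) -> List[float]:
    """Exponentially weighted mean that skips NaNs and masks low-weight outputs."""
    out = []
    numer = denom = weight = 0.0
    decay = 1.0 - alpha
    for v in values:
        numer *= decay
        denom *= decay
        weight *= decay
        if not math.isnan(v):
            numer += v
            denom += 1.0
            weight += alpha  # assumption: each valid value adds alpha of weight
        enough = denom > 0 and weight >= min_weight
        out.append(numer / denom if enough else math.nan)
    return out


data = [1.0, math.nan, math.nan, 3.0]
print(move_exp_nanmean(data, alpha=0.5))
print(move_exp_nanmean(data, alpha=0.5, min_weight=0.5))
```

With min_weight=0.5, the two positions that follow only NaNs become NaN, because the weight of the single earlier value has decayed below the threshold; the output recovers once a new valid value arrives.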

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - 0.3.0

After a lengthy hiatus of development on numbagg, we're back with a big release:

  • Lots of new grouping functions, in an attempt to be an engine for flox, a library from @dcherian & others. The functions include:

    • group_nancount
    • group_nanargmax, group_nanargmin
    • group_nanfirst, group_nanlast
    • group_nansum_of_squares
    • group_nanprod
    • group_nanall, group_nanany
    • group_nanvar
    • group_nanstd
    • group_nanmax, group_nanmin
  • Lots of performance improvements to existing grouping functions

    • Initial benchmarking shows 2-5x the performance over pandas' equivalent functions (though mostly towards the lower end, and the benchmarks are not as robust as I'd like; feedback and verifications welcome).
  • Large test coverage expansion of grouping functions

  • Improvements to the exponentially weighted moving functions:

    • A new move_exp_nanvar function
    • Code simplification and modest performance improvements to existing functions
    • Benchmarks show 1-5x the performance of pandas' equivalent functions.
  • A modest performance gain to existing moving functions.

  • Internally, we've removed some of the original hacks that were initially required. Thanks to numba for supporting many of these natively!
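For readers new to the grouping functions listed above, the general shape of a grouped reduction can be sketched in pure Python: an array of values, an array of integer labels, and one output per label. This is an illustration only; `group_nansum` here is a toy function, and skipping negative labels is an assumption about how "no group" is marked.

```python
import math
from typing import List, Sequence


def group_nansum(values: Sequence[float], labels: Sequence[int]) -> List[float]:
    """Sum the non-NaN values for each integer label 0..max(labels)."""
    out = [0.0] * (max(labels) + 1)
    for value, label in zip(values, labels):
        if label < 0 or math.isnan(value):
            continue  # assumption: negative labels mark "no group"
        out[label] += value
    return out


values = [1.0, 2.0, math.nan, 4.0, 5.0]
labels = [0, 1, 1, 0, 2]
print(group_nansum(values, labels))  # [5.0, 2.0, 5.0]
```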

The documentation needs a pass: the Readme could be reorganized, and the benchmarks could be more systematically measured and reported. It's possible that these large changes have introduced small bugs, particularly around edge cases such as unfamiliar dtypes. That said, the main use cases are quite well-tested, and we have pandas & numpy to thank for excellent comparisons to test against.

Please report any issues or questions. I (@max-sixty) am excited numbagg is back, and will gauge how much to add based on the extent to which folks find it useful. And ofc thanks to @shoyer for writing the original library!

- Python
Published by max-sixty over 2 years ago

https://github.com/numbagg/numbagg - v0.2.2

Fixes embedded version number

- Python
Published by max-sixty about 3 years ago

https://github.com/numbagg/numbagg - v0.2.1

- Python
Published by max-sixty over 4 years ago

https://github.com/numbagg/numbagg - v0.2.0

- Python
Published by max-sixty almost 5 years ago