Recent Releases of dask

dask - 2025.7.0

Changes

  • CI: update actions location @bsipocz (#12019)
  • Enable column projection in MapPartitions @rjzamora (#11875)
  • Apply ruff/flake8-comprehensions rules (C4) @DimitriPapadopoulos (#12004)
  • Apply ruff/flake8-pie rules (PIE) @DimitriPapadopoulos (#12006)
  • Apply ruff/Pylint Error rules (PLE) @DimitriPapadopoulos (#12013)
  • Apply ruff/Pylint Convention rules (PLC) @DimitriPapadopoulos (#12012)
  • Apply ruff/flake8-pyi rules (PYI) @DimitriPapadopoulos (#12007)
  • Apply ruff/flake8-simplify rules (SIM) @DimitriPapadopoulos (#12008)
  • Apply ruff/Pylint Warning rules (PLW) @DimitriPapadopoulos (#12011)
  • Apply ruff/flake8-implicit-str-concat rules (ISC) @DimitriPapadopoulos (#12005)
  • Apply ruff/pycodestyle rule E714 @DimitriPapadopoulos (#12000)
  • Fix typos found by codespell @DimitriPapadopoulos (#12001)
  • Update PyPI URL for official nightly pyarrow repository @raulcd (#11996)
  • Fall-back to textual repr in case jinja2 is not installed @lukasbindreiter (#11987)
  • Prevent builtins.any from being shadowed in dask.array.reductions @m-albert (#11988)
  • Bump conda-incubator/setup-miniconda from 3.1.1 to 3.2.0 @dependabot[bot] (#11982)
  • Skip groupby cov test for pandas 3.x @TomAugspurger (#11977)
  • Account for __main__ in pickle normalization @jrbourbeau (#11970)
  • Fix upstream CI installation @jrbourbeau (#11976)
  • Make module name logic more resilient in Dispatch @jrbourbeau (#11974)

See the Changelog for more information.

- Python
Published by github-actions[bot] 11 months ago
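Each entry in these lists appears to follow the same shape: a title, an `@author` handle, and a pull request number in parentheses. The sketch below splits such a line into its parts. It assumes that shape holds for every entry; the names `Entry`, `ENTRY_RE`, and `parse_entry` are hypothetical helpers introduced for illustration and are not part of dask.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """One changelog bullet, split into its three visible parts."""
    title: str
    author: str
    pr: int


# Assumed line shape: "• <title> @<author> (#<number>)".
# The title is matched greedily, so a title that itself contains
# "(#1234)" (as in the "Revert ..." entries) stays intact.
ENTRY_RE = re.compile(
    r"^\s*•\s+(?P<title>.+)\s+@(?P<author>\S+)\s+\(#(?P<pr>\d+)\)\s*$"
)


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None if the line is not a bullet entry."""
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    return Entry(match["title"], match["author"], int(match["pr"]))


entry = parse_entry("  • Enable column projection in MapPartitions @rjzamora (#11875)")
footer = parse_entry("See the Changelog for more information.")
```

Lines that are not bullet entries, such as the per-release footer, return `None`, so the helper can be mapped over a whole release block and the results filtered.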

dask - 2025.5.1

Changes

  • Revert "Don't handle tuple in task_spec.parse_input" @fjetter (#11953)
  • Optimize slicing graph generation @fjetter (#11946)
  • Fix xarray slicing regression @fjetter (#11947)
  • Don't handle tuple in task_spec.parse_input @fjetter (#11948)

See the Changelog for more information.

- Python
Published by github-actions[bot] about 1 year ago

dask - 2025.5.0

Changes

  • Speed up slicing graph generation @fjetter (#11945)
  • Fixed Array.setitem when both the array and the indexer have unknown shape @TomAugspurger (#11943)
  • Optimize dask order for worst case of get_target @fjetter (#11935)
  • Raise on local executor if tasks are missing dependency @fjetter (#11944)
  • Fix to_dask_array for single partition @jrbourbeau (#11931)
  • Ensure parquet plan is fully cached during optimization @fjetter (#11933)
  • Better documentation for expression system @fjetter (#11915)
  • Simplify (and speed up) culling @fjetter (#11899)
  • Update pre-commit @fjetter (#11926)
  • Map_partitions again accepts delayed objects @fjetter (#11907)
  • Fix delayed parsing for futures @fjetter (#11917)
  • Don't run post setup-miniconda step in CI @jrbourbeau (#11925)
  • Try to pin pip for readthedocs @fjetter (#11923)
  • Fix windows CI @fjetter (#11919)

See the Changelog for more information.

- Python
Published by github-actions[bot] about 1 year ago

dask - 2025.4.1

Changes

  • Ensure only HLGs are prohibited from reuse @fjetter (#11906)
  • Ensure xarray objects can continue sharing dependencies @fjetter (#11904)
  • Ensure culling changes layer names @fjetter (#11903)
  • Ensure FusedIO does not break Blockwise alignment Assign @fjetter (#11898)
  • Implement ufuncs and gufunc for array-expr @phofl (#11818)
  • Implement map_overlap for array-expr @phofl (#11822)

See the Changelog for more information.

- Python
Published by github-actions[bot] about 1 year ago

dask - 2025.4.0

Changes

  • Ensure Future value is in da.from_delayed task graph @TomAugspurger (#11896)
  • Fix annotations passed to delayed @fjetter (#11893)
  • migrate delayed unpack_collections @fjetter (#11881)
  • Remove Pub/Sub references from docs @jrbourbeau (#11891)
  • Ensure only classes without custom init are singletons @fjetter (#11886)
  • Remove custom initializers for delayed expressions @fjetter (#11888)
  • Fix persisting multiple DFs at the same time @fjetter (#11887)
  • Avoid always parsing list inputs to DataFrame.isin as object type numpy arrays @mroeschke (#11869)
  • Unskip pandas-dev cov/corr tests @TomAugspurger (#11873)
  • Hlg blockwise fix @fjetter (#11871)
  • Ensure annotations for HLG objects are properly generated @fjetter (#11866)
  • Factor out singleton logic from base Expr class @fjetter (#11868)
  • Ensure HLGs are using dependencies properly in optimization @fjetter (#11859)
  • Ensure dictionaries tokenize deterministically @fjetter (#11867)
  • Ensure default dask scheduler only compute what's needed @fjetter (#11861)
  • Faster tokenization of pd.RangeIndex @fjetter (#11863)
  • Update link to Quansight in community doc @pavithraes (#11860)
  • Relax tolerance in autocorr test @TomAugspurger (#11857)
  • Use map_blocks in array.store to avoid materialization and dropping of annotations @fjetter (#11844)
  • Ensure repartition does not trigger memory size computation during lowering (i.e. on the scheduler) @fjetter (#11855)
  • Support args and kwargs for rolling aggregations @fjetter (#11856)
  • Remove nightly h5py from upstream CI job @jrbourbeau (#11847)
  • Ensure HLGExpr tokenize uniquely @fjetter (#11849)
  • Do not inject median in describe for pandas 3 @fjetter (#11846)
  • Fixed Expr.__setattr__ for subclasses @TomAugspurger (#11845)
  • Wrap HLGs in an Expr to avoid Client side materialization @fjetter (#11736)

See the Changelog for more information.

- Python
Published by github-actions[bot] about 1 year ago
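Several entries in the 2025.4.0 list concern tokenization, including one about dictionaries tokenizing deterministically. The sketch below shows what that property looks like from the user's side. It assumes dask is installed and that `tokenize` is importable from `dask.base`; reading the entry as "insertion order should not affect the token" is an interpretation of the title, not a statement taken from the pull request.

```python
from dask.base import tokenize

# Two dictionaries with the same content but different insertion order.
a = {"x": 1, "y": [1, 2, 3]}
b = {"y": [1, 2, 3], "x": 1}

token_a = tokenize(a)
token_b = tokenize(b)

# Equal content is expected to give equal tokens ...
same_token = token_a == token_b

# ... while a changed value is expected to give a different token.
changed = tokenize({"x": 2, "y": [1, 2, 3]}) != token_a
```

Tokens serve as cache and graph keys, so stable tokens for equal inputs help dask recognise repeated work instead of recomputing it.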

dask - 2025.3.0

Changes

  • Fix dataset info cache assignment @fjetter (#11840)
  • Expr setattr @fjetter (#11836)
  • Follow up to expression tokenization caching @fjetter (#11837)
  • Consolidate getattr for expr classes @fjetter (#11835)
  • Reduce pickle size of ReadParquet expression @fjetter (#11797)
  • arange loses precision on ~2**63 @crusaderky (#11801)
  • Remove numbagg from upstream build @phofl (#11821)
  • Dispatch to numbagg for nanmedian and nanquantile @phofl (#11817)
  • Make missing meta warning more ergonomic @phofl (#11814)
  • Remove name doc from from_pandas @phofl (#11812)
  • Implement an Array Scalar @phofl (#11810)
  • Added to_orc to DataFrame API @TomAugspurger (#11807)
  • Implement reverse indexing for DataFrames @phofl (#11803)
  • Add lazy to_pandas_dispatch registration for cudf @rjzamora (#11799)
  • Fix missing imports in array-expr @fjetter (#11796)
  • Cache tokens on expressions and restore after pickle roundtrip @fjetter (#11791)
  • Use random dashboard ports for LocalCluster in distributed tests @fjetter (#11795)
  • Implement slicing for array-expr @phofl (#11783)
  • Never use an asynchronous Client when calling top level compute function @fjetter (#11790)
  • Refactor import tests @fjetter (#11794)
  • Migrate base.unpack_collections to Task class @fjetter (#11793)
  • Ensure map_blocks generates unique tokens @fjetter (#11792)
  • Speed up normalize_pickle by 50 percent @fjetter (#11788)
  • Fix divisions calculation with duplicates @phofl (#11787)
  • Fix assign align for duplicated divisions @phofl (#11786)
  • Ensure concat optimize project does not raise @fjetter (#11784)
  • Add array-expr from_array @phofl (#11772)
  • Keep chunksizes consistent in apply_gufunc @phofl (#11683)
  • Test dask.dataframe.__all__ @flying-sheep (#11782)
  • Add __all__ to dask.bag @flying-sheep (#11781)
  • add test for dask.array.__all__ @flying-sheep (#11780)
  • Bump JamesIves/github-pages-deploy-action from 4.7.2 to 4.7.3 @dependabot[bot] (#11777)
  • Export dask.array members @flying-sheep (#11779)
  • Fix sorted_divisions_locations with duplicates @TomAugspurger (#11773)
  • Small typo in best-practices.rst @SCORE1387 (#11775)
  • Allow unknown chunks in blockwise adjust_chunks @lgray (#11769)
  • Fix crash in asarray(..., like=...) vs. scipy.sparse objects @crusaderky (#11755)
  • Remove flaky optional dependency @TomAugspurger (#11771)
  • Add support for scipy sparray @flying-sheep (#11750)
  • Added flaky to tests extra @TomAugspurger (#11770)
  • Ensure divisions are plain scalars @TomAugspurger (#11767)
  • Remove divisions code duplication @fjetter (#11764)
  • Ensure divisions not diverging from npartitions in Merge @fjetter (#11762)
  • skip test_visualize_int_overflow on windows @fjetter (#11761)
  • Reduce pickle size for tasks @fjetter (#11687)
  • Implement unify_chunks and Rechunk @phofl (#11692)
  • Fix expression getitem to avoid alignment @phofl (#11760)
  • arange(..., like=x) embeds the graph of x @crusaderky (#11754)
  • Simplify assert_divisions @fjetter (#11745)
  • Fix Projection logic for Series objects @phofl (#11747)
  • Remove bytes as keys @fjetter (#11757)
  • Ensure map_partitions returns Series object if function returns scalar @fjetter (#11756)
  • Don't upload env twice @phofl (#11748)

See the Changelog for more information.

- Python
Published by github-actions[bot] about 1 year ago

dask - 2025.2.0

Changes

  • Add big array example @jrbourbeau (#11744)
  • Fix exploding chunksizes in pad for constant padding @phofl (#11743)
  • Move optimize method to base class @fjetter (#11742)
  • Add changelog entry for fixed deadlock @hendrikmakait (#11741)
  • Fix graph creation in dask-expr to_delayed @phofl (#11739)
  • Remove culling from delayed optimisation @phofl (#11737)
  • Compute meta for from_map on the cluster @phofl (#11738)
  • Bugs in __setitem__ with dask bool mask @crusaderky (#11728)
  • Implement infrastructure, random, blockwise and Elemwise @phofl (#11689)
  • array/asarray with both like= and dtype= @crusaderky (#11733)
  • Fix annotations warnings test @phofl (#11734)
  • Catch warnings when writing to remote storage with to_parquet @phofl (#11731)
  • Remove LocalCluster from tests @phofl (#11729)
  • Fix partition pruning when using from_array @phofl (#11725)
  • Fix concatenation with mixed dtype columns @phofl (#11727)
  • arange: fix extreme values @crusaderky (#11707)
  • Graph corruption on scalar getitem->setitem @crusaderky (#11723)
  • array: Never share buffers after compute() @crusaderky (#11697)
  • Extract Dask Array from xarray DataArray in from_array @phofl (#11712)
  • arange: support kwargs @crusaderky (#11710)
  • Ensure normalize_token is threadsafe @fjetter (#11709)
  • Expand advice for instance types and processes @fjetter (#11705)
  • Drop legacy timeseries implementation @fjetter (#11704)
  • Update Dask Cloud Provider documentation to include Nebius as a supported cloud option @SalikovAlex (#11703)
  • Fix normalize_chunks when squashing into a single chunk @phofl (#11702)
  • Fix positional indexing with newaxis @phofl (#11699)
  • Set array backend in scipy-sparse-indexing @TomAugspurger (#11700)
  • Fix value_counts shuffling strategy @phofl (#11698)
  • Disentangle core expression class from dataframe specific code @phofl (#11688)
  • Bump conda-incubator/setup-miniconda from 3.1.0 to 3.1.1 @dependabot[bot] (#11685)
  • Fixup dataframe conversion from array methods @phofl (#11684)
  • Remove remaining artifacts of fastparquet @phofl (#11682)
  • Updated docs on local file system @TomAugspurger (#11677)
  • Expose TaskSpec objects for downstream projects @phofl (#11675)
  • Rename optimize_slices function @phofl (#11673)
  • Fixup changelog entry @phofl (#11674)
  • Add changelog for dataframe removal @phofl (#11654)
  • Pass read_only properly to zarr stores @phofl (#11668)
  • Revert "Revert "Add scikit-image nightly back to upstream CI"" @phofl (#11667)
  • Avoid Dict in fused tasks @hendrikmakait (#11657)
  • Fix filtering on parquet file containing a struct column @rjzamora (#11665)
  • Fix merge asof simplify after lowering @phofl (#11658)
  • Add redirects for groupby docs after dask-expr merge @phofl (#11661)
  • Rename remote store to FsspecStore @phofl (#11660)
  • Reintroduce slice fusion @phofl (#11638)
  • Add __all__ to init @phofl (#11664)
  • Expose downstream utilities for dask.dataframe @phofl (#11662)
  • Remove IO wrapper functions @phofl (#11649)
  • Reroute source of docs for dataframe methods @phofl (#11645)
  • Fix projection when columns are numpy scalars @rjzamora (#11656)
  • Let vindex accept a Dask Array indexer under certain conditions @phofl (#11635)
  • Simplify _execute_subgraph @hendrikmakait (#11655)
  • Rename data_producer and add flag to dataframe io stuff @phofl (#11653)
  • Remove subgraph callable @fjetter (#11575)
  • Add cached version for normalize_chunks @phofl (#11650)
  • Fixed mypy config @TomAugspurger (#11651)
  • Fixup pickle size test @phofl (#11647)
  • Remove unnecessary compat code @phofl (#11644)
  • Remove pyarrow installation by default in imports check @phofl (#11646)
  • Add data-producer-task property to replace rootish detection mechanism @phofl (#11558)
  • Add cupy support for indexed assignment @rjzamora (#11421)
  • Merge dask-expr repository into dask @phofl (#11623)
  • Avoid rechunking 1D-arrays in cumreduction @Illviljan (#11446)
  • Ensure that alias key is not a TaskRef object @phofl (#11639)
  • Avoid Tuple in Dict @hendrikmakait (#11634)
  • Migrate vindex to TaskSpec @phofl (#11633)
  • Reduce graph size for vindex @phofl (#11632)
  • Fix Array binary operator priority delegation @j2bbayle (#11611)
  • Fix auto-rechunking in einsum @dcherian (#11628)
  • Optimize vindex @dcherian (#11625)
  • Fix example Actors @isidroas (#11624)
  • Avoid concatenate3 in array slicing @hendrikmakait (#11631)
  • Avoid concatenate3 in overlap and rechunking graphs @hendrikmakait (#11621)
  • Avoid using concrete in task graph @hendrikmakait (#11620)
  • Avoid producing chunks of size 0 when using dask.array.rechunk with chunks='auto' @schlunma (#11622)
  • Clean up tests after legacy removal @phofl (#11617)
  • Remove legacy DataFrame implementation @phofl (#11606)
  • Revert "Add scikit-image nightly back to upstream CI" @phofl (#11616)
  • Fix increased memory usage when converting xarray to dataframe @phofl (#11609)
  • Avoid overflowing when downcasting shuffle arrays @phofl (#11615)

See the Changelog for more information.

- Python
Published by github-actions[bot] over 1 year ago

dask - 2024.12.1

Changes

  • Fix map_overlap bug where rechunking and trim=False caused inconsistent chunkings @phofl (#11605)
  • Avoid reference to bound method in NestedContainer @hendrikmakait (#11608)
  • Avoid constructing NestedContainers in case of trivial inputs @hendrikmakait (#11600)
  • Avoid legacy implementation in read-csv @phofl (#11603)
  • Remove legacy DataFrame import @phofl (#11604)
  • asarray ignores dtype for array inputs @crusaderky (#11586)
  • Add back LLM chatbot to Dask docs @dchudz (#11594)
  • Avoid creating trivial DataNodes in graph conversion @hendrikmakait (#11598)
  • Don't wrap keys in TaskRef in Alias @hendrikmakait (#11597)
  • Bump JamesIves/github-pages-deploy-action from 4.6.9 to 4.7.2 @dependabot (#11593)
  • Migrate dask array creation routines to task spec @jrbourbeau (#11582)
  • Migrate most of dask array random to task spec @jrbourbeau (#11581)
  • Do not use local function in array.push @fjetter (#11576)

See the Changelog for more information.

- Python
Published by github-actions[bot] over 1 year ago

dask - 2024.12.0

Changes

  • Revert "Add LLM chatbot to Dask docs (#11556)" @dchudz (#11577)
  • Automatically rechunk if array in to_zarr has irregular chunks @phofl (#11553)
  • Blockwise uses Task class @fjetter (#11568)
  • Migrate rechunk and reshape to task spec @phofl (#11555)
  • Cache svg-representation for arrays @dcherian (#11560)
  • Fix empty input for containers @fjetter (#11571)
  • Convert Bag graphs to TaskSpec graphs during optimization @fjetter (#11569)
  • add LLM chatbot to Dask docs @dchudz (#11556)
  • Add support for Python 3.13 @phofl (#11456)
  • Fuse data nodes in linear fusion too @phofl (#11549)
  • Migrate slicing code to task spec @phofl (#11548)
  • Speed up ArraySliceDep tokenization @phofl (#11551)
  • Fix fusing of p2p barrier tasks @phofl (#11543)
  • Remove infra/mentions of GPU CI @charlesbluca (#11546)
  • Temporarily disable gpuCI update CI job @jrbourbeau (#11545)
  • Use BlockwiseDep to implement map_blocks keywords @phofl (#11542)
  • Remove optimize_slices @phofl (#11538)
  • Make reshape_blockwise a noop if shape is the same @phofl (#11541)
  • Remove read-only flag from open_array in open_zarr @phofl (#11539)
  • Implement linear_fusion for task spec class @phofl (#11525)
  • Remove recursion from TaskSpec @fjetter (#11477)
  • Fixup test after dask-expr change @phofl (#11536)
  • Bump codecov/codecov-action from 3 to 5 @dependabot (#11532)
  • Create dask-expr frame directly without roundtripping @phofl (#11529)
  • Add scikit-image nightly back to upstream CI @jrbourbeau (#11530)
  • Remove from_dask_dataframe import @phofl (#11528)
  • Ensure that from_array creates a copy @phofl (#11524)
  • Simplify and improve performance of normalize chunks @phofl (#11521)
  • Fix flaky nanquantile test @phofl (#11518)
  • Fix tests for new read_only kwarg in zarr=3 @phofl (#11516)

See the Changelog for more information.

- Python
Published by github-actions[bot] over 1 year ago

dask - 2024.11.2

Changes

  • Remove only_refs parsing option for TaskSpec @fjetter (#11511)
  • Fix upstream ci pandas Series repr error @phofl (#11514)
  • Implement nanpercentile for dask arrays @phofl (#11505)
  • Bump JamesIves/github-pages-deploy-action from 4.6.8 to 4.6.9 @dependabot (#11512)
  • Add fuse method for TaskSpec @fjetter (#11509)

See the Changelog for more information.

- Python
Published by github-actions[bot] over 1 year ago

dask - 2024.11.1

Changes

  • Ensure subgraphs release intermediate results @fjetter (#11510)
  • Implement faster tokenize for np dtype @phofl (#11508)
  • Add quantile to array compat doc @phofl (#11504)

See the Changelog for more information.

- Python
Published by github-actions[bot] over 1 year ago

dask - 2024.11.0

Changes

  • Add changelog for Dask release @phofl (#11502)
  • Minor updates to optional dependencies table @jrbourbeau (#11503)
  • Add push for ffill like operations @phofl (#11501)
  • Remove func packing for TaskSpec @fjetter (#11496)
  • Make tokenization for vindex more efficient @phofl (#11493)
  • Cut down runtime of einstein summation test @phofl (#11499)
  • Improve test runtime for test_rot90 @fjetter (#11498)
  • Disable low level optimization for TaskSpec in Bags @fjetter (#11495)
  • Add automatic rechunking to sliding-window-view @phofl (#11479)
  • Add load_stored kwarg to dask.array.store. @dcherian (#11465)
  • Fix quantile error in two dimensions @phofl (#11489)
  • Bump conda-incubator/setup-miniconda from 3.0.4 to 3.1.0 @dependabot (#11490)
  • Update map_blocks docstring @phofl (#11491)
  • Fix einsum with empty arrays @phofl (#11488)
  • Implement non gil-blocking quantile method @phofl (#11473)
  • Use internal keyword for trimming in map_overlap to reduce graph size @phofl (#11486)
  • minor dask order refactor @fjetter (#11467)
  • Remove empty tasks from map_overlap @phofl (#11483)
  • Fixup auto chunks calculation if single chunk goes below 1 @phofl (#11485)
  • Fix CI after pandas upstream changes @phofl (#11482)
  • Make sure that block_id and block_info don't create extra tasks @phofl (#11484)
  • Use repeat to build nearest boundary @j2bbayle (#9666)
  • Remove dead code from make_blockwise @fjetter (#11478)
  • Patch auto-chunks calculation for rioxarray @phofl (#11480)
  • Skip legacy test because of flaky warning @phofl (#11475)
  • Unskip a few dask-expr tests @phofl (#11474)
  • Keep chunk sizes consistent in einsum @phofl (#11464)
  • Improve how normalize_chunks squashes together chunks when "auto" is set @phofl (#11468)
  • Fix resolve_aliases when multiple aliases are in graph @phofl (#11469)
  • Avoid cyclic import in dask.array @hendrikmakait (#11472)
  • Unskip dataframe test @phofl (#11471)
  • Improve dask.order performance for large graphs @fjetter (#11466)
  • Ensure that slice(None) just maps the keys @phofl (#11450)
  • Fix Task.__repr__() of unpickled object @pentschev (#11463)
  • Use TaskSpec in local dask execution @fjetter (#11378)
  • Adjust accuracy in test_solve_triangular_vector @fjetter (#11461)
  • Update Aggregation docstring to better reflect the input argument of … @guillaumeeb (#11459)
  • Implement fuse option for delayed objects @phofl (#11441)
  • Deprecate legacy dask dataframe implementation @phofl (#11437)
  • Fix na casting behavior for groupby.agg with arrow dtypes @phofl (#11118)
  • Fix behavior of keys_in_tasks for TaskSpec nodes @fjetter (#11445)
  • Convert dtype to int instead of np.uint8 for visualising large task graphs @phofl (#11440)
  • TaskSpec: Ensure dependencies are not mutated @fjetter (#11438)
  • Full support for task spec in dask.order @fjetter (#11347)

See the Changelog for more information.

- Python
Published by github-actions[bot] over 1 year ago

dask - 2024.10.0

Changes

  • Ensure broadcast_shapes() returns integers, not NumPy scalars. @trexfeathers (#11434)
  • (fix): sparse indexing @ilan-gold (#11430)
  • Task Spec: Ensure arrays are allowed as arguments @fjetter (#11432)
  • Ensure that recursively calling tokenize respects ensure_deterministic @fjetter (#11431)
  • Task spec: ensure kwargs can have dependencies @fjetter (#11429)
  • Explicitly list setuptools as a build dependency in conda recipe @charlesbluca (#11427)
  • Zarr-Python 3 compatibility @jhamman (#11388)
  • Avoid exponentially increasing taskgraph in overlap @phofl (#11423)
  • Unxfail fixed test @phofl (#11424)
  • Ensure numba tokenization does not use slow pickle path @fjetter (#11419)
  • Tasks - Remove sequence dict classes @fjetter (#11377)
  • Bump JamesIves/github-pages-deploy-action from 4.6.4 to 4.6.8 @dependabot (#11408)
  • Switch from mambaforge to miniforge in CI @jrbourbeau (#11409)

See the Changelog for more information.

- Python
Published by github-actions[bot] over 1 year ago

dask - 2024.9.1

Changes

  • Improve error message for incorrect columns order in meta information @dbalabka (#11393)
  • Update gpuCI RAPIDS_VER to 24.12 @github-actions (#11407)
  • Bump jacobtomlinson/gha-anaconda-package-version from 0.1.3 to 0.1.4 @dependabot (#11405)
  • Switch to using zarr.open_array instead of using the zarr.Array constructor @jhamman (#11387)

See the Changelog for more information.

- Python
Published by github-actions[bot] over 1 year ago

dask - 2024.9.0

Changes

  • Revert "Improve normalize_chunks calculation for "auto" setting" @jrbourbeau (#11385)
  • Bump peter-evans/create-pull-request from 6 to 7 @dependabot (#11380)
  • Add a Task class to replace tuples for task specification @fjetter (#11248)
  • Reduce overhead in tokenize @fjetter (#11373)
  • Improve normalize_chunks calculation for "auto" setting @phofl (#11354)
  • Bump bokeh minimum version to 3.1.0 @jrbourbeau (#11375)
  • Move tokenize to dedicated submodule @fjetter (#11371)
  • Ensure process_runnables is not too eager in the presence of multiple splits @fjetter (#11367)
  • Use np.min_scalar_type in shuffle @jrbourbeau (#11369)
  • Write indexing arrays into dask graph to reduce size for multiple xarray variables @phofl (#11362)
  • Cast indexer to minimal dtype in shuffle @phofl (#11364)
  • Reduce memory usage of dask.order @fjetter (#11361)
  • Bump JamesIves/github-pages-deploy-action from 4.6.3 to 4.6.4 @dependabot (#11366)
  • precommit autoupdate @fjetter (#11360)

See the Changelog for more information.

- Python
Published by github-actions[bot] over 1 year ago

dask - 2024.8.2

Changes

  • Release 2024.8.2 @jrbourbeau (#11359)
  • Add changelog entries for shuffle, vindex and blockwise_reshape @phofl (#11350)
  • Ensure persisted collections are released without GC @fjetter (#11348)
  • Update zoom link for dask meeting @scharlottej13 (#11357)
  • Add more docstring examples for normalize_chunks @Illviljan (#11271)
  • Choose automatically between tasks-based and p2p rechunking @hendrikmakait (#11337)
  • Implement blockwise reshape @phofl (#11328)
  • Make rechunking in shuffle more intelligent to distribute unevenly if necessary @phofl (#11326)
  • Increase visibility of GPU CI updates @charlesbluca (#11345)
  • Update numpy and pyarrow versions in install docs @jrbourbeau (#11340)
  • Fixup dask and distributed dependencies @phofl (#11338)
  • Bump numpy>=1.24 and pyarrow>=14.0.1 minimum versions @jrbourbeau (#11331)
  • Add crick back to Python 3.11+ CI builds @jrbourbeau (#11335)
  • Preserve chunksizes in vindex @phofl (#11330)
  • Fix dask.array.fft mismatch with Numpy's interface (add support for norm argument) @joanrue (#10665)
  • Pass additional parameters to rechunk_p2p @hendrikmakait (#11319)
  • Fix docstring formatting for map_overlap @Tao-VanJS (#11332)
  • Fix NumPy overflowing for prod on 2.0 @phofl (#11327)
  • tensordot: ensure axes are positive / add tests for negative axes @joanrue (#10812)
  • Fix map_overlap with new_axis @dstansby (#11128)

See the Changelog for more information.

- Python
Published by github-actions[bot] almost 2 years ago

dask - 2024.8.1

Changes

  • Ensure pickle does not change tokens @fjetter (#11320)
  • Add changelog entry for reshape and ordering improvements @phofl (#11324)
  • Rename chunksize-tolerance option @phofl (#11317)
  • Upgrade gpuCI and fix Dask Array failures with "cupy" backend @rjzamora (#11309)
  • Implement automatic rechunking for shuffle @phofl (#11311)
  • Ensure we test against numpy 2 in CI @jrbourbeau (#11182)
  • Revert "Test ordering on distributed scheduler (#11310)" @fjetter (#11321)
  • Test ordering on distributed scheduler @fjetter (#11310)
  • Add tests to cover more cases of new reshape implementation @phofl (#11313)
  • order: Choose better target for branches with multiple leaf nodes @phofl (#11303)
  • order: ensure runnable tasks are certainly runnable @fjetter (#11305)
  • Fix upstream numpy build @phofl (#11304)
  • Make shuffle a no-op if possible @phofl (#11291)
  • Keep chunksize consistent in reshape @phofl (#11273)
  • Enable slicing with only one unknown chunk @phofl (#11301)
  • Link to dask vs spark benchmarks on dask docs @scharlottej13 (#11289)
  • Fix slicing for masked arrays @phofl (#11300)
  • array: fix asarray for array input with dtype @lucascolley (#11288)
  • array: add constants @lucascolley (#11287)
  • Ignore typing of return value @phofl (#11286)
  • Remove automatic resizing in reshape @phofl (#11269)
  • API: expose np dtypes in dask.array namespace @lucascolley (#11178)
  • Drop support for Python 3.9 @phofl (#11245)

See the Changelog for more information.

- Python
Published by github-actions[bot] almost 2 years ago

dask - 2024.8.0

Changes

  • Add changelog for dask order patch @phofl (#11278)
  • order: add regression test for xarray map reduce @fjetter (#11277)
  • Add changelog entry for take @phofl (#11274)
  • Revert "order: remove data task graph normalization" @phofl (#11276)
  • Use the shuffle algorithm for take @phofl (#11267)
  • Implement task-based array shuffle @phofl (#11262)
  • order: remove data task graph normalization @fjetter (#11263)
  • Update zoom link for monthly meeting @scharlottej13 (#11265)
  • Update data loading section of best practices @phofl (#11247)
  • Match default chunksize in docstring to actual default set in code @SwamyDev (#11254)
  • Fixup casting error in pandas 3 @phofl (#11250)
  • Skip new warning from pandas @phofl (#11249)
  • Fix pandas nightly bugs @phofl (#11244)

See the Changelog for more information.

- Python
Published by github-actions[bot] almost 2 years ago

dask - 2024.7.1

Changes

  • Remove and warn of persist usage @phofl (#11237)
  • Preserve timestamp unit during meta creation @phofl (#11233)
  • Ensure that dask expr DataFrames are optimized when put into delayed @phofl (#11231)
  • Fixes for d freq deprecation in pandas=3 @jrbourbeau (#11228)
  • bump approx threshold for test_quantile @fjetter (#10720)
  • Bump xarray-contrib/issue-from-pytest-log from 1.2.8 to 1.3.0 @dependabot (#11221)
  • Bump JamesIves/github-pages-deploy-action from 4.6.1 to 4.6.3 @dependabot (#11222)

See the Changelog for more information.

- Python
Published by github-actions[bot] almost 2 years ago

dask - 2024.7.0

Changes

  • Only count data that is in memory for xarray sizeof @fjetter (#11206)
  • Fix botocore re-raising error @phofl (#11209)
  • Update Coiled links in documentation @scharlottej13 (#11211)
  • Add some array-expr methods @phofl (#11210)
  • Fix quantile for arrow dtypes @phofl (#11202)
  • Add utility to verify optional dependencies @phofl (#11205)
  • Implement array expression switch @phofl (#11203)
  • Drop support for pandas<2 @phofl (#11199)
  • Remove no longer supported ipython reference @phofl (#11196)
  • Remove from_delayed references @phofl (#11195)
  • Add other IO connectors to docs @phofl (#11189)

See the Changelog for more information.

- Python
Published by github-actions[bot] almost 2 years ago

dask - 2024.6.2

Changes

  • Get docs build passing @jrbourbeau (#11184)

See the Changelog for more information.

- Python
Published by github-actions[bot] almost 2 years ago

dask - 2024.6.1

Changes

  • Cache global query-planning config @rjzamora (#11183)
  • Python 3.13 fixes @AdamWill (#11185)
  • Fix test_map_freq_to_period_start for pandas=3 @jrbourbeau (#11181)

See the Changelog for more information.

- Python
Published by github-actions[bot] almost 2 years ago

dask - 2024.6.0

Changes

  • Fix test_dt_accessor with query planning disabled @jrbourbeau (#11177)
  • Use packaging.version.Version @jrbourbeau (#11171)
  • Remove deprecated dask.compatibility module @jrbourbeau (#11172)
  • Ensure compatibility for xarray.NamedArray @hendrikmakait (#11168)
  • Estimate sizes of xarray collections @fjetter (#11166)
  • Add section about futures and variables @fjetter (#11164)
  • Update docs for combined dask community meeting info @scharlottej13 (#11159)
  • Ensure tokenization of memmap doesn't materialize array in memory @fjetter (#11161)

See the Changelog for more information.

- Python
Published by github-actions[bot] almost 2 years ago

dask - 2024.5.2

Changes

  • Fix nightly Zarr installation in CI @jrbourbeau (#11151)
  • Add python 3.11 build to GPU CI @charlesbluca (#11135)
  • Update gpuCI RAPIDS_VER to 24.08 @github-actions (#11141)
  • Update test_groupby_grouper_dispatch @rjzamora (#11144)
  • Bump JamesIves/github-pages-deploy-action from 4.6.0 to 4.6.1 @dependabot (#11136)
  • Unskip test_array_function_sparse with new sparse release @jrbourbeau (#11139)
  • Fix test_parse_dates_multi_column on pandas=3 @jrbourbeau (#11132)
  • Don't draft release notes for tagged commits @jacobtomlinson (#11138)

See the Changelog for more information.

- Python
Published by github-actions[bot] about 2 years ago

dask - 2024.5.1

Changes

  • Minor updates to ML page @jrbourbeau (#11129)
  • Skip failing sparse test on 0.15.2 @jrbourbeau (#11131)
  • Make sure nightly pyarrow is installed in upstream CI build @jrbourbeau (#11121)
  • Add initial draft of ML overview document @mrocklin (#11114)
  • Allow non-memory zarr stores in to_zarr with distributed @GFleishman (#10422)
  • Test query-planning in gpuCI @rjzamora (#11060)
  • Avoid pytest error when skipping NumPy 2.0 tests @jrbourbeau (#11110)
  • Use nightly h5py in upstream CI build @jrbourbeau (#11108)
  • Use nightly scikit-image in upstream CI build @jrbourbeau (#11107)
  • meshgrid and atleast_*d NumPy 2 updates @jrbourbeau (#11106)
  • Bump actions/checkout from 4.1.4 to 4.1.5 @dependabot (#11105)
  • Enable parquet append tests after fix @phofl (#11104)
  • Skip fastparquet tests for numpy 2 @phofl (#11103)
  • Fix misspelling found by codespell @DimitriPapadopoulos (#11097)
  • Fix doc build @phofl (#11099)
  • Clean up percentiles_summary logic @rjzamora (#11094)
  • Apply ruff/flake8-implicit-str-concat rule ISC001 @DimitriPapadopoulos (#11098)
  • update broadcast array for numpy 2 @quasiben (#11096)

See the Changelog for more information.

- Python
Published by github-actions[bot] about 2 years ago

dask - 2024.5.0

Changes

  • DOC: intersphinx, don't link to click dev version. @Carreau (#11091)
  • Fix API doc links for some dask-expr expressions @phofl (#11092)
  • Add dask-expr to upstream build @phofl (#11086)
  • Add melt support when query-planning is enabled @rjzamora (#11088)
  • skip dataframe/product when in numpy 2 envs @quasiben (#11089)
  • Add plots to illustrate what the optimizer does @phofl (#11072)
  • Fixup pandas upstream tests @phofl (#11085)
  • Bump conda-incubator/setup-miniconda from 3.0.3 to 3.0.4 @dependabot (#11084)
  • Bump actions/checkout from 4.1.3 to 4.1.4 @dependabot (#11083)
  • Fix ci after pytest changes @phofl (#11082)
  • Fixup tests for more efficient dask-expr implementation @phofl (#11071)
  • Generalize clear_known_categories utility @rjzamora (#11059)
  • Bump JamesIves/github-pages-deploy-action from 4.5.0 to 4.6.0 @dependabot (#11062)
  • Bump release-drafter/release-drafter from 5 to 6 @dependabot (#11063)
  • Bump actions/checkout from 4.1.2 to 4.1.3 @dependabot (#11061)
  • Update GPU CI RAPIDS_VER to 24.06, disable query planning @charlesbluca (#11045)

See the Changelog for more information.

- Python
Published by github-actions[bot] about 2 years ago

dask - 2024.4.2

Changes

  • Add GitHub Releases automation @jacobtomlinson (#11057)
  • Add changelog entries for new release @phofl (#11058)
  • Reinstate try/except block in _bind_property @wence- (#11049)
  • Fix link for query planning docs @phofl (#11054)
  • Add config parameter for parquet file size @phofl (#11052)
  • doc: Update percentile docstring @bzah (#11053)
  • Add docs for query optimizer @phofl (#11043)
  • Assignment of np.ma.masked to object-type Array @davidhassell (#9627)
  • Don't error if dask_expr is not installed @Hoxbro (#11048)
  • Adjust test_set_index for "cudf" backend @rjzamora (#11029)
  • Use to/from_legacy_dataframe instead of to/from_dask_dataframe @rjzamora (#11025)
  • Tokenize bag groupby keys @cisaacstern (#10734)
  • Add lazy "cudf" registration for p2p-related dispatch functions @rjzamora (#11040)

See the Changelog for more information.

- Python
Published by github-actions[bot] about 2 years ago
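The release headings in this listing follow a year.month.patch pattern, so comparing them as plain strings gives the wrong order once the month reaches two digits (for example 2024.9.1 against 2024.11.2). The sketch below sorts headings numerically. It assumes every heading has the exact form `dask - YYYY.M.P`; `HEADING_RE` and `version_key` are hypothetical names introduced for illustration.

```python
import re

# Assumed heading shape: "dask - <year>.<month>.<patch>".
HEADING_RE = re.compile(r"^dask - (\d+)\.(\d+)\.(\d+)$")


def version_key(heading: str) -> tuple:
    """Turn a release heading into a (year, month, patch) tuple of ints."""
    match = HEADING_RE.match(heading.strip())
    if match is None:
        raise ValueError(f"not a release heading: {heading!r}")
    return tuple(int(part) for part in match.groups())


headings = ["dask - 2024.9.1", "dask - 2024.11.2", "dask - 2024.10.0"]

# Numeric ordering: 2024.11.2, then 2024.10.0, then 2024.9.1.
newest_first = sorted(headings, key=version_key, reverse=True)

# Plain string ordering puts 2024.9.1 first, because "9" sorts after "1".
naive_newest_first = sorted(headings, reverse=True)
```

For general version handling, the `packaging.version.Version` class mentioned in the 2024.6.0 list is the more robust choice; the tuple key here only covers the three-part pattern seen in these headings.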