Recent Releases of https://github.com/mars-project/mars

https://github.com/mars-project/mars - v0.10.0

What's Changed

  • Optimize tile of DataFrame.setitem by reducing time of generating chunk meta by @qinxuye in https://github.com/mars-project/mars/pull/3140
  • Increase the default value of alru cache max size by @zhongchun in https://github.com/mars-project/mars/pull/3146
  • Support scipy special function with tuple output by @RandomY-2 in https://github.com/mars-project/mars/pull/3139
  • Fix DAG.to_dot when reducers have multiple outputs by @chaokunyang in https://github.com/mars-project/mars/pull/3150
  • Fix deserializing RandomStateField when its value is None by @chaokunyang in https://github.com/mars-project/mars/pull/3149
  • Patch pandas magic functions to allow reverse operands by @wjsi in https://github.com/mars-project/mars/pull/3155
  • Run flaky test test_load_third_party_modules separately by @chaokunyang in https://github.com/mars-project/mars/pull/3162
  • Manually install cri-dockerd before installing kubernetes by @wjsi in https://github.com/mars-project/mars/pull/3166
  • [Shuffle] Add n_mappers and n_reducers to ShuffleProxy by @chaokunyang in https://github.com/mars-project/mars/pull/3160
  • [Ray] task based shuffle for ray by @chaokunyang in https://github.com/mars-project/mars/pull/3040
  • Add support for {DataFrame,Series}.align by @wjsi in https://github.com/mars-project/mars/pull/3147
  • Integrate remaining error functions and fresnel integrals except fresnel_zeros by @RandomY-2 in https://github.com/mars-project/mars/pull/3172
  • Improve numexpr fusion by @fyrestone in https://github.com/mars-project/mars/pull/3177
  • Ensure key is a valid Python identifier by @fyrestone in https://github.com/mars-project/mars/pull/3190
  • Bump terser from 5.7.1 to 5.14.2 in web component by @dependabot in https://github.com/mars-project/mars/pull/3194
  • Implement airy functions (except the ai_zeros and bi_zeros functions) by @shantam-8 in https://github.com/mars-project/mars/pull/3195
  • Disable version updates for dependabot by @wjsi in https://github.com/mars-project/mars/pull/3203
  • [Ray] Fix ray memory leak by @fyrestone in https://github.com/mars-project/mars/pull/3184
  • [Ray] Support reducers whose inputs are not mappers by @chaokunyang in https://github.com/mars-project/mars/pull/3206
  • Refine UT and logs by @fyrestone in https://github.com/mars-project/mars/pull/3204
  • Release actor lock when calling set_subtask_result by @chaokunyang in https://github.com/mars-project/mars/pull/3210
  • Refine apply key generation by @chaokunyang in https://github.com/mars-project/mars/pull/3208
  • Fix removal of mapper data by @chaokunyang in https://github.com/mars-project/mars/pull/3214
  • [Ray] Configurable subtask num_cpus by @fyrestone in https://github.com/mars-project/mars/pull/3207
  • Fix versioneer compatibility with PEP 600 by @chaokunyang in https://github.com/mars-project/mars/pull/3223
  • Support get mappers data without index/mapperids by @chaokunyang in https://github.com/mars-project/mars/pull/3222
  • [Ray] RayExecutionContext.get_chunk_meta from meta service by @fyrestone in https://github.com/mars-project/mars/pull/3212
  • [Ray] Share RayTaskState across tasks by @fyrestone in https://github.com/mars-project/mars/pull/3219
  • [Shuffle] Support shuffle operands mapper whose outputs aren't mapper blocks by @chaokunyang in https://github.com/mars-project/mars/pull/3228
  • Apply Operand Closure clean up by @vcfgv in https://github.com/mars-project/mars/pull/3205
  • Fix DataFrame sort_values bug with multiple ascending values in pandas < 1.4 by @fyrestone in https://github.com/mars-project/mars/pull/3234
  • Lifecycle gc task service by @fyrestone in https://github.com/mars-project/mars/pull/3230
  • Fix DataFrame loc with slice returning incorrect results by @fyrestone in https://github.com/mars-project/mars/pull/3241
  • Fix dataframe setitem bugs when partial indexes exist in target dataframe by @fyrestone in https://github.com/mars-project/mars/pull/3240
  • [Shuffle] Isolate mappers in different subtasks for fetch_by_index mode by @chaokunyang in https://github.com/mars-project/mars/pull/3239
  • Support multiple serializers per type in TypeDispatcher by @fyrestone in https://github.com/mars-project/mars/pull/3242
  • [Shuffle] Skip storing shuffle object refs to reduce meta overhead by @chaokunyang in https://github.com/mars-project/mars/pull/3209
  • [ray] Support scheduling ray tasks in Ray oscar deploy backend by @chaokunyang in https://github.com/mars-project/mars/pull/3165
  • Dump subtask graph for all backends by @fyrestone in https://github.com/mars-project/mars/pull/3245
  • [Metrics] Fix metrics and docs by @zhongchun in https://github.com/mars-project/mars/pull/3233
  • Remove storage service from supervisor by @vcfgv in https://github.com/mars-project/mars/pull/3254
  • Fix optimization rule memory leak by @fyrestone in https://github.com/mars-project/mars/pull/3246
  • fsspec integration by @hekaisheng in https://github.com/mars-project/mars/pull/3253
  • [Ray] Enable CI of mars/dataframe for Ray DAG by @fyrestone in https://github.com/mars-project/mars/pull/3250
  • Fix minikube installation by @hekaisheng in https://github.com/mars-project/mars/pull/3244
  • Implements scipy.stats.rankdata by @shantam-8 in https://github.com/mars-project/mars/pull/3218
  • Add S3 support by @fyrestone in https://github.com/mars-project/mars/pull/3258
  • Fix tensor frexp by @fyrestone in https://github.com/mars-project/mars/pull/3259
  • Optimize the display of the task progress bar by @zhongchun in https://github.com/mars-project/mars/pull/3264
  • [Ray] Optimize ray executor submit subtask by @fyrestone in https://github.com/mars-project/mars/pull/3271
  • [Ray] Enable CI of mars/learn for Ray DAG by @fyrestone in https://github.com/mars-project/mars/pull/3261
  • [Ray] Enable CI of mars/tensor for Ray DAG by @fyrestone in https://github.com/mars-project/mars/pull/3275
  • Compatible with pandas 1.5.0 by @hekaisheng in https://github.com/mars-project/mars/pull/3276
  • Remove skip_ray_dag mark for ray_dataset tests by @vcfgv in https://github.com/mars-project/mars/pull/3255
  • MapChunk Operand Closure and Callable cleanup by @vcfgv in https://github.com/mars-project/mars/pull/3238
  • [Ray] Spread scheduling subtasks with empty dependencies by @fyrestone in https://github.com/mars-project/mars/pull/3281
  • Speed up Mars deserialization via __new__ by @chaokunyang in https://github.com/mars-project/mars/pull/3283
  • A Cython-based ordered_set to speed up the discard operation by @chaokunyang in https://github.com/mars-project/mars/pull/3277
  • Optimize concat by @fyrestone in https://github.com/mars-project/mars/pull/3286
  • Fix md.concat error when there is identical fetch chunk data by @zhongchun in https://github.com/mars-project/mars/pull/3285
  • [Ray] Improve Ray executor GC by @fyrestone in https://github.com/mars-project/mars/pull/3287
  • Fix some CI issues by @hekaisheng in https://github.com/mars-project/mars/pull/3296
  • [Ray] Implement Ray executor subtask GC by @fyrestone in https://github.com/mars-project/mars/pull/3294
  • [Ray] Add metrics for Ray executor by @fyrestone in https://github.com/mars-project/mars/pull/3295
  • Bump up required vineyard version to address the CI failure. by @sighingnow in https://github.com/mars-project/mars/pull/3298
  • [Operand] support loc setitem by @chaokunyang in https://github.com/mars-project/mars/pull/3291
  • [Ray] Support worker_mem for ray executor by @fyrestone in https://github.com/mars-project/mars/pull/3300
  • Fix duplicate execution by @fyrestone in https://github.com/mars-project/mars/pull/3301
  • Fix CI by @hekaisheng in https://github.com/mars-project/mars/pull/3306
  • [Ray] Basic slow subtask detection by @fyrestone in https://github.com/mars-project/mars/pull/3305
  • Fix stats tests and pin sphinx version by @hekaisheng in https://github.com/mars-project/mars/pull/3313
  • Fix s3 client kwargs by @fyrestone in https://github.com/mars-project/mars/pull/3316
  • Update Mars on Ray doc by @fyrestone in https://github.com/mars-project/mars/pull/3311
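
The ordered-set entry above (pull/3277) is about making discard cheap. As a rough illustration of the idea only, and not the Cython implementation from that PR, an insertion-ordered set with average O(1) discard can be sketched in pure Python on top of a dict. The class name `OrderedSet` and its methods are hypothetical names chosen for this sketch.

```python
class OrderedSet:
    """Insertion-ordered set backed by a dict.

    Hypothetical sketch for illustration; not Mars's Cython ordered_set.
    """

    def __init__(self, iterable=()):
        # dict keeps insertion order (Python 3.7+) and gives O(1) membership tests
        self._items = dict.fromkeys(iterable)

    def add(self, item):
        self._items[item] = None

    def discard(self, item):
        # Average O(1); a list-backed ordered set would need an O(n) scan here
        self._items.pop(item, None)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


s = OrderedSet(["a", "b", "c"])
s.discard("b")
s.add("d")
print(list(s))  # ['a', 'c', 'd']
```

Discarding a missing element is a no-op, matching the behaviour of the built-in `set.discard`.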

Full Changelog: https://github.com/mars-project/mars/compare/v0.10.0a1...v0.10.0

- Python
Published by fyrestone over 3 years ago

https://github.com/mars-project/mars - v0.9.0

These are the release notes for v0.9.0. See here for the complete list of solved issues and merged PRs.

These notes cover only the differences from v0.9.0rc3; for all highlights and changes, please refer to the release notes of the pre-releases:

alpha1, alpha2, beta1, beta2, rc1, rc2, rc3

Changes that break compatibility

From v0.9 on, support for Python 3.6 is dropped.

Highlights

  • Performance has been optimized throughout this version; feedback is welcome.

New Features

  • Oscar
    • Stop importing main module when starting Mars local cluster (#3113)
  • Tensor
    • Integrate special error functions (#3062)
    • Integrate part of scipy elliptic functions and integrals (#3112)
  • DataFrame
    • Support sort=True for Groupby (#3063, thanks @sak2002!)

Enhancements

  • Dump remote tracebacks to make local ones more friendly (#3030)
  • Optimize import speed for Mars package (#3035)
  • [Ray] Implement ray task executor progress (#3065)
  • Shuffle both sides at the same time for md.merge (#3066)
  • Refine ThreadedServiceContext.get_chunks_meta usage (#3067)
  • Do not aggressively choose tree method in tile of groupby for distributed setting (#3070)
  • Disable bloom filter in merge for now (#3071)
  • [Ray] Implements get_chunks_result for Ray execution context (#3072)
  • Use tell when removing mapper data after execution (#3073)
  • Assign reducer ops in task assigner to make them more balanced across cluster (#3075)
  • [Ray] Destroy Ray executor when the task finishes (#3074)
  • Combine tree and shuffle methods in DataFrameGroupBy.agg tile (#3077)
  • [Ray] Implements get_chunks_meta for Ray execution context (#3076)
  • Use OS-designated ports instead of random ports to create sub pools (#3087)
  • Call immutable web API only once when previous call blocks (#3088)
  • Unify DataFrameGroupByAgg's tile logic for auto method (#3094)
  • [Ray] Support basic subtask retry and lineage reconstruction (#3097)
  • Simplify argument passing in actor batch calls (#3100)
  • [Ray] Implements get_total_n_cpu for Ray execution context (#3104)
  • Optimize performance of transfer (#3105)
  • Add n_reducers and reducer_ordinal to shuffle operands (#3107)
  • [Ray] Implement cancel method on Ray task executor (#3093)
  • [Ray] Create RayTaskState actor as needed by default (#3114)
  • [Ray] Implement gc for ray task executor context (#3116)
  • Optimize serializable memory (#3126)
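
One enhancement above replaces randomly chosen ports with OS-designated ones (#3087). The general technique, independent of how Mars applies it to sub pools, is to bind to port 0 so the operating system picks a currently free port, which avoids collisions from guessing. A minimal sketch using only the standard library follows; the helper name `bind_os_assigned_port` is hypothetical.

```python
import socket


def bind_os_assigned_port(host="127.0.0.1"):
    """Bind a TCP socket to a port chosen by the operating system.

    Hypothetical helper for illustration; not part of the Mars API.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the OS to assign any free port
    sock.bind((host, 0))
    # getsockname() reports the port that was actually assigned
    return sock, sock.getsockname()[1]


sock, port = bind_os_assigned_port()
print(f"OS assigned port {port}")
sock.close()
```

Keeping the socket open until the server is ready to use it prevents another process from claiming the port in the meantime.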

Bug fixes

  • Patch pandas to make pickle compatible between 1.2 and 1.3 (#3050)
  • Fix errors when deleting mapper data (#3064)
  • Fix chunk index error in auto_merge_chunks (#3068)
  • Fix recursive_tile, which could tile one tileable more than once (#3069)
  • [Ray] Fix ray worker failover (#3115)
  • [Ray] Fix pandas schema parsing when reading Ray dataset (#3117)
  • [Ray] fix auto scale-in hang (#3125)
  • [Metric] Fix prometheus metric backend (#3127)
  • Fix mt.{cumsum, cumprod} when the first chunk is empty (#3136)
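
The last fix above concerns cumulative operations over chunked data when the first chunk is empty. As a simplified sketch of why empty chunks need care, and not Mars's actual tiling logic, a chunk-wise cumulative sum carries the running total of all earlier chunks forward, and an empty chunk must leave that total unchanged. The function name `chunked_cumsum` is hypothetical.

```python
from itertools import accumulate


def chunked_cumsum(chunks):
    """Cumulative sum over a list of chunks, preserving chunk boundaries.

    Hypothetical sketch for illustration; not Mars's implementation.
    """
    results = []
    carry = 0  # running total of all previous chunks
    for chunk in chunks:
        local = list(accumulate(chunk))  # cumulative sum within this chunk
        results.append([carry + v for v in local])
        if local:  # an empty chunk has no last element and adds nothing
            carry += local[-1]
    return results


print(chunked_cumsum([[], [1, 2], [3]]))  # [[], [1, 3], [6]]
```

Without the emptiness check, reading the last element of an empty chunk would raise an IndexError.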

Tests

  • Check initialization of serializables on CI (#3013)
  • [Ray] Optimize Ray CI execution time and stability (#3121)
  • Update pytest imports for test_special.py (#3131)
  • [Ray] Fix flaky test test_optional_supervisor_node (#3135)

Others

  • Build web code before CIBW when deploying to PyPI (#3016)

- Python
Published by qinxuye about 4 years ago

https://github.com/mars-project/mars - v0.10.0a1

These are the release notes for v0.10.0a1. See here for the complete list of solved issues and merged PRs.

New Features

  • Oscar
    • Stop importing main module when starting Mars local cluster (#3110)
  • Tensor
    • Integrate special error functions (#3060)
    • Integrate part of scipy elliptic functions and integrals (#3111)
  • DataFrame
    • Support sort=True for Groupby (#2959, thanks @sak2002!)

Enhancements

  • Disable bloom filter in merge for now (#2967)
  • [Ray] Implement ray task executor progress (#3008)
  • Dump remote tracebacks to make local ones more friendly (#3028)
  • Use tell when removing mapper data after execution (#3027)
  • Optimize import speed for Mars package (#3022)
  • Do not aggressively choose tree method in tile of groupby for distributed setting (#3032)
  • [Ray] Implements get_chunks_result for Ray execution context (#3023)
  • Refine ThreadedServiceContext.get_chunks_meta usage (#3037)
  • Shuffle both sides at the same time for md.merge (#3041)
  • Assign reducer ops in task assigner to make them more balanced across cluster (#3048)
  • [Ray] Destroy Ray executor when the task finishes (#3049)
  • [Ray] Implements get_chunks_meta for Ray execution context (#3052)
  • [Ray] Support basic subtask retry and lineage reconstruction (#2969)
  • Combine tree and shuffle methods in DataFrameGroupBy.agg tile (#3051)
  • [Ray] Implements get_total_n_cpu for Ray execution context (#3059)
  • [Ray] Implement cancel method on Ray task executor (#3044)
  • Use OS-designated ports instead of random ports to create sub pools (#3053)
  • Unify DataFrameGroupByAgg's tile logic for auto method (#3084)
  • Simplify router clean up when pools or clusters end (#3086)
  • Call immutable web API only once when previous call blocks (#3085)
  • [Ray] Create RayTaskState actor as needed by default (#3081)
  • [Ray] Implement gc for ray task executor context (#3061)
  • Simplify argument passing in actor batch calls (#3098)
  • Optimize performance of transfer (#3091)
  • Add n_reducers and reducer_ordinal to shuffle operands (#3055)
  • Optimize serializable memory (#3120)

Bug fixes

  • Fix errors when deleting mapper data (#3018)
  • Fix recursive_tile, which could tile one tileable more than once (#3021)
  • Fix error message when sparse data format not supported (#3046)
  • Patch pandas to make pickle compatible between 1.2 and 1.3 (#3047)
  • Fix chunk index error in auto_merge_chunks (#3057)
  • [Ray] Fix ray worker failover (#3080)
  • [Metric] Fix prometheus metric backend (#3124)
  • Fix mt.{cumsum, cumprod} when the first chunk is empty (#3134)

Tests

  • Check initialization of serializables on CI (#3007)
  • Use @pytest_asyncio.fixture instead of @pytest.fixture for async fixtures (#3025)
  • Change code owners to Mars PMC maintainers (#3031)
  • [Ray] Fix ray executor progress test (#3033)
  • [Ray] Optimize Ray CI execution time and stability (#3102)
  • Make test_session_set_progress more stable under Ray tests (#3103)
  • Update pytest imports for test_special.py (#3129)
  • [Ray] Fix flaky test test_optional_supervisor_node (#3133)

Others

  • Build web code before CIBW when deploying to PyPI (#3014)
  • Make PyPI user name configurable (#3130)

- Python
Published by qinxuye about 4 years ago

https://github.com/mars-project/mars - v0.8.7

These are the release notes for v0.8.7.

Bug fixes

  • Fixes missing web packages in Linux wheels (#3014)

- Python
Published by wjsi about 4 years ago

https://github.com/mars-project/mars - v0.8.6

These are the release notes for v0.8.6. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implementing Ellipsoidal Harmonics Functions (#2927, thanks @shantam-8!)

Enhancements

  • Add support for dask.persist (#2990, thanks @loopyme!)
  • Optimize gen subtask graph (#3006)
  • Ignore broadcaster's locality when assigning subtasks (#2994)

Bug fixes

  • Fix task hang when error object cannot be pickled (#2913)
  • Fix potential KeyError in actor_ref calls when running with multiple processes (#2962)
  • Wrap errors in operand execution to protect scheduling service (#2971)
  • Fix dtype of series result for DataFrame.apply (#2979)
  • Fix default config to ensure storage backends configured (#2989)
  • Fix potential empty chunks when creating DataFrame from pandas (#2991)
  • Fix incorrect result for df.sort_values when specifying multiple ascending (#3006)
  • Fix missing extra_params when constructing operands (#3006)

Tests

  • Fix version mismatch between kubernetes and minikube (#2988)

- Python
Published by qinxuye about 4 years ago

https://github.com/mars-project/mars - v0.9.0rc3

These are the release notes for v0.9.0rc3. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implementing Ellipsoidal Harmonics Functions (#2891, thanks @shantam-8!)
  • Services
    • Support worker meta service (#2909)
    • Basic Ray execution backend (#2921)

Enhancements

  • Add execution API to enable customization of Mars Task Service (#2894)
  • Optimize serialization performance (#2914)
  • Skip adding band in meta when fetching shuffle data (#2922)
  • Store complete meta on worker and update supervisor meta via fetching from workers (#2912)
  • Use cython to accelerate core serialization (#2924)
  • Refine lifecycle api to support incref or decref with ref counts (#2926)
  • Ignore fetch operands when assigning initial nodes (#2929)
  • Use cython to accelerate message serialization (#2932)
  • Ignore broadcaster's locality when assigning subtasks (#2943)
  • Allow spawning serialization to threads for large objects (#2944)
  • Add metrics and event report for Ray channels (#2936)
  • Add more logs about execution info (#2940)
  • Add support for dask.persist (#2953, thanks @loopyme!)
  • Remove should_be_monotonic property (#2949)
  • Add metrics on operand and subtask executions (#2947, thanks @zhongchun!)
  • [Ray] Optimize Ray fetcher by querying in the remote node (#2957)
  • Improve deploy backend (#2958)
  • Support reporting tile progress (#2954)
  • Add logic key for tileable graph (#2961, thanks @zhongchun!)
  • [Ray] Loads the subtask inputs from meta (#2976)
  • New ExecutionConfig API (#2968)
  • Fix speculative execution compatibility with coloring (#2995)
  • Run potentially long-running functions in a thread for lifecycle tracker (#2992)
  • Optimize metric configs (#2996, thanks @zhongchun!)
  • Expand the ability of resource evaluator (#2997, thanks @zhongchun!)
  • Optimize gen subtask graph (#3004)
  • [Ray] Ray execution state (#3002)

Bug fixes

  • Fix parameter issue of worker actor pool (#2911, thanks @zhongchun!)
  • Fix default config to ensure storage backends configured (#2935)
  • Wrap errors in operand execution to protect scheduling service (#2964)
  • Fix dtype of series result for DataFrame.apply (#2978)
  • Fix potential data leak for shuffle tasks (#2975)
  • Fix potential empty chunks when creating DataFrame from pandas (#2987)
  • [Ray] Support new ray cluster through ray client (#2981)
  • Fix missing extra_params when constructing operands (#2999)
  • Fix msg_to_simple_str in Ray backend and add tests (#3003)
  • Fix incorrect result for df.sort_values when specifying multiple ascending (#2984)

Documentation

  • Add development documents for metrics (#2955, thanks @zhongchun!)

Tests

  • Add TPC-H benchmarks (#2937)
  • Fix Ray cases (#2983)
  • Fix version mismatch between kubernetes and minikube (#2986)
  • Allow selecting TPC queries (#3005)

- Python
Published by qinxuye about 4 years ago

https://github.com/mars-project/mars - v0.8.5

These are the release notes for v0.8.5. See here for the complete list of solved issues and merged PRs.

New Features

  • Web
    • Add stack display page on Mars Web (#2881)

Enhancements

  • Avoid printing too many messages in Oscar (#2880)
  • [Ray] Use main pool as owner when autoscale disabled (#2903)

Bug fixes

  • Fix XGBoost when some workers do not have evals data (#2863)
  • Raise ActorNotExist when no supervisors available (#2869)
  • Fix dtype infer in DataFrame arithmetic on datetime consts (#2880)
  • Fix duplicate node iteration in GraphAssigner (#2880)
  • Fix timeout for wait_task (#2890)
  • Make sure errors can be raised in Actor.__pre_destroy__ (#2892)

Tests

  • Upgrade azure-pipelines to Python 3.9 (#2886)
  • Adapt to official cancel of Github Actions (#2903)

- Python
Published by qinxuye about 4 years ago

https://github.com/mars-project/mars - v0.9.0rc2

These are the release notes for v0.9.0rc2. See here for the complete list of solved issues and merged PRs.

New Features

  • Web
    • Add stack display page on Mars Web (#2876)

Enhancements

  • Avoid printing too many messages in Oscar (#2871)
  • Expand slot scheduler to resource scheduler (#2846, thanks @zhongchun!)
  • Optimized iterative tiling by pruning unrelated chunks (#2874)
  • Optimize DataFrameIsin's tile (#2864)
  • Add benchmark for serialization (#2901)
  • [Ray] Ray client channel get recv when first complied (#2740, thanks @Catch-Bull!)
  • Use bloom filter to optimize df.merge execution (#2895)
  • Stop recording all mapper meta (#2900)
  • [Ray] Use main pool as owner when autoscale disabled (#2878)
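
The bloom-filter entry above (#2895) refers to a common way of cutting work before a join: build a compact probabilistic set from one side's keys, then drop rows on the other side whose keys definitely cannot match. The following is a minimal sketch of that general idea, not the implementation in Mars; the `BloomFilter` class, its sizes, and the sample data are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter for illustration (hypothetical; not Mars's implementation)."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive n_hashes positions by salting a SHA-256 hash with a seed
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{key!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present"
        return all(self.bits[pos] for pos in self._positions(key))


left_keys = [1, 2, 3]
right_rows = [(2, "x"), (7, "y"), (3, "z")]

bf = BloomFilter()
for key in left_keys:
    bf.add(key)

# Rows whose key cannot match are dropped before the real join runs
candidates = [row for row in right_rows if bf.might_contain(row[0])]
print(candidates)
```

A bloom filter never produces false negatives, so every matching row survives the pre-filter. It can produce false positives, so the actual join still has to compare keys exactly.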

Bug fixes

  • Fix XGBoost when some workers do not have evals data (#2861)
  • Fix duplicate node iteration in GraphAssigner (#2857)
  • Raise ActorNotExist when no supervisors available (#2859)
  • Fix dtype infer in DataFrame arithmetic on datetime consts (#2879)
  • Fix timeout for wait_task (#2883)
  • Make sure error can be raised in Actor.__pre_destroy__ (#2887)

Tests

  • Upgrade azure-pipelines to Python 3.9 (#2862)
  • Adapt to official cancel of Github Actions (#2902)

- Python
Published by qinxuye about 4 years ago

https://github.com/mars-project/mars - v0.9.0rc1

These are the release notes for v0.9.0rc1. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implements mars.tensor.setdiff1d (#2823)
  • Learn
    • Added support for mars.learn.metrics.roc_auc_score (#2832)
  • Services
    • A speculative execution based task scheduler (#2576)
  • Metric
    • [ray] Add metric for ray object store (#2776, thanks @Catch-Bull!)
  • Others
    • Use versioneer to manage release versions (#2806)

Enhancements

  • Support generating a DOT file for subtask graph (#2803)
  • Support generating dtypes, index_value etc lazily for DataFrame chunks (#2756)
  • [ray] Default enable fault tolerance for ray (#2801)
  • Improve subtask details in logs (#2836)
  • Accurate resource management for global slot manager (#2732)
  • Configure nthread of XGBoost jobs (#2844)
  • Improved performance of mars.learn.metrics.{roc_curve, roc_auc_score} (#2838)
  • Bump minimist and nanoid in Mars UI due to security alerts (#2849)
  • Fix storing of duplicate chunk and meta per subtask (#2845)

Bug fixes

  • Fix default value of gpu property for some operands (#2811)
  • Fix the failure on Vineyard CI by ensuring the input tensor chunk is a NumPy ndarray (#2817)
  • Fix race condition of set_subtask_result (#2784)
  • Fix duplicate subtask submit (#2815)
  • Change StorageHandlerActor to stateful (#2824)
  • Fix running xgboost on Ray cluster (#2826)
  • Fix FileSystem.ls for OSS (#2837)
  • Stop fetching data when pure dependencies specified (#2840)
  • Fix dirty version number caused by versioneer when building with cibuildwheel (#2855)

Tests

  • [Ray] Refine ray tests (#2793)
  • Build Docker images on a cron schedule (#2804)
  • Introduce asv benchmark (#2798)

- Python
Published by wjsi about 4 years ago

https://github.com/mars-project/mars - v0.8.4

These are the release notes for v0.8.4. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implements mars.tensor.setdiff1d (#2829)
  • Learn
    • Added support for mars.learn.metrics.roc_auc_score (#2841)
  • Others
    • Use versioneer to manage release versions (#2807)
    • Use cibuildwheel to release wheels (#2854)

Enhancements

  • Support generating a DOT file for subtask graph (#2818)
  • Enhance subtask details in logs (#2842)
  • Configure cores of XGBoost jobs (#2847)
  • Improved performance of mars.learn.metrics.{roc_curve, roc_auc_score} (#2850)
  • Fix storing of duplicate chunk and meta per subtask (#2851)
  • Bump minimist and nanoid in Mars UI due to security alerts (#2851)

Bug fixes

  • Fix race condition of set_subtask_result (#2819)
  • Fix duplicate subtask submit (#2819)
  • Fix the failure on Vineyard CI by ensuring the input tensor chunk is a NumPy ndarray (#2819)
  • Fix default value of gpu property for some operands (#2820)
  • Fix running xgboost on Ray cluster (#2830)
  • Change StorageHandlerActor to stateful (#2830)
  • Fix FileSystem.ls for OSS (#2842)
  • Stop fetching data when pure dependencies specified (#2843)

Tests

  • [Ray] Refine ray tests (#2810)
  • Build Docker images on a cron schedule (#2807)

- Python
Published by wjsi about 4 years ago

https://github.com/mars-project/mars - v0.8.3

These are the release notes for v0.8.3. See here for the complete list of solved issues and merged PRs.

Enhancements

  • Stop inferring outputs when args provided (#2761)
  • Remove deprecation warnings when importing mars.tensor (#2790)
  • [Ray] New ray actor creation model (#2794)

Bug fixes

  • Fix long exception of asyncio.gather (#2753)
  • Fix wrong result of df.merge (#2777)
  • Fix DataFrame initializer when Mars object exists in list (#2778)
  • Fix duplicate dec object ref (#2789, thanks @Catch-Bull!)
  • [Ray] Support Ray client mode (#2796)

Tests

  • Increase test stability for command-line tests (#2786)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.9.0b2

These are the release notes for v0.9.0b2. See here for the complete list of solved issues and merged PRs.

New Features

  • Metric
    • Add metric framework (#2742, thanks @zhongchun!)
    • Add prometheus metric implementation (#2752, thanks @zhongchun!)
    • Add ray metrics implementation (#2749, thanks @zhongchun!)
    • Add common metrics (#2760, thanks @zhongchun!)

Enhancements

  • Simplify rechunk implementation (#2745)
  • Stop inferring outputs when args provided (#2759)
  • Add broadcast merge support for DataFrame (#2772)
  • Remove deprecation warnings when importing mars.tensor (#2788)
  • Optimize in-process actor calls (#2763)
  • [ray] New ray actor creation model (#2783)

Bug fixes

  • Fix duplicate dec object ref (#2741, thanks @Catch-Bull!)
  • Fix long exception of asyncio.gather (#2748)
  • Fix NameError: name 'pq' is not defined if pyarrow is not installed (#2751)
  • Fix profiling where band_subtasks and most_calls are empty if the slow duration is large (#2755)
  • Fix the wrong result of df.merge (#2774)
  • Fix DataFrame initializer when Mars object exists in list (#2770)
  • [ray] support ray client mode (#2773)

Tests

  • Increase test stability for command-line tests (#2779)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.8.2

These are the release notes for v0.8.2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Support inclusive argument for pd.date_range (#2721)

Enhancements

  • Optimize eval-setitem expressions as single eval expressions (#2699)
  • [Ray] Refine raydataset integration (#2712)
  • [Ray] refine ray dataset integration (#2726)
  • Add support for reading partitioned parquet for fastparquet (#2729)
  • Fix duplicate exceptions in log (#2736)

Bug fixes

  • Fix sort_values for empty DataFrame or Series (#2686)
  • Eliminate redundant eval node in optimization (#2688)
  • Avoid iterative tiling for df.loc[:, fields] (#2689)
  • Fix use_arrow_dtype parameter for read_parquet (#2702)
  • Fix error on dependent DataFrame setitems (#2703)
  • Fix estimate_pandas_size on pd.MultiIndex (#2710)
  • Import vineyard.data.pickle to make members available (#2716)
  • Fix shuffle when input tensors have different ndim (#2728)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.9.0b1

These are the release notes for v0.9.0b1. See here for the complete list of solved issues and merged PRs.

Highlights

  • A new coloring-based fusion algorithm is introduced in #2719. Performance is expected to improve significantly over previous releases, but unexpected behavior may occur; please reach out to us if you encounter any.

New Features

  • DataFrame
    • Support inclusive argument for pd.date_range (#2718)
  • Others
    • Add cibuildwheel with Linux AArch64 wheel build support (#2672, thanks @odidev!)

Enhancements

  • Refine failure recovery log and exception (#2633)
  • Optimize eval-setitem expressions as single eval expressions (#2695)
  • Auto merge small chunks when df.groupby().apply(func) is doing aggregation (#2708)
  • Optimize GroupBy's aggregation algorithm (#2696)
  • [Ray] refine ray dataset integration (#2705)
  • Improve profiling (#2629)
  • Add support for reading partitioned parquet for fastparquet (#2724)
  • Introduce coloring based fusion algorithm (#2719)
  • Fix duplicate exceptions in log (#2723)

Bug fixes

  • Fix sort_values for empty DataFrame or Series (#2681)
  • Eliminate redundant eval node in optimization (#2683)
  • Avoid iterative tiling for df.loc[:, fields] (#2685)
  • [hotfix][ray] fix ray dataset compatibility (#2693)
  • Fix use_arrow_dtype parameter for read_parquet (#2698)
  • Fix error on dependent DataFrame setitems (#2701)
  • Fix estimate_pandas_size for pd.MultiIndex (#2707)
  • Import vineyard.data.pickle to make members available. (#2714)
  • Fix shuffle when input tensors have different ndim (#2727)

Documentation

  • Add Slack invite link (#2704, thanks @yuyiming!)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.8.1

These are the release notes for v0.8.1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add support for GroupBy.{ffill, bfill,fillna} (#2657, thanks @Marascax!)
    • Add nunique support for DataFrameGroupBy (#2667)

Enhancements

  • Add support for HTTP request rewriter (#2665)
  • Add merging small files support for md.{read_parquet, read_csv} (#2669)
  • Optimize filtering DataFrame with its fields (#2668)

Bug fixes

  • Allow specifying multiple supervisor processes (#2625)
  • Fix backward compatibility for pandas 1.0 (#2630)
  • Fix NotImplementedError for mo.batch when single call not implemented (#2637)
  • Fix compatibility for pandas 1.4 (#2652)
  • Fix IndexError raised by aggregation of DataFrameGroupBy (#2653)
  • Fix df.loc[:] to make sure same index_value key generated (#2654)
  • Fix aggregation with comparison (#2655)
  • Fix the wrong index_value generated by df.loc:
  • Fix as_index when calling groupby-agg (#2678)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.9.0a2

These are the release notes for v0.9.0a2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add support for GroupBy.{ffill, bfill,fillna} (#2639, thanks @Marascax!)
    • Add nunique support for DataFrameGroupBy (#2662)
  • Others
    • Add wheel support for Python 3.10 and drop Python 3.6 (#2622)

Enhancements

  • Added merging small files support for md.{read_parquet, read_csv} (#2661)
  • Add support for HTTP request rewriter (#2664)
  • Optimize filtering DataFrame with its fields (#2571)
  • Add pyproject.toml to config build packages (#2674)

Bug fixes

  • Fix backward compatibility for pandas 1.1 and 1.2 (#2624)
  • Fix backward compatibility for pandas 1.0 (#2628)
  • Fix NotImplementedError for mo.batch when single call not implemented (#2635)
  • Fix IndexError raised by aggregation of DataFrameGroupBy (#2641)
  • Fix compatibility for pandas 1.4 (#2650)
  • Fix df.loc[:] to make sure same index_value key generated (#2643)
  • Fix aggregation with comparison (#2647)
  • Fix the wrong index_value generated by df.loc:
  • Fix optimizing DataFrame query with timestamp in conditions (#2671)
  • Fix as_index when calling agg on SeriesGroupBy (#2676)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.8.0

This is the release notes of v0.8.0. See here for the complete list of solved issues and merged PRs.

This release note only covers the difference from v0.8.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:

alpha1 alpha2 alpha3 beta1 beta2 rc1

New Features

  • Tensor
    • Implements mt.bincount (#2552)
  • DataFrame
    • Support Series.median (#2570, thanks @perfumescent!)
  • Learn
    • Add mars.learn.metrics.multilabel_confusion_matrix and derivative metrics (#2568)
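
For readers unfamiliar with `bincount`, the pure-Python sketch below shows the result such a function is expected to produce. It assumes `mt.bincount` follows the semantics of `numpy.bincount`; the helper name `bincount` and its `minlength` parameter are illustrative and say nothing about how Mars computes the result across chunks.

```python
from typing import Iterable, List


def bincount(values: Iterable[int], minlength: int = 0) -> List[int]:
    """Count occurrences of each non-negative integer in `values`.

    Reference semantics only (modelled on numpy.bincount); this is an
    illustrative helper, not Mars code.
    """
    values = list(values)
    if any(v < 0 for v in values):
        raise ValueError("bincount only accepts non-negative integers")
    # The output has one slot per integer from 0 up to the largest value,
    # padded to `minlength` slots if that is larger.
    size = max(max(values, default=-1) + 1, minlength)
    counts = [0] * size
    for v in values:
        counts[v] += 1
    return counts


print(bincount([0, 1, 1, 3]))         # [1, 2, 0, 1]
print(bincount([2, 2], minlength=5))  # [0, 0, 2, 0, 0]
```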

Enhancements

  • Implement web API of get_infos (#2564)
  • Reduce time cost of cpu_percent() calls (#2572)
  • Stop calling user funcs when dtypes is specified (#2596)
  • Supports adding Mars extensions via setup entrypoints (#2598)
  • [Ray] Refine mars on ray usability (#2606)
  • Reduce estimation time cost (#2607)
  • Skip details of shuffled chunks in meta (#2609)
  • Reduce the time cost of fetching tileable data (#2616)
  • Reduce RPC cost of oscar by removing unnecessary tasks (#2613)
  • Use batched request to apply for slots (#2615)

Bug fixes

  • Fix index series.apply when result index unchanged (#2563)
  • Fix DataFrame getitem when duplicate columns exist (#2582)
  • Upgrade required version of vineyard (#2593)
  • Fix progress always being 0 or 100% (#2595)
  • Fix None dtype for some unary tensor functions (#2604)
  • Make Proxima work with latest Mars (#2605, thanks @yuyiming!)
  • Fix tests for cudf 21.10 (#2608)
  • Fix duplicate decref of subtask input chunk (#2614, thanks @Catch-Bull!)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.9.0a1

This is the release notes of v0.9.0a1. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implements mt.bincount (#2548)
  • DataFrame
    • Support Series.median() (#2566, thanks @perfumescent!)
  • Learn
    • Add mars.learn.metrics.multilabel_confusion_matrix and derivative metrics (#2554)
  • Services
    • Add basic profiling support for supervisor (#2586)

Enhancements

  • Add app_queue in new_cluster (#2550, thanks @xxxxsk!)
  • Implement web API of get_infos (#2558)
  • Reduce time cost of cpu_percent() calls (#2567)
  • Reduce estimation time cost (#2577)
  • [Ray] Refine mars on ray usability (#2580)
  • [ray] Refine raydataset integration (#2579)
  • Optimize tileable graph construction (#2583)
  • Stop calling user funcs when dtypes is specified (#2587)
  • Supports adding Mars extensions via setup entrypoints (#2589)
  • Skip details of shuffled chunks in meta (#2600)
  • Reduce the time cost of fetching tileable data (#2594)
  • Use batched request to apply for slots (#2601)
  • Reduce RPC cost of oscar by removing unnecessary tasks (#2597)

Bug fixes

  • Fix index series.apply when result index unchanged (#2557)
  • Stop using asdict to handle dataclasses (#2561)
  • Fix tests under cudf 21.10 (#2608)
  • Fix DataFrame getitem when duplicate columns exist (#2581)
  • Upgrade required version of vineyard (#2588)
  • Fix progress always being 0 or 100% (#2591)
  • Make Proxima work with latest Mars (#2599, thanks @yuyiming!)
  • Fix None dtype for some unary tensor functions (#2603)
  • Fix duplicate decref of subtask input chunk (#2611, thanks @Catch-Bull!)

Documentation

  • Add a document about how to implement a Mars operand (#2562)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.7.5

This is the release notes of v0.7.5. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Add preliminary implementations for ufunc methods (#2513)
    • Add partial support for setitem with fancy indexing (#2544)
  • DataFrame
    • Implements md.get_dummies (#2534, thanks @hoarjour!)
  • Learn
    • Add make_regression support for learn module (#2517)
    • Implements mars.learn.preprocessor.LabelEncoder (#2545)
  • Services
    • Add web API for scheduling (#2535)
  • Web
    • Display tileable properties on web (#2539, thanks @RandomY-2!)
  • Others
    • Add experimental support for CUDA under WSL for Windows 11 (#2543)

Enhancements

  • Reduce indentation of frontend code (#2541)

Bug fixes

  • Fix output of df.groupby(as_index=False).size() (#2508)
  • Fix reduction result on empty series (#2522)
  • Fix df.loc when df is empty (#2526)
  • [Ray] Fix serializing lambdas in web (#2529)
  • Fix df.loc when providing empty list (#2532)

Documentation

  • Add doc for reading csv in oss (#2530, thanks @Catch-Bull!)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.8.0rc1

This is the release notes of v0.8.0rc1. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Add preliminary implementations for ufunc methods (#2510)
    • Add partial support for setitem with fancy indexing (#2453)
  • DataFrame
    • Support md.get_dummies() (#2323, thanks @hoarjour!)
  • Learn
    • Add make_regression support for learn module (#2515)
    • Implements fit and predict methods for bagging (#2516)
    • Implements mars.learn.ensemble.IsolationForest (#2531)
    • Implements mars.learn.preprocessor.LabelEncoder (#2542)
  • Services
    • Add web API for scheduling (#2533)
  • Web
    • Display tileable properties on web (#2525, thanks @RandomY-2!)
  • Others
    • Support mutable tensor on oscar (#2432, thanks @Coco58323!)
    • Add experimental support for CUDA under WSL for Windows 11 (#2538)

Enhancements

  • Use black to enforce code style (#2492)
  • Reduce indentation of frontend code (#2540)

Bug fixes

  • Fix output of df.groupby(as_index=False).size() (#2507)
  • [Ray] Fix web serialize lambda (#2512)
  • Fix reduction result on empty series (#2520)
  • Fix DataFrame.loc when df is empty (#2524)
  • Fix df.loc when providing empty list (#2528)

Documentation

  • Add doc for reading csv in oss (#2514, thanks @Catch-Bull!)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.7.4

This is the release notes of v0.7.4. See here for the complete list of solved issues and merged PRs.

New Features

  • Web
    • Split tileable information and subtask graph into two tabs (#2482, thanks @RandomY-2!)
    • Include tileable property in detail api (#2499, thanks @RandomY-2!)

Enhancements

  • Support specified vineyard socket and skip the launching vineyardd process (#2500)
  • Refine MarsDMatrix & support more parameters for XGB classifier and regressor (#2501)

Bug fixes

  • Compatible with scikit-learn 1.0 (#2487)
  • Fix bug that failed to execute query when there are multiple arguments (#2491, thanks @perfumescent!)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.8.0b2

This is the release notes of v0.8.0b2. See here for the complete list of solved issues and merged PRs.

New Features

  • Learn
    • Implements glm.LogisticRegression (#2466, thanks @Fernadoo!)
    • Implements bagging sampling (#2496)
  • Services
    • Basic reschedule subtask (#2467)
  • Web
    • Split tileable information and subtask graph into two tabs (#2480, thanks @RandomY-2!)
    • Include tileable property in detail api (#2493, thanks @RandomY-2!)

Enhancements

  • Support specified vineyard socket and skip the launching vineyardd process (#2481)
  • Refine MarsDMatrix & support more parameters for XGB classifier and regressor (#2498)

Bug fixes

  • Compatible with scikit-learn 1.0 (#2486)
  • Fix bug that failed to execute query when there are multiple arguments (#2490, thanks @perfumescent!)

Documentation

  • Fix wrong translation in cluster deployment. (#2489, thanks @perfumescent!)

Tests

  • Fix version of statsmodels to pass CI (#2497)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.7.3

This is the release notes of v0.7.3. See here for the complete list of solved issues and merged PRs.

New Features

  • Learn
    • Add _binary_roc_auc_score method (#2477, thanks @Divyanshu-Singh-Chauhan!)
  • Web
    • Support visualizing subtask graphs on Mars Web (#2471, thanks @RandomY-2!)
  • Others
    • Revisit {from,to}_vineyard for tensors and dataframes (#2436)
    • Add nightly builds for docker images (#2462)
    • Make cmdline support third party modules (#2472)

Bug fixes

  • Fix df/series.{apply, map_chunk} when some chunk output empty data (#2437)
  • Fix missing DAGs when initializing with empty API results (#2442, thanks @RandomY-2!)
  • Fix skew and kurt errors under MacOS (#2445)
  • Fix usage of kubernetes image (#2448)
  • Fix timeout error when waiting for a submitted task (#2461)
  • Fix misuse of name parameter in DataFrame align (#2473, thanks @qxzhou1010!)
  • Fix hang when starting a sub pool fails (#2476)

Tests

  • Fix coverage for Azure pipeline (#2475)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.8.0b1

This is the release notes of v0.8.0b1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Integrate Mars DataFrame with Ray Dataset (#2393, thanks @vcfgv!)
  • Learn
    • Add _binary_roc_auc_score method (#2403, thanks @Divyanshu-Singh-Chauhan!)
  • Web
    • Support visualizing subtask graphs on Mars Web (#2426, thanks @RandomY-2!)
  • Others
    • Revisit {from,to}_vineyard for tensors and dataframes. (#2419)
    • [Ray] Reconstruct worker (#2413)
    • Make cmdline support third party modules (#2454)
    • Add nightly builds for docker images (#2456)

Enhancements

  • Refine and unify subtask detail APIs (#2465, thanks @RandomY-2!)

Bug fixes

  • Fix df/series.{apply, map_chunk} when some chunk output empty data (#2434)
  • Fix missing DAGs when initializing with empty API results (#2439, thanks @RandomY-2!)
  • Fix skew and kurt errors under MacOS (#2443)
  • Add tests for public kubernetes image (#2446)
  • Fix timeout error when waiting for a submitted task (#2457)
  • Print the error message when error happens in TaskProcessor (#2458)
  • Fix misuse of name parameter in DataFrame align (#2469, thanks @qxzhou1010!)
  • Fix hang when starting a sub pool fails (#2468)

Installation

  • Build and upload docker images in continuous deployment (#244)

Tests

  • Fix coverage for Azure pipeline (#2474)

- Python
Published by qinxuye over 4 years ago

https://github.com/mars-project/mars - v0.7.2post1

This release is a hotfix of v0.7.2 in order to fix the public docker image.

Bug fixes

  • Fix usage of kubernetes image for v0.7.2 (#2447)

- Python
Published by wjsi almost 5 years ago

https://github.com/mars-project/mars - v0.7.2

This is the release notes of v0.7.2. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implements hypergeometric functions (#2408, thanks @Alfa-Shashank!)
    • Implements mars.tensor.append (#2422)
  • DataFrame
    • Implements Series.between (#2382, thanks @gowrijsuria!)
    • Implements DataFrame.transpose() (#2423, thanks @hoarjour!)
  • Learn
    • Add mars.learn.ensemble.{BlockwiseVotingClassifier, BlockwiseVotingRegressor} (#2391)
    • Add TensorFlow dataset (#2409, thanks @yuanchongtt!)
    • Implements linear_model.LinearRegression (#2411, thanks @Fernadoo!)
    • Implements mars.learn.preprocessing.{LabelBinarizer, label_binarize} (#2421)
    • Implements mars.learn.metrics.log_loss (#2424)
    • Implements mars.learn.wrappers.ParallelPostFit (#2427)
  • Web
    • API for subtask DAG structure (#2410, thanks @RandomY-2!)
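
As a reminder of what the `log_loss` metric listed above measures, here is a minimal pure-Python sketch of binary log loss (cross-entropy). It assumes the Mars metric follows the usual scikit-learn definition; the function name `binary_log_loss` and the `eps` clipping value are illustrative choices, not the Mars API.

```python
import math
from typing import Sequence


def binary_log_loss(
    y_true: Sequence[int], y_prob: Sequence[float], eps: float = 1e-15
) -> float:
    """Mean negative log-likelihood of binary labels under predicted probabilities.

    Illustrative helper only; `y_prob[i]` is the predicted probability
    that sample i belongs to class 1.
    """
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so that log(0) never occurs.
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


# Confident, correct predictions give a small loss (approximately 0.1446 here).
print(binary_log_loss([1, 0, 1], [0.9, 0.1, 0.8]))
```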

Bug fixes

  • Fix raising wrong error for an operand when post_execute implemented and error occurs in execute (#2396)

Tests

  • Improve case stability (#2387)
  • Change all tests to use relative import (#2412)

- Python
Published by qinxuye almost 5 years ago

https://github.com/mars-project/mars - v0.8.0a3

This is the release notes of v0.8.0a3. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implements hypergeometric functions (#2397, thanks @Alfa-Shashank!)
    • Implements mars.tensor.append (#2417)
  • DataFrame
    • Implements Series.between (#2368, thanks @gowrijsuria!)
    • Integrate Mars DataFrame with Ray MLDataset (#2294, thanks @vcfgv!)
    • Support DataFrame.transpose() (#2327, thanks @hoarjour!)
  • Learn
    • Add mars.learn.ensemble.{BlockwiseVotingClassifier, BlockwiseVotingRegressor} (#2390)
    • Implements linear_model.LinearRegression (#2260, thanks @Fernadoo!)
    • Add TensorFlow dataset (#2383, thanks @yuanchongtt!)
    • Implements mars.learn.preprocessing.{LabelBinarizer,label_binarize} (#2415)
    • Implements mars.learn.metrics.log_loss (#2418)
    • Implements mars.learn.wrappers.ParallelPostFit (#2425)
  • Services
    • Initially support auto scaling (#2210)
  • Web
    • API for subtask DAG structure (#2389, thanks @RandomY-2!)

Bug fixes

  • Fix raising wrong error for an operand when post_execute implemented and error occurs in execute (#2395)
  • [Ray] Fix occasionally failed unittest test_ownership_when_scale_in (#2401)
  • [Oscar] Fix possible ActorCaller.call hang (#2404, thanks @Catch-Bull!)

Documentation

  • Highlight dask-on-mars in doc (#2399)

Tests

  • Improve case stability (#2381)
  • Upgrade vineyard to v0.2.7 (#2193)
  • Add checks for file mode changes and absolute imports (#2398)
  • [Ray] Fix ray version (#2406)
  • Change all tests to use relative import (#2407)

- Python
Published by qinxuye almost 5 years ago

https://github.com/mars-project/mars - v0.7.1

This is the release notes of v0.7.1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Support md.to_numeric (#2334, thanks @hoarjour!)
    • Gives an error when input DataFrame has unknown dtypes (#2359)
    • Implements DataFrame.assign (#2369, thanks @hxri!)
    • Support reading csv file from oss (#2374, thanks @zebivy!)
  • Tensor
    • Implements mars.tensor.stats.ks_2samp (#2332)
    • Implements mt.stats.ks_1samp (#2341)
  • Learn
    • Support PyTorch Dataset for oscar (#2364, thanks @yuanchongtt!)
    • Add KFold support (#2365)
  • Services
    • Add API to retrieve progress and status of tileables (#2358)
  • Web
    • Add visualization page for tileable graphs (#2319, thanks @RandomY-2!)
    • Add storage infos in web (#2333)
    • Display tileable progress, status and dependency link type on task detail page (#2377, thanks @RandomY-2!)
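
The KFold entry above refers to k-fold cross-validation splitting. The sketch below shows how such a splitter partitions sample indices, assuming the unshuffled behaviour of scikit-learn's `KFold` (the first `n_samples % n_splits` folds receive one extra sample). The function `kfold_indices` is a hypothetical stand-in, not the Mars implementation.

```python
from typing import Iterator, List, Tuple


def kfold_indices(
    n_samples: int, n_splits: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of `n_splits` folds.

    Illustrative helper: contiguous, unshuffled folds.
    """
    if not 2 <= n_splits <= n_samples:
        raise ValueError("need 2 <= n_splits <= n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds are one sample larger than the rest.
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


for train, test in kfold_indices(5, 2):
    print(train, test)
# [3, 4] [0, 1, 2]
# [0, 1, 2] [3, 4]
```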

Enhancements

  • Support setting multiple columns in DataFrame (#2313)
  • Create service classes to manage service and session operations (#2331)
  • Remove bokeh from package requirements (#2344)
  • Optimize scheduling service on supervisors (#2347)
  • Improve wait_actor_pool_recovered (#2350, thanks @keyile!)

Bug fixes

  • Fix the error when multiple subtasks fetch the same data (#2340)
  • Fix KeyError when remote function returns None (#2375)
  • Fix DataFrame comparison when data type is period (#2376)

Documentation

  • Fix untranslated strings in doc (#2349)

- Python
Published by qinxuye almost 5 years ago

https://github.com/mars-project/mars - v0.8.0a2

This is the release notes of v0.8.0a2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Support initializing Mars objects from CUDA (#2308)
    • Support md.to_numeric (#2290, thanks @hoarjour!)
    • Gives an error when input DataFrame has unknown dtypes (#2355)
    • Add assign to DataFrame (#2362, thanks @hxri!)
    • Support reading csv file from oss (#2292, thanks @zebivy!)
  • Tensor
    • Implements mars.tensor.stats.ks_2samp (#2324)
    • Implements mars.tensor.stats.ks_1samp (#2335)
  • Learn
    • Support PyTorch Dataset for oscar (#2246, thanks @yuanchongtt!)
    • Add KFold support (#2363)
  • Services
    • Add API to retrieve progress and status of tileables (#2357)
  • Web
    • Add visualization page for tileable graphs (#2282, thanks @RandomY-2!)
    • Add storage infos in web (#2317)
    • Display tileable progress, status and dependency link type on task detail page (#2360, thanks @RandomY-2!)
  • Others
    • [Ray] Rerun subtask for ray backend (#2288, thanks @keyile!)
    • Add experimental Dask-on-Mars support (#2289, thanks @loopyme!)

Enhancements

  • Support setting multiple columns in DataFrame (#2303)
  • Refactor tileable visualization classes (#2318)
  • Create service classes to manage service and session operations (#2326)
  • Improve wait_actor_pool_recovered (#2328, thanks @keyile!)
  • Remove bokeh from package requirements (#2339)
  • Optimize mars supervisor scheduling (#2325)

Bug fixes

  • Fix hangs when worker main pool has failures. (#2286)
  • Fix the error when multiple subtasks fetch the same data (#2322)
  • [Ray] Fix ray ci (#2343, thanks @keyile!)
  • Fix error in Dask-on-Mars when compute multiple objects (#2348, thanks @loopyme!)
  • Fix KeyError when remote function returns None (#2371)
  • Fix DataFrame comparison when data type is period (#2373)

Documentation

  • Fix untranslated strings in doc (#2346)
  • Fix docs of DataFrame.assign (#2367)

- Python
Published by qinxuye almost 5 years ago

https://github.com/mars-project/mars - v0.7.0

This is the release notes of v0.7.0. See here for the complete list of solved issues and merged PRs.

This release note only covers the difference from v0.7.0rc2; for all highlights and changes, please refer to the release notes of the pre-releases:

alpha1 alpha2 alpha3 alpha4 alpha5 alpha6 alpha7 alpha8 beta1 beta2 rc1 rc2

Changes that break compatibility

v0.7.0 unifies the local and distributed execution layers. Local thread-based scheduling has been removed; the unified runtime instead uses multiprocess-based scheduling, which avoids the infamous GIL problem.

Thus, for local usage, please create a default local session via:

```python
import mars

mars.new_session()  # create a default local session
```

Otherwise, a session is initialized once in the background. Keep in mind, however, that initializing multiprocess scheduling takes more time than initializing the multithreaded one did.

We tried our best to preserve compatibility elsewhere. If you find any incompatibility, please open an issue to let us know.

Highlights

v0.7.0 implements a unified execution layer: all deployments, including bare metal, Kubernetes, Ray and Yarn, share the same fundamental components. Compared to the old one, this unified execution layer improves many aspects, including:

  • Better serialization based on the pickle5 protocol, which is 5-7x faster than the old version.
  • A completely rewritten execution layer with better performance, 20%-50% faster than the old version even on a laptop.
  • Multiprocess-based scheduling, which avoids the infamous GIL issue.
  • Mars on Ray works much better, because Ray actors are used to build the Ray backend of Oscar, the lightweight actor framework underlying the entire execution layer.
  • GPUs are better supported with the new architecture.
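
To illustrate why pickle protocol 5 can speed up serialization, the sketch below uses only the Python standard library (3.8+) to ship a large buffer out-of-band, so the pickle stream itself stays small and the payload is not copied into it. This is not Mars's serializer; the helper names `dumps_out_of_band` and `loads_out_of_band` and the message layout are assumptions made for the example.

```python
import pickle
from typing import Any, List, Tuple


def dumps_out_of_band(obj: Any) -> Tuple[bytes, List[pickle.PickleBuffer]]:
    """Pickle `obj` with protocol 5, collecting large buffers separately."""
    buffers: List[pickle.PickleBuffer] = []
    # list.append returns None, which tells pickle to keep the buffer
    # out of the main stream (out-of-band).
    header = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
    return header, buffers


def loads_out_of_band(header: bytes, buffers: List[pickle.PickleBuffer]) -> Any:
    """Rebuild the object from the small header plus its out-of-band buffers."""
    return pickle.loads(header, buffers=buffers)


payload = bytearray(b"x" * 1_000_000)  # stands in for a large chunk of data
message = {"name": "chunk-0", "data": pickle.PickleBuffer(payload)}

header, buffers = dumps_out_of_band(message)
restored = loads_out_of_band(header, buffers)

print(len(header) < 1024)                         # header excludes the payload
print(bytes(restored["data"]) == bytes(payload))  # payload round-trips intact
```

In a real transport the out-of-band buffers would be sent alongside the header (for example over a socket or shared memory) without an intermediate copy; here they are simply passed back in-process.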

New Features

  • Tensor
    • Add partial support of bessel functions (#2274, thanks @JuntaoMa!)
    • Implements mars.tensor.in1d (#2301)
  • Learn
    • Implements mars.learn.utils.multiclass.unique_label (#2300)
  • Services
    • Add get_storage_level_info API (#2242)
    • Add API to fetch tileable graph as JSON (#2271, thanks @RandomY-2!)
    • Enable running on GPU for oscar (#2306)
  • Others
    • Add support for seek method in memory cases (#2264)

Enhancements

  • Add support for stateless actors (#2220)
  • Add status filters for Cluster service (#2221)
  • Pass logging config file name into sub pools (#2225)
  • Support choosing aggregation algorithm at runtime (#2226)
  • Add method to session to get web endpoint (#2238)
  • Use Kubernetes Service to discover Mars Supervisors (#2240)
  • Ensure range index incremental for data source op like md.read_csv (#2244)
  • Record mapper meta for shuffle task (#2255)
  • Support data dependency for run_script (#2256)
  • Refine oscar debugging (#2261)
  • Support fetch_log for web session (#2262)
  • Allow turning off actor killing (#2277)
  • Use batch method to reduce transferring cost for shuffle tasks (#2279)
  • Assign bands given devices of subtasks (#2278)
  • Add bind method to facilitate extracting batch args (#2281)
  • Reduce memory estimation for specific operands (#2285)

Bug fixes

  • Fix NoDataToSpill when multiple storage quota requests happen simultaneously (#2223)
  • Stop using thread local to store default session (#2243)
  • Fix service errors in Windows (#2247)

Documentation

  • Doc refinement for Oscar (#2291)
  • Add docs for batch methods (#2298)

Installation

  • Merge default & distributed requirements (#2270)

Tests

  • Add separate check pipeline (#2302)

- Python
Published by qinxuye almost 5 years ago

https://github.com/mars-project/mars - v0.8.0a1

This is the release notes of v0.8.0a1. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Add partial support of bessel functions (#2258, thanks @JuntaoMa!)
    • Implements mars.tensor.in1d (#2297)
  • Learn
    • Implements mars.learn.utils.multiclass.unique_label (#2295)
  • Services
    • Add get_storage_level_info API (#2228)
    • Basic rerun subtask (#2198)
    • Add API to fetch tileable graph as JSON (#2253, thanks @RandomY-2!)
    • Enable running on GPU for oscar (#2284)
  • Others
    • Add support for seek method in memory cases (#2250)

Enhancements

  • Support choosing aggregation algorithm at runtime (#2213)
  • Add support for stateless actors (#2218)
  • Add status filters for Cluster service (#2214)
  • Reassign subtasks and filter nodes with status (#2159, thanks @vcfgv!)
  • Add methods to sessions to get web endpoint (#2236)
  • Ensure range index incremental for data source op like md.read_csv (#2232)
  • Use Kubernetes Service to discover Mars Supervisors (#2227)
  • Record mapper meta for shuffle task (#2248)
  • Support data dependency for run_script (#2251)
  • Refine oscar debugging (#2252)
  • Support fetch_log for web session (#2257)
  • Use batch method to reduce transferring cost for shuffle tasks (#2233)
  • Allow turning off actor killing (#2273)
  • Assign bands given devices of subtasks (#2276)
  • Add bind method to facilitate extracting batch args (#2280)
  • Reduce memory estimation for specific operands (#2283)

Bug fixes

  • Fix NoDataToSpill when multiple storage quota requests happen simultaneously (#2203)
  • Pass logging config file name into sub pools (#2222)
  • Stop using thread local to store default session. (#2217)
  • Fix possible CI failure when destroying remote object for incremental index (#2239)
  • Fix service errors in Windows (#2237)

Documentation

  • Doc refinement for Oscar (#2234)
  • Add docs for batch methods (#2293)

Installation

  • Merge default & distributed requirements (#2263)

Tests

  • Add separate check pipeline (#2299)
  • Fix delocate version to 0.8.2 to avoid deploy error (#2305)

- Python
Published by wjsi almost 5 years ago

https://github.com/mars-project/mars - v0.6.11

This is the release notes of v0.6.11. See here for the complete list of solved issues and merged PRs.

Bug fixes

  • Fix unexpected NaNs in groupby-agg (#2178)
  • Fix groupby on indexes with duplicate items (#2187)
  • Fix compatibility issue for pandas 1.3 (#2202)
  • Fix merge_index_value when index_values come from multi range indexes (#2208)

- Python
Published by qinxuye almost 5 years ago

https://github.com/mars-project/mars - v0.7.0rc2

This is the release notes of v0.7.0rc2. See here for the complete list of solved issues and merged PRs.

New Features

  • Services
    • Support setting task progress in context (#2192)
  • Web
    • Implement essential APIs for Web (#2181)
    • Use React framework to rewrite Mars UI (#2135)
  • Deploy
    • Cluster config support third party modules (#2171)

Enhancements

  • Use aiohttp to handle web requests (#2183)
  • Add stop methods for all services (#2194)
  • Move fetch method from StorageManagerActor to StorageHandlerActor (#2196)
  • Isolate client and cluster in a separated event loop and thread (#2168)

Bug fixes

  • Fix unexpected NaNs in groupby-agg (#2177)
  • Fix groupby on indexes with duplicate items (#2186)
  • Fix starting multiple workers in shared file system (#2189)
  • Fix import error in master branch (#2190)
  • [Ray] ray two way hang-detectable channel (#2170)
  • Fix compatibility issue for pandas 1.3 (#2197)
  • Fix deserializing task errors on web clients (#2199)
  • Make window function under old pandas versions work (#2204)
  • Fix concatenating row chunks with MultiIndex (#2205)
  • Fix merge_index_value when index_values come from multi range indexes (#2207)

- Python
Published by qinxuye almost 5 years ago

https://github.com/mars-project/mars - v0.7.0rc1

This is the release notes of v0.7.0rc1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implement {DataFrame, Series}.empty (#2155, thanks @Fernadoo!)
  • Services
    • Add spill support for oscar (#2160)

Enhancements

  • Migrate YARN support for Oscar (#2152)
  • Add version API on cluster (#2153)
  • Make sure slot usages updated when some task ends (#2161)
  • Add debug options for oscar (#2164)
  • Add fault injection subtask processor for tests (#2163)
  • Collects available ports before running tensorflow scripts (#2169)
  • Support stopping tasks for interactive execution (#2154)
  • Add cycle detection for oscar debug mode (#2167)
  • Update year of license (#2172)
  • Support configuring overriding fields (#2173)

Bug fixes

  • Fix race condition when creating actors with IdleLabel strategy in a parallel way (#2147)
  • Fix shuffle task on oscar (#2146)

Tests

  • Use a setup.cfg to gather configurations (#2150)

- Python
Published by qinxuye almost 5 years ago

https://github.com/mars-project/mars - v0.6.10

This is the release notes of v0.6.10. See here for the complete list of solved issues and merged PRs.

New Features

  • Implement {DataFrame, Series}.empty (#2174, thanks @Fernadoo!)

Bug fixes

  • Fix _get_ports_from_netstat hang (#2174)

- Python
Published by qinxuye almost 5 years ago

https://github.com/mars-project/mars - v0.7.0b2

This is the release notes of v0.7.0b2. See here for the complete list of solved issues and merged PRs.

Changes that break compatibility

  • From v0.7.0b2 on, the stale threading-based scheduler and the distributed scheduler based on Mars actor 1.0 have been removed, so clients with older versions are completely incompatible.

Highlights

  • Unified scheduling based on Oscar, which is Mars actor 2.0, is ready for testing.

New Features

  • DataFrame
    • Add add_prefix support (#2132, thanks @aeinrw!)
  • Services
    • Services web handler and api (#2102)
    • Implements lifecycle service (#2117)
    • Add initial implementation of scheduling service (#2111)
  • Deploy
    • Add command line support for Oscar deployment (#2131)
  • Ray
    • [Ray] ray oscar deploy (#2089)

Enhancements

  • Hold data ref in DataManager (#2090)
  • Enable iterative tiling and related support for task service (#2097)
  • Enable optimization for task service (#2098)
  • Implement last_idle_time API (#2099)
  • Integrate mars object check for session (#2103)
  • Make Mars pools compatible with Python 3.6 (#2110)
  • [ray] optimize ray deploy speed (#2118)
  • Implement RESTful web API (#2120)
  • [ray] Support supervisor exclusive node option (#2121)
  • Add asyncio task timeout debugger (#2127)
  • Add transfer support for storage service (#2100)
  • [ray] supervisor support sub pool (#2128)
  • Configure azure pipelines job timeout (#2139)
  • Allow overriding service config files (#2140)

Bug fixes

  • Fix distributed make_blobs and column pruning in read_sql (#2092)
  • Fix result error when yield after exceptions (#2096)
  • Wrap sync methods in session so that they run in threads (#2109)
  • Fix pool cases and shared memory cases in Windows (#2114)
  • Fix _get_ports_from_netstat hang (#2116)
  • Fix unpickle mars config error (#2130)
  • Filter pipeline jobs by branch (#2138)

Tests

  • Fix coverage result on SubActorPool (#2095)
  • [Ray] Support ray subprocess coverage (#2101)
  • Run operand tests in Azure Pipelines (#2137)
  • Migrate tensor/dataframe/learn tests to oscar (#2106)

- Python
Published by qinxuye about 5 years ago

https://github.com/mars-project/mars - v0.6.9

This is the release notes of v0.6.9. See here for the complete list of solved issues and merged PRs.

New Features

  • Add add_prefix support (#2133, thanks @aeinrw!)

Bug fixes

  • Fix distributed make_blobs and column pruning in read_sql (#2093)

Tests

  • Remove tokens for codecov on 0.6 (#2141)

- Python
Published by qinxuye about 5 years ago

https://github.com/mars-project/mars - v0.7.0b1

This is the release notes of v0.7.0b1. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implements mt.block (#2069, thanks @fyrestone!)
  • Project Galois
    • Initial task service support (#2084)

Enhancements

  • Apply new serialization to operand and Mars objects (#2075)
  • Make import of LightGBM and XGBoost lazy (#2083)
  • Use chunk index as default shuffle key (#2086)
  • Project Galois
    • Oscar actor pool on ray (#2063, thanks @chaokunyang!)

Bug fixes

  • Use a ref count to make delete work when multiple workers connect to the same vineyardd (#2077)
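
The general idea behind a ref-counted delete can be sketched as follows: an object shared by several holders is physically deleted only when the last holder releases it. This is a generic illustration under that assumption, not the code from the PR; the class `RefCountedObjects`, its methods and the `delete_fn` callback are hypothetical names.

```python
from collections import Counter
from typing import Callable


class RefCountedObjects:
    """Track how many holders reference each object id (illustrative only)."""

    def __init__(self, delete_fn: Callable[[str], None]):
        self._refs: Counter = Counter()
        self._delete_fn = delete_fn  # performs the real deletion

    def acquire(self, object_id: str) -> None:
        self._refs[object_id] += 1

    def release(self, object_id: str) -> bool:
        """Drop one reference; delete for real when none remain.

        Returns True if the underlying object was deleted.
        """
        if self._refs[object_id] <= 0:  # Counter yields 0 for unknown ids
            raise KeyError(object_id)
        self._refs[object_id] -= 1
        if self._refs[object_id] == 0:
            del self._refs[object_id]
            self._delete_fn(object_id)
            return True
        return False


deleted = []
store = RefCountedObjects(delete_fn=deleted.append)
store.acquire("obj-1")  # first worker
store.acquire("obj-1")  # second worker sharing the same store
print(store.release("obj-1"), deleted)  # False []
print(store.release("obj-1"), deleted)  # True ['obj-1']
```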

Tests

  • Allow cancelling flows automatically (#2082)

- Python
Published by qinxuye about 5 years ago

https://github.com/mars-project/mars - v0.6.8

This is the release notes of v0.6.8. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implements mt.block (#2080, thanks @fyrestone!)

Enhancements

  • Make import of LightGBM and XGBoost lazy (#2085)

Bug fixes

  • Fix mt.unique on empty arrays (#2061)
  • Use a ref count to make delete work when multiple workers connect to the same vineyard (#2078)
  • Backport of bug fixes discovered in Galois serialization refactor (#2081)

Tests

  • Fix batch get / delete object id and stop launching plasma when working on vineyard (#2071)

- Python
Published by qinxuye about 5 years ago

https://github.com/mars-project/mars - v0.7.0a8

This is the release notes of v0.7.0a8. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implement mt.insert and mt.delete (#2039)

Project Galois

  • Oscar
    • [Ray backend] setup_cluster use placement group (#2041)
    • Fix cancelling actor promises (#2032)
    • Fix __pre_destroy__ not called when actors are in main pool (#2045)
    • [Ray] Fix ray cluster_utils import (#2054)
    • Fix main pool to create servers only after sub pools have finished creation (#2064)
  • Services
    • Initialize meta service with mock support (#2034)
    • API definition for services (#2040)
    • Allow sync actor methods to be extensible (#2050)
    • Implements initial version of cluster service (#2049)
    • Enhance meta service (#2062)
    • Initial implementation for storage service (#2056)
    • Add bands to chunk meta (#2065)
    • Enrich usage experience for extensible function (#2067)
    • Add API interface to get data info (#2068)
  • Serialization
    • Support cuda buffer serializations in actor communication (#2031)
    • Allow serializer to serialize recursive objects (#2058)
    • Implements serializables based on new serialization mechanism (#2051)

Enhancements

  • Support reusing kubedl cluster by job name (#2035)
  • Quarantine asyncio tests when measuring Cython coverage (#2070)

Bug fixes

  • Fix wrong results of mt.insert (#2046)
  • Fix for mt.insert when insert values is a mars tensor (#2052)
  • Fix batch get / delete object id and stop launching plasma when working on vineyard (#2072)

- Python
Published by qinxuye about 5 years ago

https://github.com/mars-project/mars - v0.6.7

This is the release notes of v0.6.7. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implement mt.insert and mt.delete (#2042)

Enhancements

  • Support reusing kubedl cluster by job name (#2036)

Bug fixes

  • Re-enable the from/to vineyard test cases, and set meta for tensor/dataframe properly (#2030)
  • Fix wrong results of mt.insert (#2048)
  • Fix mt.insert when the inserted values are a Mars tensor (#2053)

- Python
Published by qinxuye about 5 years ago

https://github.com/mars-project/mars - v0.7.0a7

This is the release notes of v0.7.0a7. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements {DataFrame, Series}.pct_change (#2014)
  • Tensor
    • Implements tree arithmetic for tensor add and multiplication (#2024)
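
As a rough illustration of what tree arithmetic means for add and multiplication, the sketch below combines many operands pairwise, level by level, instead of chaining them left to right. It is a plain-Python sketch of the general technique, not the Mars implementation; `tree_reduce` is a hypothetical helper name.

```python
import operator


def tree_reduce(values, op):
    """Combine values pairwise, level by level, and report the tree depth."""
    if not values:
        raise ValueError("need at least one value")
    level = list(values)
    depth = 0
    while len(level) > 1:
        # Pair up neighbours; each pair could be evaluated independently.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element is carried up unchanged
        level = nxt
        depth += 1
    return level[0], depth


total, depth = tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], operator.add)
product, _ = tree_reduce([1, 2, 3, 4, 5], operator.mul)
```

For eight operands the tree has three levels, whereas a left-to-right chain needs seven dependent steps, which is why the tree form suits parallel execution.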

Project Galois

  • Oscar
    • Add support for batch interfaces for actors (#2013)
    • [oscar] Add cancel support, optimize error handling, add kill_actor API (#2027)
  • Service
    • Add initial service implementations (#2010)

Enhancements

  • Use mmap files to reduce memory usage in proxima builder (#1866)
  • Support setting column with different index for DataFrame (#2020)

Bug fixes

  • Fix errors when calling where() on reshape results (#2011)
  • Fix log error when yielding to another remote (#2022)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.6

This is the release notes of v0.6.6. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements {DataFrame, Series}.pct_change (#2015)
  • Tensor
    • Implements tree arithmetic for tensor add and multiplication (#2028)

Enhancements

  • Use mmap files to reduce memory usage in proxima builder (#2016)
  • Support setting column with different index for DataFrame (#2025)

Bug fixes

  • Fix IndexError in Series.sort_values when some chunk is empty (#2001)
  • Fix mars crashes on ray >= 1.2.0 (#2003, thanks @fyrestone!)
  • Add errors argument for groupby.sample to ignore errors when group size is less than n (#2007)
  • Fix errors when calling where() on reshape results (#2012)
  • Fix log error when yielding to another remote (#2026)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.7.0a6

This is the release notes of v0.7.0a6. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements Index.__getitem__ (#1971)
    • Implements {DataFrame,Series}.sample (#1983)
    • Implements DataFrameGroupBy.sample (#1994)
  • Tensor
    • Implements stats.chisquare (#1974)
    • Implement ttests and gamma functions (#1986)

Project Galois

  • Oscar
    • [oscar] Fix actor promise & add tests (#1958)
    • [oscar] Add communication layer for Mars backend (#1989)
    • [oscar] Implements Mars backend for oscar (#1996)
  • Storage
    • [storage][vineyard] Implement storage lib of vineyard backend (#1952, thanks @acezen!)
    • [storage][sharedmemory] Add storage backend of `multiprocessing.sharedmemory` (#1969)
    • [storage][cuda] Add cuda backend storage implementation (#1981)
    • [storage][ray] Implements Ray storage (#1992, thanks @fyrestone!)

Enhancements

  • Allow wrapping existing models with Mars class constructors (#1956)
  • Optimize performance of DataFrame.describe() (#1961)
  • Initialize filesystem and aio libs (#1980)

Bug fixes

  • Fix MarsDMatrix when input tensor has unknown chunk shape (#1966)
  • Fix tensor sorting with empty chunks (#1968)
  • Re-enable the from/to vineyard test cases, and set meta for tensor/dataframe properly. (#1967)
  • Fix ValueError when reducing tensors with empty chunks (#1978)
  • Fix job hang when error message can't be pickled (#1990)
  • Fix IndexError in Series.sort_values when some chunk is empty (#1999)
  • Fix mars crashes on ray >= 1.2.0 (#1998, thanks @fyrestone!)
  • Add errors argument for groupby.sample to ignore errors when group size is less than n (#2002)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.5

This is the release notes of v0.6.5. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements Index.__getitem__ (#1975)
    • Implements {DataFrame,Series}.sample (#1987)
    • Implements DataFrameGroupBy.sample (#1995)
  • Tensor
    • Implements stats.chisquare (#1976)
    • Implement ttests and gamma functions (#1988)

Enhancements

  • Allow wrapping existing models with Mars class constructors (#1957)
  • Optimize performance of DataFrame.describe() (#1962)
  • Initialize filesystem libs (#1982)

Bug fixes

  • Fix tensor sorting with empty chunks (#1973)
  • Fix MarsDMatrix when input tensor has unknown chunk shape (#1970)
  • Fix ValueError when reducing tensors with empty chunks (#1979)
  • Fix job hang when error message can't be pickled (#1993)

Tests

  • Add tests and releases for Python 3.9 (#1955)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.7.0a5

This is the release notes of v0.7.0a5. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements DataFrame.{eval,query} (#1898)
    • Implements {DataFrame, Series}.duplicated() (#1907)
    • Implements is_monotonic properties (#1939)
    • Implements {DataFrame,Series}.set_axis (#1950)

Project Galois

  • [oscar] Add actor driver & structure adjustment (#1925)
  • [oscar][ray backend] Actor creation (#1916, thanks @fyrestone!)
  • Add new serializer implementation (#1937)
  • Implement storage lib of Arrow plasma as well as disk (#1904)

Enhancements

  • Allow setting verify_ssl to False for kubernetes configuration (#1911)
  • Optimize generating mock DataFrames (#1913)
  • Move opcodes out of protobuf definition (#1944)

Bug fixes

  • To vineyard: avoid copy when chunks are already in vineyard (vineyard is the backend). (#1899)
  • Fix rechunk when input tileable has unknown shape (#1912)
  • Fix KeyError when comparing series (#1920)
  • Fix rechunk when chunks have different dtypes that cannot compare (#1922)
  • Collect available ports before running LightGBM task (#1927)
  • Fix KeyError when column pruning is applied (#1929)
  • Fix shuffling data in mars.learn module (#1931)
  • Fix memory estimation of StartTracker for XGBoost (#1934)
  • Fix accuracy_score for distributed execution (#1945)

Tests

  • Add tests and releases for Python 3.9 (#1954)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.4

This is the release notes of v0.6.4. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements DataFrame.{eval,query} (#1900)
    • Implements {DataFrame, Series}.duplicated() (#1909)
    • Implements is_monotonic properties (#1946)
    • Implements {DataFrame,Series}.set_axis (#1951)

Enhancements

  • Optimize generating mock DataFrames (#1915)
  • Move opcodes out of protobuf definition (#1947)

Bug fixes

  • Fix rechunk when input tileable has unknown shape (#1914)
  • Fix KeyError when comparing series (#1921)
  • Fix rechunk when chunks have different dtypes that cannot compare (#1926)
  • Collect available ports before running LightGBM task (#1927)
  • Fix KeyError when column pruning is applied (#1933)
  • Fix error when shuffling data in mars.learn module (#1936)
  • Fix memory estimation of StartTracker for XGBoost (#1936)
  • Fix accuracy_score for distributed execution (#1948)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.3

This is the release notes of v0.6.3. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add more functionalities for md.Index (#1864)
    • Implements {DataFrame,Series}.rename_axis (#1870)

Enhancements

  • Allow internal serialization to use JSON (#1882)
  • Optimize performance of {md.read_csv(), md.read_parquet()}.head() (#1883)
  • Optimize performance of df.sort_values().head() (#1888)
  • Support column pruning for groupby().agg() on data sources (#1889)
  • Improve named_{dataframe, series, tensor} so that they can get more meta (#1897)

Bug fixes

  • Fix wrongly raised error: Tileable object must be executed first before being fetched (#1875)
  • Support unknown shape for mt.reshape, mt.histogram and md.DataFrame (#1876)
  • Fix hang of threaded actor operations in gevent==20.12.0 (#1881)
  • Fix sorting string columns with None value & sorting with empty chunks (#1893)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.7.0a4

This is the release notes of v0.7.0a4. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add more functionalities for md.Index (#1860)
    • Implements {DataFrame,Series}.rename_axis (#1867)

Enhancements

  • Allow internal serialization to use JSON (#1880)
  • Optimize performance of {md.read_csv(), md.read_parquet()}.head() (#1878)
  • Optimize performance of df.sort_values().head() (#1884)
  • Support column pruning for groupby().agg() on data sources (#1886)
  • Improve named_{dataframe, series, tensor} so that they can get more meta (#1896)

Bug fixes

  • Support unknown shape for mt.reshape, mt.histogram and md.DataFrame (#1869)
  • Fix wrongly raised error: Tileable object must be executed first before being fetched (#1872)
  • Fix reshape when input tensor has unknown shape and 1 chunk (#1874)
  • Fix hang of threaded actor operations in gevent==20.12.0 (#1879)
  • Fix sorting string columns with None value & sorting with empty chunks (#1891)
  • Adapt vineyardhandler.py to latest vineyard. (#1887)

Documentation

  • LFAI & Data: Add required documents (#1865)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.2

This is the release notes of v0.6.2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements head() on groupby objects (#1851)
  • Learn
    • Implements mars.learn.preprocessing.{MinMaxScaler, minmax_scale} (#1858)

Enhancements

  • Improve Proxima recall_by_id computation method (#1807, thanks @rg070836rg!)
  • Revise to/from vineyard, of Tensor and DataFrame. (#1806)
  • Add memory estimation for read_parquet as well as read_csv (#1815)
  • Support using compound agg function in lambda (#1819)
  • Add incremental_index argument to reset_index which by default is False (#1842)
  • Support to_pandas in a batch way for DataFrame and Series (#1859)
  • Support specifying memory scale in kubernetes (#1861)

Bug fixes

  • Fix compatibility for scikit-learn 0.24.0 (#1820)
  • Remove unnecessary iterative tiling when predicting via XGBoost and data from/to parquet (#1821)
  • Resolve KeyError when calling delete_keys for ray backend (#1854)
  • Fix compatibility for pandas 1.2.0 (#1862)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.7.0a3

This is the release notes of v0.7.0a3. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements head() on groupby objects (#1849)
  • Learn
    • Implements mars.learn.preprocessing.{MinMaxScaler, minmax_scale} (#1844)

Enhancements

  • Improve Proxima recall_by_id computation method (#1805, thanks @rg070836rg!)
  • Revise to/from vineyard, for Tensor and DataFrame. (#1790)
  • Add memory estimation for read_parquet as well as read_csv (#1811)
  • Support using compound agg function in lambda (#1810)
  • Add incremental_index argument to reset_index which by default is False (#1823)
  • Refine kubedl cluster-api. (#1827, thanks @SimonCqk!)
  • Enhancements for Mars on kubedl (#1848)
  • Support to_pandas in a batch way for DataFrame and Series (#1853)
  • Support specifying memory limit scale in kubernetes (#1856)
  • Set marsjob worker cache by memoryTuningPolicy. (#1857, thanks @SimonCqk!)

Bug fixes

  • Fix compatibility for sklearn 0.24.0 (#1817)
  • Remove unnecessary iterative tiling when predicting via XGBoost and data from/to parquet (#1818)
  • Resolve KeyError when calling delete_keys for ray backend (#1846)
  • Fix compatibility for pandas 1.2.0 (#1847)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.1

This is the release notes of v0.6.1. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Support missing argument for tensor.tosparse() and fill_value argument for sparse_tensor.todense() (#1802)
  • DataFrame
    • Implements {DataFrame,Series}.replace (#1765)
    • Add {DataFrame, Series}.cartesian_chunk support (#1777)
    • Integrate str.cat into reduction and groupby-aggregation (#1781)
    • Implements reduction with level argument (#1784)

Bug fixes

  • Spawn serialization of executable graphs (#1770)
  • Fix getitem on DataFrames with unknown index (#1778)
  • Fix reading partitioned parquet files in HDFS (#1783)
  • Fix creating Mars Series from empty pandas Series (#1788)
  • Fix bug where an explicit execute may be required for to_parquet and XGB predict (#1800)
  • Support md.concat on DataFrame and Series (#1801)
  • Fix TypeError when timeout argument is absent when starting Mars cluster in YARN (#1804, thanks @smartguo!)

Documentation

  • Fill docs for apply and transform (#1767)

Tests

  • Create different test workflows & fix accessor docs (#1804)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.7.0a2

This is the release notes of v0.7.0a2. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Support missing argument for tensor.tosparse() and fill_value argument for sparse_tensor.todense() (#1797)
  • DataFrame
    • Implements {DataFrame,Series}.replace (#1762)
    • Add {DataFrame, Series}.cartesian_chunk support (#1774)
    • Integrate str.cat into reduction and groupby-aggregation (#1776)
    • Implements reduction with level argument (#1779)

Bug fixes

  • Spawn serialization of executable graphs (#1769)
  • Fix getitem on DataFrames with unknown index (#1772)
  • Fix reading partitioned parquet files in HDFS (#1782)
  • Fix creating Mars Series from empty pandas Series (#1787)
  • Support md.concat on DataFrame and Series (#1798)
  • Fix bug where an explicit execute may be required for to_parquet and XGB predict (#1794)
  • Fix TypeError when timeout argument is absent when starting Mars cluster in YARN (#1803, thanks @smartguo!)

Documentation

  • Fill docs for apply and transform (#1764)

Tests

  • Create different test workflows & fix accessor docs (#1799)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.7.0a1

This is the release notes of v0.7.0a1. See here for the complete list of solved issues and merged PRs.

Changes that break compatibility

  • Aggregations and Groupby with aggregations have been rewritten in v0.6.0. Older clients may raise errors when connecting to a cluster with the new version installed.

Highlights

  • Statsmodels as well as joblib are preliminarily supported.

New Features

  • DataFrame
    • Support num_partitions argument for DataFrame initializers (#1729)
    • Add support for named aggregations (#1747)
  • Tensor
    • Add rebalance method for tensors (#1731)
  • Learn
    • Add preliminary statsmodels support (#1735)
    • Add preliminary joblib support (#1757)

Bug fixes

  • Fix md.read_csv when names and usecols specified (#1737)
  • Make PSRS chunks more balanced (#1742)
  • Support string dtype for tensor reductions (#1745)
  • Fix xgboost and lightgbm on DataFrames (#1750)
  • Fix repeated execution of same code in distributed mode (#1749)
  • Support setting scalar which is a tensor for DataFrame (#1755)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.0

This is the release notes of v0.6.0. See here for the complete list of solved issues and merged PRs.

This release note only covers the difference from v0.6.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:

alpha1 alpha2 alpha3 beta1 beta2 rc1

Changes that break compatibility

  • Aggregations and Groupby with aggregations have been rewritten in v0.6.0. Older clients may raise errors when connecting to a cluster with the new version installed.

New Features

  • DataFrame
    • Support num_partitions argument for DataFrame initializers (#1733)
    • Add support for named aggregations (#1748)

Enhancements

  • Unify groupby.agg() using ReductionCompiler (#1739)

Bug fixes

  • Fix md.read_csv when names and usecols specified (#1738)
  • Support string dtype for tensor reductions & balance PSRS chunks (#1746)
  • Fix XGBoost and LightGBM on DataFrames (#1751)
  • Fix repeated execution of same code in distributed mode (#1753)
  • Support setting scalar which is a tensor for DataFrame (#1758)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.0rc1

This is the release notes of v0.6.0rc1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements {DataFrame,Series}.explode (#1714)
  • Learn
    • Support predicting on local LGBM models (#1716)

Enhancements

  • Add configuration page on Mars Web (#1697)
  • Add shared limit option for Mars worker (#1702)
  • Remount the shm directory in entrypoint.sh (#1700)
  • Add pure-dependent option for operands (#1706)
  • Remove prepare_inputs property on operands (#1709)
  • Use ReductionCompiler to support function aggregation in mars.dataframe.reduction (#1705)
  • Write into and read from merged files when data sizes are small (#1708)
  • Refactor builder and searcher of Proxima (#1710, thanks @rg070836rg!)

Bug fixes

  • Fix mars not working on ray cluster (#1712, thanks @fyrestone!)
  • Fix inferring dtype for series.map (#1722)
  • Fix sort functions of DataFrames on CUDA (#1723)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.5.5

This is the release notes of v0.5.5. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements {DataFrame,Series}.explode (#1715)

Enhancements

  • Add configuration page on Mars Web (#1701)
  • Remount /dev/shm directory in entrypoint.sh in Kubernetes and limit plasma size to avoid SIGBUS (#1703)
  • Add pure-dependent option for operands (#1707)

Bug fixes

  • Fix the KeyError in estimate_fuse_size (#1699)
  • Fix inferring dtype for series.map (#1724)
  • Fix sort functions in CUDA (#1725)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.5.4

This is the release notes of v0.5.4. See here for the complete list of solved issues and merged PRs.

Enhancements

  • Support inplace parameter in reset_index method (#1663)
  • Add a threshold for DataFrame.head optimization (#1679)

Bug fixes

  • Check unknown shape chunks in tile of md.concat (#1656)
  • Fix hang when rerunning DataFrame.groupby in distributed mode (#1669)
  • Create Fetch operands given output types (#1668)
  • Modify df.copy() so that it generates the identical key (#1678)
  • Fix IndexError when binary op on Series whose type is datetime (#1680)
  • Mount /dev/shm on host to pods when starting Mars workers in Kubernetes (#1681)
  • Fix DataFrame reduction so that output types are consistent between the map and combine phases (#1686)
  • Fix wrong dtypes of DataFrame setitem chunks (#1691)
  • Fix assigning operands with expected workers (#1693)
  • Add timeout for SharedHolderActor creation (#1692)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.0b2

This is the release notes of v0.6.0b2. See here for the complete list of solved issues and merged PRs.

New Features

  • Learn
    • Support recall computation for proxima (#1657, thanks @rg070836rg!)

Enhancements

  • Support inplace parameter in reset_index method (#1662)
  • Add a threshold for DataFrame.head optimization (#1673)

Bug fixes

  • Check unknown shape chunks in tile of md.concat (#1655)
  • Create Fetch operands given output types (#1666)
  • Fix hang when rerunning DataFrame.groupby in distributed mode (#1667)
  • Modify df.copy() so that it generates the identical key (#1671)
  • Fix IndexError when binary op on Series whose type is datetime (#1675)
  • Mount /dev/shm on host to pods when starting Mars workers in Kubernetes (#1677)
  • Fix Series reduction so that output types are consistent between the map and combine phases (#1685)
  • Fix wrong dtypes of DataFrame setitem chunks (#1690)
  • Add timeout for SharedHolderActor creation (#1684)
  • Fix assigning operands with expected workers (#1689)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.5.3

This is the release notes of v0.5.3. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add DataFrame.to_parquet support (#1653)

Enhancements

  • Optimize memory usage for brute-force algorithm in NearestNeighbors (#1648)

Bug fixes

  • Fix the wrong dtypes of DataFrameSetitem's inputs (#1627)
  • Fix issue that output_type does not take effect for df.apply (#1628)
  • Fix registration for DataFrameSetLabel operand (#1633)
  • Eliminate TimeoutError when there are running nodes (#1639)
  • Fix issue that serialization of transpose failed when input has unknown shape (#1638)
  • Fix PSRS error when chunks have fewer rows than the number of partitions (#1644)
  • Fix md.concat which may occupy a huge amount of memory on the client when all DataFrames have a large RangeIndex (#1651)
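
For readers unfamiliar with PSRS (parallel sorting by regular sampling), the sketch below shows the general algorithm on plain Python lists, including the edge case named in the fix above, where a chunk holds fewer rows than there are partitions. It is an illustrative sketch only, not the Mars implementation; `psrs_sort` is a hypothetical name, and letting a short chunk contribute only as many samples as it has rows is an assumption made here.

```python
import bisect
from heapq import merge


def psrs_sort(chunks):
    """Sort data spread over p chunks into p globally ordered partitions."""
    p = len(chunks)
    # Step 1: every chunk is sorted locally.
    sorted_chunks = [sorted(c) for c in chunks]
    # Step 2: take regular samples, at most p per chunk (fewer for short chunks).
    samples = []
    for c in sorted_chunks:
        k = min(p, len(c))
        samples.extend(c[(i * len(c)) // k] for i in range(k))
    samples.sort()
    # Step 3: choose p - 1 pivots at regular intervals of the sorted samples.
    pivots = [samples[(i * len(samples)) // p] for i in range(1, p)] if samples else []
    # Step 4: cut every sorted chunk at the pivots and route piece i to partition i.
    parts = [[] for _ in range(p)]
    for c in sorted_chunks:
        bounds = [0] + [bisect.bisect_right(c, pv) for pv in pivots] + [len(c)]
        for i in range(len(bounds) - 1):
            parts[i].append(c[bounds[i]:bounds[i + 1]])
    # Step 5: each partition merges the already sorted pieces it received.
    return [list(merge(*pieces)) for pieces in parts]


# The second chunk has a single row, fewer than the three partitions.
partitions = psrs_sort([[9, 3, 7], [1], [8, 2, 6, 4, 5, 0]])
flat = [x for part in partitions for x in part]
```

Concatenating the partitions in order yields the fully sorted data, and no chunk needs to see more than the shared pivots.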

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.0b1

This is the release notes of v0.6.0b1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add DataFrame.to_parquet support (#1652)

Enhancements

  • Optimize memory usage for brute-force algorithm in NearestNeighbors (#1640)
  • Structural adjustment for proxima (#1624, thanks @rg070836rg!)

Bug fixes

  • Fix the wrong dtypes of DataFrameSetitem's inputs (#1623)
  • Fix issue that output_type does not take effect for df.apply (#1626)
  • Fix registration for DataFrameSetLabel operand (#1631)
  • Fix issue that serialization of transpose failed when input has unknown shape (#1632)
  • Eliminate TimeoutError when there are running nodes (#1637)
  • Fix PSRS error when chunks have fewer rows than the number of partitions (#1642)
  • Add flush method to _LogWrapper (#1646)
  • Fix md.concat which may occupy a huge amount of memory on the client when all DataFrames have a large RangeIndex (#1649)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.6.0a3

This is the release notes of v0.6.0a3. See here for the complete list of solved issues and merged PRs.

Highlights

  • A brand-new API, fetch_log, lets users in a distributed environment fetch logs emitted inside custom functions from the client side without extra effort. For more details, refer to #1564.

New Features

  • DataFrame
    • Implements df.rebalance() (#1572)
    • Add support for {DataFrame,Series}.{where,mask} (#1577)
    • Add read_parquet support (#1576)
    • Add DataFrame.isin support (#1584)
    • Implements DataFrame.stack (#1591)
    • Implements {DataFrame,Series,GroupBy}.{all,any} (#1600)
    • Add support for pearson coefficients (corr, corrwith and autocorr) (#1587)
  • Learn
    • Integrate with pyproxima2 (#1618, thanks @rg070836rg!)
  • Deployment
    • Support rescaling worker numbers in Kubernetes (#1571)
  • Others
    • Implements fetch_log API (#1574)

Bug fixes

  • Fix the failure when fetching the result of Series.sum (#1583)
  • Fix the failure of DataFrame reduction operators (#1589)
  • Fix error on fitting LGBMModel twice (#1598)
  • Fix train_test_split when some input is Series (#1610)
  • Fix build_faiss_index when some index type cannot be merged (#1609)
  • Allow LightGBM wrapper to use numpy arrays (#1607)
  • Add an extra sort key in PSRS to make distinct pivot (#1612)
  • Fix md.read_csv when dtypes are not inferred correctly (#1606)
  • Fix Ray 1.0 compatibility (#1620)

Documentation

  • Add docs about reading data from HDFS (#1619)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.5.2

This is the release notes of v0.5.2. See here for the complete list of solved issues and merged PRs.

Highlights

  • A brand-new API, fetch_log, lets users in a distributed environment fetch logs emitted inside custom functions from the client side without extra effort. For more details, refer to #1564.

New Features

  • DataFrame
    • Implements df.rebalance() (#1573)
    • Add support for {DataFrame,Series}.{where,mask} (#1579)
    • Add read_parquet support (#1581)
    • Add DataFrame.isin support (#1592)
    • Implements DataFrame.stack (#1594)
    • Add support for {DataFrame,Series,GroupBy}.{all,any} (#1601)
    • Add support for pearson coefficients (corr, corrwith and autocorr) (#1616)
  • Others
    • Implements fetch_log API (#1582)

Bug fixes

  • Fix the failure when fetching the result of Series.sum (#1585)
  • Fix failures of DataFrame reduction operators (#1595)
  • Fix error on fitting LGBMModel twice (#1599)
  • Add extra sort key in PSRS to make distinct pivot (#1613)
  • Fix build_faiss_index when some index type cannot be merged (#1614)
  • Fix train_test_split when some input is Series (#1615)
  • Fix md.read_csv when dtypes are not inferred correctly (#1617)

- Python
Published by qinxuye over 5 years ago

https://github.com/mars-project/mars - v0.5.1

This is the release notes of v0.5.1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Allow submitting Mars jobs in custom functions (#1560)
    • Add df.select_dtypes support (#1568, thanks @lipengsh!)
    • Add df.map_chunk support (#1570)

Enhancements

  • Add configurable label options in Kubernetes cluster (#1550)

Bug fixes

  • Use relative paths to avoid web rendering issues under reverse proxies (#1545)
  • Allow returning None when using groupby.apply (#1548)
  • Fix bug where a numpy array cannot be passed to mt.swapaxes (#1561, thanks @YoshieraHuang!)
  • Fix pandas 1.1.2 compatibility (#1563)
  • Fix compatibility for tsfresh 0.17.0 (#1567)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.6.0a2

This is the release notes of v0.6.0a2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Allow submitting Mars jobs in custom functions (#1559)
    • Add df.select_dtypes support (#1565, thanks @lipengsh!)
    • Add df.map_chunk support (#1569)
  • Others
    • Support running Mars under KubeDL (#1549)

Enhancements

  • Add configurable label options in Kubernetes cluster (#1547)

Bug fixes

  • Fix md.read_csv for Ray executor (#1541)
  • Allow returning None when using groupby.apply (#1544)
  • Use relative paths to avoid web rendering issues under reverse proxies (#1540)
  • Fix bug where a numpy array cannot be passed to mt.swapaxes (#1553, thanks @YoshieraHuang!)
  • Fix pandas 1.1.2 compatibility (#1562)
  • Fix compatibility for tsfresh 0.17.0 (#1566)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.5.0

This is the release notes of v0.5.0. See here for the complete list of solved issues and merged PRs.

This release note only covers the difference from v0.5.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:

New Features

  • DataFrame
    • Support use_arrow_dtype for md.read_csv and md.read_sql (#1495)
  • Others
    • Store and query graph information in batch (#1504)

Enhancements

  • Fix compatibility for gevent>=20.5.1 (#1493)
  • Add an option to control writing shuffle data into disk (#1516)
  • Unify logic of modes including eager, kernel and build (#1530)
  • Optimize mars.learn.cluster.KMeans when n_clusters is relatively large (#1536)

Bug fixes

  • Update mt.split to support list and tuple (#1509, thanks @YoshieraHuang!)
  • Fix pandas 1.1 compatibility (#1515)
  • Fix mt.isclose when some of the arguments are scalars (#1518)
  • Fix mt.linalg.norm when axis is negative (#1519, thanks @YoshieraHuang!)
  • Fix arctan2 when arguments contain a scalar (#1520)
  • Unregister scheduler observer when destroying actors (#1526)
  • Fix creating Mars DataFrame from an empty pandas DataFrame (#1531)
  • Support df.groupby().count() for arrow dtype with and without pyarrow installed (#1532)
  • Fix DataFrame reduction on GPU (#1535)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.4.7

This is the release notes of v0.4.7. See here for the complete list of solved issues and merged PRs.

New Features

  • Store and query graph info in batch (#1503)

Enhancements

  • Fix compatibility for gevent>=20.5.1 (#1494)
  • Add an option to control writing shuffle data into disk (#1527)

Bug fixes

  • Fix arrow_array_to_objects when input is a Series whose index is not RangeIndex(n) (#1496)

Tests

  • Fixed statsmodel version (#1537)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.6.0a1

This is the release notes of v0.6.0a1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Support use_arrow_dtype for md.read_csv and md.read_sql (#1491)
  • Others
    • Store and query graph information in batch (#1501)
    • Integrate with Ray (#1508)

Enhancements

  • Fix compatibility for gevent>=20.5.1 (#1490)
  • Add an option to control writing shuffle data into disk (#1513)
  • Unify logic of modes including eager, kernel and build (#1528)
  • Optimize mars.learn.cluster.KMeans when n_clusters is relatively large (#1511)

Bug fixes

  • Fix mt.linalg.norm when axis is negative (#1499, thanks @YoshieraHuang!)
  • Fix mt.isclose when some of the arguments are scalars (#1498)
  • Fix mt.arctan2 when arguments contain a scalar (#1502)
  • Update mt.split to support list and tuple (#1507, thanks @YoshieraHuang!)
  • Fix pandas 1.1 compatibility (#1437)
  • Fix creating Mars DataFrame from an empty pandas DataFrame (#1522)
  • Unregister scheduler observer when destroying actors (#1525)
  • Support df.groupby().count() for arrow dtype with and without pyarrow installed (#1523)
  • Fix DataFrame reduction on GPU (#1534)

Documentation

  • Fix URL of contribution guide (#1505, thanks @StevenJokes!)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.4.6

This is the release notes of v0.4.6. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements md.qcut (#1473)
    • Implements {DataFrame, Series}.reindex (#1483)
    • Add support for ArrowListDtype as well as ArrowListArray (#1487)

Enhancements

  • Serialize results in worker before storing into shared storages (#1474)
  • Raise a timeout error when assignment keeps failing for a long time (#1477)
  • Fix pickling arrow types & allow specifying parallel number in IO runners (#1482)

Bug fixes

  • Support ExtensionDtype in df.astype and complex serialization (#1464)
  • Fix incorrect index_value in df.drop() (#1488)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.5.0rc1

This is the release notes of v0.5.0rc1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implements md.qcut (#1468)
    • Implements {DataFrame, Series}.reindex (#1481)
    • Add support for ArrowListDtype as well as ArrowListArray (#1486)

Enhancements

  • Serialize results in worker before storing into shared storages (#1470)
  • Raise timeout when assigning failed for a long time (#1475)
  • Use f-string to replace most of string formattings (#1484)

Bug fixes

  • Fix reference cycle in promise.all_ (#1452)
  • Support ArrowStringDtype for DataFrame.sort_values() (#1455)
  • Support serializing complex scalars (#1459)
  • Support ExtensionDtype in df.astype (#1462)
  • Fix incorrect index_value in df.drop() (#1466)
  • Fix pickling arrow types & allow specifying parallel number in IO runners (#1480)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.4.5

This is the release notes of v0.4.5. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add support for arrow-based string dtype (#1440)
    • Add support for memory usage (#1447)

Bug fixes

  • Fix failure when serializing LearnShuffle operand (#1449)
  • Fix reference cycle in promise.all_ (#1456)
  • Fix kmeans hang for local cluster (#1446)
  • Support ArrowStringDtype for DataFrame.sort_values() (#1457)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.5.0b3

This is the release notes of v0.5.0b3. See here for the complete list of solved issues and merged PRs.

Announcements

  • From v0.5.0b3 on, the v0.5.x series no longer supports Python 3.5. Python 3.5 users should use the 0.4.x series.

New Features

  • DataFrame:
    • Add support for arrow-based string dtype (#1438)
    • Support memory_usage on DataFrame objects (#1217)

Bug fixes

  • Fix crash when storing data inside Docker containers (#1429)
  • Fix kmeans hang for local cluster (#1445)
  • Fix failure when serializing LearnShuffle operand (#1442)

Installation

  • Drop support for python 3.5 (#1435)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.4.4

This is the release notes of v0.4.4. See here for the complete list of solved issues and merged PRs.

New Features

  • Learn
    • Add mars.learn.cluster.KMeans support (#1428)

Enhancements

  • Optimize to_pandas, to_numpy, etc. to try fetching first and, if that fails, call execute().fetch() instead (#1410)
  • Create backup CalcActor when spawning a new graph in mars.remote (#1412)
  • Skip rechunk when DataFrame has unknown shape in sort_values (#1420)

Bug fixes

  • Fix worker assign when no evaluation sets specified in LGBM training (#1408)
  • Fix query alias & add estimation for object types (#1417)
  • Fix the dtype of LightGBM model's predicted results (#1421)
  • Fix the error raised when inferring dtype in DataFrame.transform (#1427)
  • Fix crash when storing data inside Docker containers (#1432)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.5.0b2

This is the release notes of v0.5.0b2. See here for the complete list of solved issues and merged PRs.

New Features

  • Learn
    • Add mars.learn.cluster.KMeans support (#1426)

Enhancements

  • Optimize to_pandas, to_numpy, etc. to try fetching first and, if that fails, call execute().fetch() instead (#1409)
  • Create backup CalcActor when spawning a new graph in mars.remote (#1411)
  • Skip rechunk when DataFrame has unknown shape in sort_values. (#1414)

Bug fixes

  • Fix worker assign when no evaluation sets specified in LGBM training (#1405)
  • Fix query alias & add estimation for object types (#1416)
  • Fix the dtype of LightGBM model's predicted results. (#1419)
  • Fix the error raised when inferring dtype in DataFrame.transform (#1424)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.4.3

This is the release notes of v0.4.3. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implements mars.tensor.stats.entropy (#1378)
  • DataFrame
    • Implements {DataFrame,Series,Index}.rename (#1366)
    • Implement DataFrame.insert (#1392)
  • Learn
    • Implements mars.learn.model_selection.train_test_split (#1355)
  • Remote
    • Add run_script support (#1397)

Enhancements

  • Optimize DataFrame.{head, tail} when DataFrame has unknown chunk shape (#1360)
  • Make creation of Kubernetes clusters modular (#1373)
  • Optimize read_sql + head (#1379)
  • Optimize read_csv if followed by DataFrame.getitem (#1398)
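
The read_csv enhancement above (#1398) is a form of column pruning: when only some columns are selected afterwards, the reader does not need to keep the rest. The following is a minimal standard-library sketch of that idea, not Mars's implementation. The helper `read_columns` and the sample data are hypothetical.

```python
import csv
import io


def read_columns(text, wanted):
    """Parse CSV text but keep only the columns named in `wanted`.

    Hypothetical helper: illustrates column pruning, not Mars internals.
    """
    reader = csv.DictReader(io.StringIO(text))
    # Build each output row from the wanted columns only, so unused
    # columns never reach the result.
    return [{name: row[name] for name in wanted} for row in reader]


SAMPLE = "a,b,c\n1,2,3\n4,5,6\n"
pruned = read_columns(SAMPLE, ["a", "c"])
print(pruned)
```

In a distributed setting the saving would come from every chunk skipping the unused columns, which this single-process sketch does not attempt to model.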

Bug fixes

  • Remove reliance on WHERE 1=0 in read_sql (#1353)
  • Fix hang for distributed roc_curve (#1367, #1387)
  • Fix read_sql when no data selected & refine error when no worker attached (#1374)
  • Fix progress display for bokeh 2.1.x (#1383)
  • Fix serialization failure when FetchDataFrame's object_type is a list (#1386)
  • Make local filesystem work when PyArrow not installed (#1391)
  • Fix serialization issue when remote function has executed tileable arguments (#1400)
  • Fix LightGBM when input tileables have unknown shape (#1399)
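
For context on the read_sql fix in this list (#1353): `WHERE 1=0` is a common trick for obtaining a query's column metadata without fetching any rows. The sketch below shows the trick with the standard-library sqlite3 module. The table and column names are made up, and these notes do not say what Mars uses in its place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x INTEGER, y REAL)")
conn.execute("INSERT INTO points VALUES (1, 2.5)")

# The predicate is always false, so no rows come back, but the cursor
# still describes the result columns.
cursor = conn.execute("SELECT * FROM points WHERE 1=0")
rows = cursor.fetchall()
columns = [entry[0] for entry in cursor.description]
print(rows, columns)
conn.close()
```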

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.5.0b1

This is the release notes of v0.5.0b1. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implements mars.tensor.stats.entropy (#1376)
  • DataFrame
    • Implements DataFrame.rename (#1359)
    • Implements {Series,Index}.rename (#1361)
    • Implement DataFrame.insert (#1389)
  • Remote:
    • Add run_script support (#1299)

Enhancements

  • Set output type when calling new_xxx methods on DataFrames (#1212)
  • Optimize DataFrame.{head, tail} when DataFrame has unknown chunk shape (#1328)
  • Make creation of Kubernetes clusters modular (#1369)
  • Optimize read_sql + head (#1377)
  • Use subgraph to represent fused nodes instead of a list (#1388)
  • Optimize read_csv if followed by DataFrame.getitem (#1390)

Bug fixes

  • Fix hang for distributed roc_curve (#1362, #1380)
  • Fix read_sql when no data selected & refine error when no worker attached (#1371)
  • Fix progress display for bokeh 2.1.x (#1382)
  • Fix serialization issue when remote function has argument which is an executed tileable (#1394)
  • Fix LightGBM when input tileables have unknown shape (#1396)

Installation

  • Specifying encoding for long_description (#1402)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.5.0a3

This is the release notes of v0.5.0a3. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add support for {DataFrame,Series,Index}.drop (#1263)
    • Add {DataFrame,Series}.to_sql() and Series.to_csv() (#1264)
    • Implements {DataFrame,Series,Index}.drop_duplicates (#1285)
    • Implements DataFrame.melt (#1284)
    • Implements md.read_sql_query (#1297)
    • Implements {Series,Index}.to_frame() and Index.to_series() (#1317)
    • Support setting columns for DataFrame (#1326)
  • Learn
    • Add MarsDistributor for tsfresh library (#1277)
    • Implements mars.learn.model_selection.train_test_split (#1352)
  • Remote
    • Support tileables as arguments for spawned functions (#1296)

Enhancements

  • Allow client-side to use pickle to serialize / deserialize tensor data (#1289)
  • Support create session from environment variables (#1265)

Bug fixes

  • Fix NearestNeighbors failing to run in cluster mode (#1262)
  • Fix graph hang on tile failure and execution failure (#1272)
  • Fix failure when executing None-result spawn functions (#1276)
  • Fix shape calculation in TensorIndex for tensor.__setitem__ (#1283)
  • Support fuse for Mars Remote (#1287)
  • Fix mt.linalg.norm when chunk shape on axis > 1 (#1302)
  • Fix error in calc_data_size() for GroupByWrapper (#1307)
  • Trigger execution in check_consistent_length when arrays have unknown shape (#1321)
  • Fix wrong columns value in reset_index (#1320)
  • Fix build_df when input DataFrame has duplicate columns (#1319)
  • Remove reliance on WHERE 1=0 in read_sql (#1335)
  • Make local filesystem work when PyArrow not installed (#1356)

Documentation

  • Add docs for remote API, getting started as well as GPU integration (#1266)
  • Use pydata-sphinx-theme for documentation (#1304)

Others

  • Use latest pandas wheel for Python 3.8 (#1333)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.4.2

This is the release notes of v0.4.2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add support for {DataFrame,Series,Index}.drop (#1268)
    • Add {DataFrame,Series}.to_sql() and Series.to_csv() (#1267)
    • Implements {DataFrame,Series,Index}.drop_duplicates (#1292)
    • Implement DataFrame.melt (#1295)
    • Implements md.read_sql_query (#1300)
    • Implements {Series,Index}.to_frame() and Index.to_series() (#1323)
    • Support setting columns for DataFrame (#1327)
  • Learn
    • Add MarsDistributor for tsfresh library (#1281)
  • Remote
    • Support tileables as arguments for spawned functions (#1298)

Enhancements

  • Allow client-side to use pickle to serialize / deserialize tensor data (#1291)
  • Support create session from environment variables (#1322)

Bug fixes

  • Fix NearestNeighbors failing to run in cluster mode (#1273)
  • Fix graph hang on tile failure and execution failure (#1275)
  • Fix failure for None-result spawn functions (#1280)
  • Fix shape calculation in TensorIndex for tensor.__setitem__ (#1293)
  • Support fuse for Mars Remote (#1294)
  • Fix mt.linalg.norm when chunk shape on axis > 1 (#1303)
  • Trigger execution in check_consistent_length when arrays have unknown shape (#1325)
  • Fix build_df when input DataFrame has duplicate columns (#1324)
  • Fix error in calc_data_size() for GroupByWrapper (#1329)
  • Fix wrong columns value in reset_index (#1330)

Documentation

  • Add docs for remote API, getting started as well as GPU integration (#1274)

Others

  • Use latest pandas wheel for Python 3.8 (#1332)

- Python
Published by qinxuye almost 6 years ago

https://github.com/mars-project/mars - v0.4.1

This is the release notes of v0.4.1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add size function for dataframes and groupbys (#1253)
    • Implements DataFrame.{iterrows, itertuples} (#1258)
  • Learn
    • Add support for LightGBM in Mars (#1254)
  • Remote
    • Support running tileables inside functions spawned via mr.spawn (#1257)

Bug fixes

  • Fix .fetch() possibly causing some ops to be executed again (#1255)
  • Fix df.describe() that failed when df has unknown shape and chunk size > 1 (#1256)

Tests

  • Add checks for data consistency in learn module (#1259)

- Python
Published by qinxuye about 6 years ago

https://github.com/mars-project/mars - v0.5.0a2

This is the release notes of v0.5.0a2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add size function for dataframes and groupbys (#1250)
    • Implements DataFrame.{iterrows, itertuples} (#1252)
  • Learn
    • Add support for LightGBM in Mars (#1244)
  • Remote
    • Support running tileables inside functions spawned via mr.spawn (#1248)

Bug fixes

  • Fix .fetch() possibly causing some ops to be executed again (#1243)
  • Fix df.describe() that failed when df has unknown shape and chunk size > 1 (#1249)

Tests

  • Add checks for data consistency in learn module (#1246)

- Python
Published by qinxuye about 6 years ago

https://github.com/mars-project/mars - v0.4.0

This is the release notes of v0.4.0. See here for the complete list of solved issues and merged PRs.

This release note only covers the difference from v0.4.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:

Changes that break compatibility

  • Calling .execute() no longer returns a numpy ndarray, pandas DataFrame and so forth; it returns the Mars tensor or DataFrame itself. Only corner data is fetched for display purposes. To convert explicitly, call .to_numpy() for a numpy ndarray or .to_pandas() for a pandas DataFrame. For more details, please refer to #1201.
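
The behavior change can be pictured with a toy model. The class below is a hypothetical stand-in written in plain Python, not the Mars API: `LazyVector` and `to_list` are invented names, with `to_list` playing the role that .to_numpy() and .to_pandas() play in Mars.

```python
class LazyVector:
    """Toy stand-in for a Mars tensor (hypothetical; not the Mars API)."""

    def __init__(self, thunk):
        self._thunk = thunk  # deferred computation
        self._data = None    # filled in by execute()

    def execute(self):
        # Run the computation, keep the result internally, and return
        # the lazy object itself rather than the raw data.
        self._data = self._thunk()
        return self

    def to_list(self):
        # Explicit conversion to a concrete local value.
        if self._data is None:
            self.execute()
        return list(self._data)

    def __repr__(self):
        # Only a small "corner" of the data is shown for display.
        if self._data is None:
            return "LazyVector(<not executed>)"
        head = ", ".join(str(v) for v in self._data[:3])
        return f"LazyVector([{head}, ...])"


v = LazyVector(lambda: [i * i for i in range(10)])
result = v.execute()
print(result is v)
print(result.to_list()[:4])
```

Code that previously used the return value of .execute() as concrete data would, under this model, need an explicit conversion call added.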

Highlights

  • Remote API is introduced and preliminarily supported in #1239. For more details, refer to proposal #1227.

New Features

  • Tensor
    • Implements mt.trapz (#1223)
  • DataFrame
    • Add support of {DataFrame,Series}.ewm (#1198)
    • Add dataframe.unique support (#1225)
    • Implements md.to_datetime, support __setitem__ for DataFrame as well (#1226)
    • Add support for Series.astype and DataFrame.astype (#1237)
  • Learn
    • Support Mars Series in PyTorch Dataset (#1194)
    • Implements mars.learn.metrics.{roc_curve, auc} (#1233)
  • Others
    • Add preliminary remote function support (#1239)

Enhancements

  • Tileable.execute() now will return Tileable itself, repr will act correctly (#1202)
  • Rename LocalClusterSession to ClusterSession (#1236)

Bug fixes

  • Fix serialization for mars.learn.utils.shuffle (#1193)
  • Fix error in starting local cluster with IPython & latest gevent version (#1234)
  • Fix wrong result of column pruning (#1235)

- Python
Published by qinxuye about 6 years ago

https://github.com/mars-project/mars - v0.5.0a1

This is the release notes of v0.5.0a1. See here for the complete list of solved issues and merged PRs.

Changes that break compatibility

  • Calling .execute() no longer returns a numpy ndarray, pandas DataFrame and so forth; it returns the Mars tensor or DataFrame itself. Only corner data is fetched for display purposes. To convert explicitly, call .to_numpy() for a numpy ndarray or .to_pandas() for a pandas DataFrame. For more details, please refer to #1201.

Highlights

  • Remote API is introduced and preliminarily supported in #1238. For more details, refer to proposal #1227.
  • Running on Yarn is preliminarily supported in #1210.

New Features

  • Tensor
    • Implements mt.trapz (#1205)
  • DataFrame
    • Add support of {DataFrame,Series}.ewm (#1164)
    • Add dataframe.unique support (#1208)
    • Implements md.to_datetime, support __setitem__ for DataFrame as well (#1207)
    • Add support for Series.astype and DataFrame.astype (#1224)
  • Learn
    • Support Mars Series in PyTorch Dataset (#1190)
    • Implements mars.learn.metrics.{roc_curve, auc} (#1220)
  • Others
    • Add preliminary support for Yarn (#1210)
    • Add preliminary remote function support (#1238)

Enhancements

  • Make Tileable.execute() return tileable itself, fetching corner data only for correct repr (#1201)
  • Allow some operands to fail fast (#1229)
  • Rename LocalClusterSession to ClusterSession (#1230)

Bug fixes

  • Fix serialization for mars.learn.utils.shuffle (#1192)
  • Fix wrong result of column pruning (#1215)
  • Fix error in starting local cluster with IPython (#1232)

Documentation

  • Add learn docs (#1182)
  • Add translation for learn docs (#1183)
  • Add documentations for DataFrame arithmetic operands (#1191)
  • Add logo in readme and docs (#1213)

Tests

  • Workaround for upgraded tiledb (#1195)

- Python
Published by qinxuye about 6 years ago

https://github.com/mars-project/mars - v0.4.0rc1

This is the release notes of v0.4.0rc1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add support for isna, notna and __dir__ (#1125)
    • Add support for md.dropna (#1129)
    • Support groupby.__getitem__ and group by level (#1136)
    • Implement DataFrame nunique (#1137)
    • Implements md.cut (#1139)
    • Add plot and related functions for DataFrame and Series (#1143)
    • Implements {DataFrame, Series}.{shift, tshift} (#1157)
    • Add support of md.expanding (#1160)
    • Implements {DataFrame,Series}.diff (#1174)
    • Support modulo operand for DataFrame (#1176)
    • Add Series.value_counts() support (#1181)
  • Tensor
    • Add support for mt.union1d (#1147)
    • Support Tensor.__setitem__ with bool indexing (#1159)
  • Learn
    • Add support for NearestNeighbors.kneighbors_graph (#1152)
    • Add support for mars.learn.metrics.accuracy_score (#1150)
    • Implements mars.learn.metrics.pairwise.rbf_kernel (#1158)
    • Implements mars.learn.semi_supervised.LabelPropagation (#1163)

Enhancements

  • Refactor GroupBy objects (#1127)

Bug fixes

  • Support md.merge when on column is in df.index (#1132)
  • Fix tokenizing partial function (#1149)
  • Allow retrieving shape of a groupby object (#1155)

Documentation

  • Add DataFrame docs (#1130)
  • Fix requirements for doc (#1135)
  • Fix rendering numpy-style documentations (#1179)
  • Fix some mistakes in the doc. (#1161, thanks @ueshin!)

Tests

  • Check if tileable.nsplits and chunk.shape are consistent (#1108)
  • Add meta checks for groupby (#1144)
  • Allow using pyarrow==0.17.0 (#1172)

- Python
Published by qinxuye about 6 years ago

https://github.com/mars-project/mars - v0.3.4

This is the release notes of v0.3.4. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add support for isna, notna and __dir__ (#1126)
    • Add support for {DataFrame,Series}.agg (#1128)
    • Add support for md.dropna (#1131)
    • Implements {DataFrame, Series}.{shift, tshift} (#1168)
    • Add plot and related functions for DataFrame and Series (#1166)
    • Implement DataFrame nunique (#1170)
    • Implements {DataFrame,Series}.diff (#1177)
    • Support modulo operand for DataFrame (#1180)
  • Tensor
    • Add support for mt.union1d (#1167)
    • Support Tensor.__setitem__ with bool indexing (#1169)

Bug fixes

  • Support md.merge when on column is in df.index (#1165)

Tests

  • Check if tileable.nsplits and chunk.shape are consistent (#1133)
  • Allow pyarrow to use 0.17.0 (#1173)

- Python
Published by qinxuye about 6 years ago

https://github.com/mars-project/mars - v0.3.3

New Features

  • Implements at and iat for DataFrame (#1105)
  • Implements Series.isin (for Series type). (#1106)

Enhancements

  • Optimize executor performance when running fewer ops than the degree of parallelism (#1099)

Bug fixes

  • Fix validate_axis when input tileable has unknown shape (#1092)
  • Support creating DataFrame from dict in which scalar exists (#1104)
  • Support slice that can be integer or other types on non-int64 index (#1109)

Tests

  • Check metadata consistency for output chunks and tileables (#1094)

- Python
Published by qinxuye about 6 years ago

https://github.com/mars-project/mars - v0.4.0b2

This is the release notes of v0.4.0b2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Support calling df.agg() with lists or dicts for transform (#1093)
    • Implements at and iat for DataFrame (#1101)
    • Implements Series.isin (for Series type). (#1058)

Enhancements

  • Optimize executor performance when running fewer ops than the degree of parallelism (#1096)

Bug fixes

  • Fix validate_axis when input tileable has unknown shape (#1091)
  • Support creating DataFrame from dict in which scalar exists (#1098)
  • Support slice that can be integer or other types on non-int64 index (#1103)

Tests

  • Check metadata consistency for output chunks and tileables (#1071)

- Python
Published by qinxuye about 6 years ago

https://github.com/mars-project/mars - v0.3.2

This is the release notes of v0.3.2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implement md.{cummax, cummin, cumprod, cumsum} (#1022)
    • Add support for md.fillna (#1031)
    • Add DataFrame.loc support (#1060)
    • Add DataFrame.rolling support (#1061)
    • Add support for GroupBy.{cumcount, cummin, cummax, cumprod, cumsum} (#1072)
    • Support string and datetime methods via Series.str and Series.dt accessor (#1074)
    • Implement dataframe append (#1075)
    • Implement DataFrame.concat and Series.concat (#1078)
    • Add support for DataFrame.sort_values (#1081)
    • Support sort_index for DataFrame and Series (#1082)
    • Add md.date_range support (#1086)
    • Logical operators on DataFrame and Series. (#1088)
    • Implements head/tail based on iloc, and fixes bug in getitem. (#1089)

Enhancements

  • Use mapjoin to optimize df.merge (#1023)
  • Refactor tiling of DataFrame.iloc with index_lib (#1043)
  • Add sort_range_index parameter in read_csv (#1067)
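
The "mapjoin" enhancement in this list (#1023) most likely refers to a map-side (broadcast) join: when one side of a merge is small, it can be handed to every chunk of the large side so that no shuffle is needed. The sketch below assumes that reading of the term. The helper `map_join` and the data are hypothetical and do not reflect Mars's implementation.

```python
def map_join(large_chunks, small_rows, key):
    """Inner-join each chunk of the large side against a small table.

    Hypothetical helper illustrating a map-side (broadcast) join.
    """
    # Build one lookup table from the small side; every chunk reuses it.
    lookup = {}
    for row in small_rows:
        lookup.setdefault(row[key], []).append(row)

    joined_chunks = []
    for chunk in large_chunks:
        # Each chunk is joined independently, with no data exchanged
        # between chunks.
        joined = [
            {**left, **right}
            for left in chunk
            for right in lookup.get(left[key], [])
        ]
        joined_chunks.append(joined)
    return joined_chunks


large = [
    [{"id": 1, "v": 10}, {"id": 2, "v": 20}],
    [{"id": 1, "v": 30}, {"id": 3, "v": 40}],
]
small = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
result = map_join(large, small, "id")
print(result)
```

The trade-off is that the small side is copied once per chunk, so the approach only pays off while that side stays small.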

Bug fixes

  • Standardize RangeIndex for unknown shape DataFrame (#1066)
  • Fix failed cases in distributed mode (#1079)
  • Fix wrong dtypes in df.rechunk (#1083)
  • Fix consistency between tensor metadata and real outputs (#1087)

Tests

  • Fix tests under Python 3.6 as VS2015 is preinstalled (#1015)

- Python
Published by qinxuye about 6 years ago

https://github.com/mars-project/mars - v0.4.0b1

This is the release notes of v0.4.0b1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Implement md.{cummax, cummin, cumprod, cumsum} (#1019)
    • Implement dataframe append (#1026)
    • Add support for md.fillna (#1029)
    • Implement DataFrame.concat and Series.concat (#1040)
    • Support groupby.agg with list of functions (#1030)
    • Implement md.{DataFrame,Series,GroupBy}.apply (#1038)
    • Add support for DataFrame.sort_values (#1046)
    • Add DataFrame.loc support (#1042)
    • Add DataFrame.rolling support (#1045)
    • Add support for {DataFrame,Series}.agg (#1054)
    • Support string and datetime methods via Series.str and Series.dt accessor (#1063)
    • Add support for GroupBy.{cumcount, cummin, cummax, cumprod, cumsum} (#1069)
    • Support sort_index for DataFrame and Series (#1053)
    • Add md.date_range support (#1073)
    • Logical operators on DataFrame and Series. (#1056)
    • Implements head/tail based on iloc, and fixes bug in getitem. (#1057)
  • Others
    • Add support for function serialization (#1048)

Enhancements

  • Use mapjoin to optimize df.merge (#1021)
  • Add sort_range_index parameter in read_csv (#1024)
  • Refactor tiling of DataFrame.iloc with index_lib (#1016)

Bug fixes

  • Fix KNN so that it can accept input with unknown shape (#1033)
  • Support serializing pd.Timestamp and pd.Timedelta (#1065)
  • Fix failed cases in distributed mode (#1062)
  • Fix wrong dtypes in df.rechunk (#1080)
  • Fix failed fit method selection for KNN when input has unknown shape (#1050)
  • Fix consistency between tensor metadata and real outputs (#1085)

Tests

  • Fix tests under Python 3.6 as VS2015 is preinstalled (#1014)

- Python
Published by qinxuye about 6 years ago

https://github.com/mars-project/mars - v0.3.1

This is the release notes of v0.3.1. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implements mt.{topk, argsort, argpartition, argtopk} (#991)
    • Implement imread to read from images (#997)
  • DataFrame
    • Support ufunc for Mars DataFrame (#967)
    • Implements DataFrame.to_csv (#992)
    • Implements DataFrame dot, mul and pow (#994)
    • Implement dataframe var and std (#996)
    • Implements describe for DataFrame (#998)

Enhancements

  • Refactor tensor indexing (#1012)

Bug fixes

  • Stop detecting GPU when no cuda devices are configured (#975)
  • Fix wrong behavior of choice (#993)
  • Make sure all kwargs are numpy types when inferring dtypes (#995)
  • Fix wrong result of count_nonzero (#1003)
  • Add dtype property for TensorImread (#1005)
  • Fix error when no device detected by CUDA driver (#1008)

Tests

  • Fix failures in Windows tests (#939)
  • Fix failed unittests due to release of pandas 1.0 (#965)

- Python
Published by qinxuye over 6 years ago

https://github.com/mars-project/mars - v0.4.0a2

This is the release notes of v0.4.0a2. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Add ability to read and write HDF5 file for tensor (#962)
    • Implements mt.{topk, argsort, argpartition, argtopk} (#946)
    • Support reading and writing in zarr format (#963)
    • Implement imread to read from images (#988)
  • DataFrame
    • Support ufunc for Mars DataFrame (#957)
    • Implements DataFrame.to_csv (#966)
    • Implement dataframe var and std (#977)
    • Implements Series.map (#979)
    • Implements DataFrame dot, mul and pow (#980)
    • Implements describe for DataFrame (#981)
    • Implements md.read_sql_table (#986)
  • Learn
    • Implement PyTorch sampler to improve dataset performance (#970)
    • Support mars.learn.neighbors.NearestNeighbors (#961)
    • Leverage faiss to accelerate k-nearest neighbors calculation (#984)
    • Implement pytorch sampler for local training (#1010)

Enhancements

  • Refactor tensor indexing (#1011)

Bug fixes

  • Fix tiling of nonzero to use the tensor rather than the tensor data during the process (#954)
  • Fix cdist(x, y) creating a tensor with wrong nsplits (#960)
  • Fix the wrong RangeIndex in read_csv (#930)
  • Stop detecting GPU when no cuda devices are configured (#973)
  • Fix wrong behavior of mt.random.choice (#976)
  • Make sure all kwargs are numpy types when inferring dtypes (#987)
  • Fix error when chunk_size not provided for md.read_sql_table (#990)
  • Fix wrong result of count_nonzero (#1002)
  • Add dtype property for TensorImread (#1004)
  • Fix error when no device detected by CUDA driver (#1007)

Tests

  • Fix failed unittests due to release of pandas 1.0 (#964)
  • Hotfix opcodes that conflict (#968)

- Python
Published by qinxuye over 6 years ago

https://github.com/mars-project/mars - v0.4.0a1

This is the release notes of v0.4.0a1. See here for the complete list of solved issues and merged PRs.

Announcements

Due to the end-of-life (EOL) of Python 2 on January 1, 2020, the v0.4.x series no longer supports Python 2 from v0.4.0a1 on. Python 2.7 users should use the 0.3.x series.

Changes that break compatibility

  • Operand now supports stages (#934). Reduction operands, as well as operands whose tiled chunks contain map or reduce phases, cannot be serialized between this and former versions.

New Features

  • Tensor
    • Implements mt.histogram and mt.histogram_bin_edges (#876)
    • Add mt.partition support (#889)
    • Implements mt.{percentile, quantile, median} (#898)
    • Support Einstein summation convention (#888)
    • Add mt.fill_diagonal support (#918)
    • Support mars.tensor.spatial.distance.{pdist, cdist, squareform} (#894)
  • DataFrame
    • Support creating DataFrame from dict whose values are tensors (#903)
    • Support DataFrame and Series count (#900)
    • Implement mean operator for DataFrame and Series (#907)
    • Implements DataFrame.quantile and Series.quantile (#911)
    • Add comparison functions for DataFrame (#921)
    • Support df.reset_index and series.reset_index (#915)
  • Learn
    • Add pairwise distances support for learn (#926)
    • Implement MarsDataset to integrate with PyTorch (#937)
  • Others
    • Add function objects implementation for tokenizer (#893)

Enhancements

  • Use default args for super() (#878)
  • Skip preparing specified chunks when preparing for execution (#891)
  • Accelerate LU when input has one chunk (#905)
  • Add support for AnyReference in serialization (#874)
  • Merge operands representing multiple stages of one single operand (#934)

Tests

  • Add TestExecutor that serializes and deserializes the graph on every execution to ensure all operands work well with serialization (#880)
  • Fix possible failure of testIterativeTilingWithoutEtcd for Python 3.5 in CI (#896)
  • Switch coverage service to codecov (#909)
  • Remove *_pb2.py to reduce chances of code conflict (#913)
  • Fix failures in Windows tests (#938)

Others

  • Drop support for Python 2 (#872)
  • Further remove py27-related imports (#875)

- Python
Published by qinxuye over 6 years ago

https://github.com/mars-project/mars - v0.3.0

This is the release notes of v0.3.0. See here for the complete list of solved issues and merged PRs.

This release note only covers the difference from v0.3.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:

alpha1 alpha2 beta1 beta2 rc1

Announcements

From v0.3.0 on, v0.3.x is the last series that supports Python 2; support ends with the release of v0.4.0.

Changes that break compatibility

  • Operand now supports stages (#935). Reduction operands, as well as operands whose tiled chunks contain map or reduce phases, cannot be serialized between this and former versions.

New Features

  • Tensor
    • Implements mt.histogram and mt.histogram_bin_edges (#914)
    • Add mt.partition support (#916)
    • Implements mt.{percentile, quantile, median} (#919)
    • Support Einstein summation convention (#925)
    • Add mt.fill_diagonal support (#931)
  • DataFrame
    • Support creating DataFrame from dict whose values are tensors (#922)
    • Implements DataFrame.quantile and Series.quantile (#924)
    • Support DataFrame and Series count (#923)
    • Implement mean operator for DataFrame and Series (#927)
    • Add comparison operands for DataFrame (#929)
    • Support df.reset_index and series.reset_index (#933)

Enhancements

  • Add public base class for entity data (#879)
  • Merge operands representing multiple stages of one single operand (#935)

Bug fixes

  • Fix sparse behavior for tensor.min and tensor.max (#936)

Tests

  • Add TestExecutor that serializes and deserializes the graph on every execution to ensure all operands work well with serialization (#881)
  • Fix possible failure of testIterativeTilingWithoutEtcd for Python 3 in CI (#906)
  • Remove *_pb2.py to reduce chances of code conflict (#920)

- Python
Published by qinxuye over 6 years ago

https://github.com/mars-project/mars - v0.3.0rc1

This is the release notes of v0.3.0rc1. See here for the complete list of solved issues and merged PRs.

Highlights

  • Mars now can handle more cases that failed due to tensors with unknown chunk shapes via iterative tiling support introduced in #834.
  • Python 3.8 wheels are supported in this release.

New Features

  • Support iterative tiling (#834)
  • Add experimental column pruning rules for tileable graph optimization (#865)
  • Tensor
    • Add mt.sort support (#827)
  • DataFrame
    • Support DataFrame rechunk (#839)
    • Support Series's setitem and getitem by iloc operation (#843)
    • Add tree reduction method for DataFrame groupby aggregations (#850)
  • Learn
    • Add mars.learn.datasets.samples_generator.make_blobs and update README (#845)
    • Support running PyTorch in Mars cluster via run_pytorch_script (#861)

Enhancements

  • Add ReceiverStatusActor to help listening at receiver end (#833)
  • Assign enqueued operands immediately when no descendants are ready (#854)
  • Support transferring multiple chunks at one time (#841)

Bug fixes

  • Fix incorrect behavior of dataframe arithmetic (#838)
  • Mark resource as processing once allocated (#848)
  • Fix read_csv execution on GPU (#859)
  • Kill process tree when terminating a worker process (#864)

Tests

  • Add separate environment to test HDFS (#829)
  • Add CI/CD for Python 3.8 (#857)
  • Fix distribute error under Py38 (#871)

- Python
Published by qinxuye over 6 years ago

https://github.com/mars-project/mars - v0.2.4

This is the release notes of v0.2.4. See here for the complete list of solved issues and merged PRs.

New Features

  • Add mt.sort support (#862)
  • Support DataFrame rechunk (#866)
  • Support Series's setitem and getitem by iloc operation (#868)
  • Add tree reduction method for DataFrame groupby aggregations (#869)
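
The tree reduction item above (#869) names a general technique: per-chunk partial aggregates are merged a few at a time, level by level, instead of all at once on a single node. The following is a minimal sketch of that technique in plain Python. The helpers `tree_reduce` and `combine_sums` and the `fan_in` parameter are hypothetical, and nothing here describes how Mars schedules the merges.

```python
def tree_reduce(partials, combine, fan_in=2):
    """Combine per-chunk partial results level by level.

    Hypothetical helper: up to `fan_in` results are merged at a time.
    """
    if not partials:
        raise ValueError("need at least one partial result")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(partials)
    while len(level) > 1:
        next_level = []
        for start in range(0, len(level), fan_in):
            group = level[start:start + fan_in]
            merged = group[0]
            for item in group[1:]:
                merged = combine(merged, item)
            next_level.append(merged)
        level = next_level
    return level[0]


def combine_sums(left, right):
    # Merge two {group_key: partial_sum} dicts without mutating either.
    out = dict(left)
    for key, value in right.items():
        out[key] = out.get(key, 0) + value
    return out


chunk_sums = [{"a": 1, "b": 2}, {"a": 3}, {"b": 4, "c": 5}, {"a": 6}, {"c": 7}]
total = tree_reduce(chunk_sums, combine_sums)
print(total)
```

This works for aggregations whose partial results can be merged in any grouping, such as sum, count, min and max.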

Enhancements

  • Backport CUDA-related changes in utils (#846)
  • Resolve compatibility issue for Python 3.8 (#858)

Bug fixes

  • Fix incorrect behavior of dataframe arithmetic (#840)
  • Mark resource as processing once allocated (#851)
  • Kill process tree when terminating a worker process (#867)
  • Fix read_csv execution on GPU (#870)

Tests

  • Add separate environment to test HDFS (#835)

- Python
Published by qinxuye over 6 years ago

https://github.com/mars-project/mars - v0.2.3

This is the release notes of v0.2.3. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Add mt.unique support for tensor (#798)
  • DataFrame
    • Support DataFrame subtract operator (#800)
    • Support conversion between series and tensor (#806)
    • Refactor of DataFrame reduction and support more reduction operands (#816)
    • Support DataFrame read_csv (#826)

Enhancements

  • Simplify tiles logic to improve its performance (#801)
  • Return execution exception info properly to session client (#821)
  • Support axis argument for permutation and shuffle (#822)
  • Support __iadd__ etc. by wrapping add with the out argument (#824)
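
As a rough illustration of the __iadd__ item above (#824), the following sketch shows how an in-place operator can be expressed through an add function that accepts an out argument, in the style of NumPy's out convention. The Vec class and this add function are hypothetical toys, not Mars's tensor types or its actual implementation.

```python
class Vec:
    """Toy container standing in for a tensor-like object (hypothetical)."""

    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # Out-of-place addition: returns a new object.
        return add(self, other)

    def __iadd__(self, other):
        # In-place addition: reuse add, writing the result back into self.
        return add(self, other, out=self)


def add(a, b, out=None):
    """Element-wise add; store the result in `out` when it is given."""
    values = [x + y for x, y in zip(a.data, b.data)]
    if out is None:
        return Vec(values)
    out.data[:] = values
    return out


v = Vec([1, 2])
alias = v
v += Vec([10, 20])   # goes through __iadd__, so `alias` sees the update
w = v + Vec([1, 1])  # goes through __add__, producing a separate object
```

Routing the in-place operators through the same function keeps a single code path for the arithmetic itself; only the destination differs.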

Bug fixes

  • Correct type checking for DataFrame arithmetic (#819)
  • Fix stuck issue of GeventThreadPoolExecutor (#823)

Tests

  • Switch CI service to GitHub Actions (#794)
  • Move tests in AppVeyor into GitHub Actions (#797)
  • Fix etcd cases under macOS Catalina (#811)

- Python
Published by qinxuye over 6 years ago

https://github.com/mars-project/mars - v0.3.0b2

This is the release notes of v0.3.0b2. See here for the complete list of solved issues and merged PRs.

Highlights

  • Interoperability with XGBoost and TensorFlow is introduced:
    • mars.learn.contrib.xgboost.XGBClassifier and mars.learn.contrib.xgboost.XGBRegressor can be used for distributed classification and regression tasks.
    • mars.learn.contrib.tensorflow.run_tensorflow_script supports running distributed TensorFlow 2.0 training in Mars cluster.

New Features

  • Tensor
    • Add mt.unique support for tensor (#783)
  • DataFrame
    • Support DataFrame subtract operator (#787)
    • Support conversion between series and tensor (#791)
    • Refactor of DataFrame reduction and support more reduction operands (#789)
    • Support DataFrame read_csv (#807)
  • Learn
    • Add XGBoost support (#769)
    • Add ObjectData and ObjectChunk to represent data beyond ndarray, dataframe etc (#805)
    • Add mars.learn.utils.shuffle to support shuffling multiple tileable objects in a consistent way (#808)
    • Support running distributed TensorFlow 2.0 via run_tensorflow_script (#820)

Enhancements

  • Return execution exception info properly to session client (#770)
  • Simplify tiles logic to improve its performance (#792)
  • Support axis argument for permutation and shuffle (#803)
  • Support __iadd__ etc. by wrapping add with the out argument (#813)
  • Handle worker storage in batches (#818)

Bug fixes

  • Correct type checking for DataFrame arithmetic (#815)

Tests

  • Switch CI service to GitHub Actions (#793)
  • Move tests in AppVeyor into GitHub Actions (#795)

Others

  • Bump copyright year to 2020 (#809)

- Python
Published by qinxuye over 6 years ago

https://github.com/mars-project/mars - v0.2.2

This is the release notes of v0.2.2. See here for the complete list of solved issues and merged PRs.

New Features

  • Add multiple GPU support for local execution (#781)
  • Implement numpy.random.shuffle and numpy.random.permutation for tensor (#780)
  • Support DataFrame groupby.agg (#782)

Enhancements

  • Overhaul dataframe/series index alignment (#778)

Bug fixes

  • Fix execution of arithmetic on GPU (#777)

- Python
Published by qinxuye over 6 years ago

https://github.com/mars-project/mars - v0.3.0b1

This is the release notes of v0.3.0b1. See here for the complete list of solved issues and merged PRs.

New Features

  • Implement numpy.random.shuffle and numpy.random.permutation for tensor (#762)
  • Add preliminary support for distributed execution with CUDA (#776)
  • Add multiple GPU support for local execution (#779)
  • Support DataFrame groupby.agg (#767)

Enhancements

  • Overhaul dataframe/series index alignment (#737)
  • Add support for controlling data copy across processes (#766)

Bug fixes

  • Fix relocation of plasma error objects (#771)
  • Fix execution of arithmetic on GPU (#775)

- Python
Published by qinxuye over 6 years ago

https://github.com/mars-project/mars - v0.2.1

This is the release notes of v0.2.1. See here for the complete list of solved issues and merged PRs.

New Features

  • Add to_gpu and to_cpu support for both tensor and DataFrame (#706)
  • Access column using __getattr__ syntax for DataFrame (#746)

Enhancements

  • Wait for graph to finish instead of querying with fixed intervals (#707)
  • Spawn promise to utilize async network libs (#735)
  • Submit metas obtained from schedulers (#741)
  • Submit initial operands together in one RPC call (#745)
  • Fuse some operations in cholesky's tile (#749)
  • Simplify data transfer protocol (#744)

Bug fixes

  • Separate flags for initials and terminals (#708)
  • Remove redundant RPC calls for schedulers (#709)
  • Fix incorrect chunk shape in QR (#722)
  • Use cpuacct.stat to calculate cpu usage in Docker containers (#743)
  • Process index and columns separately (and correctly) in from_tensor (#747)
  • __setitem__ on a view should still be a view (#748)
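
For the cpuacct.stat item above (#743), the sketch below shows one way CPU usage could be derived from two samples of that file. It assumes the cgroup v1 format, where cpuacct.stat holds "user" and "system" counters in clock ticks; the function names and the sample values are hypothetical, and this is not the code Mars uses.

```python
def parse_cpuacct_stat(text):
    """Parse 'name value' lines of a cpuacct.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def cpu_usage(prev, curr, elapsed_seconds, ticks_per_second):
    """Average number of CPU cores used between two samples."""
    prev_ticks = prev["user"] + prev["system"]
    curr_ticks = curr["user"] + curr["system"]
    return (curr_ticks - prev_ticks) / ticks_per_second / elapsed_seconds


# Hypothetical samples taken one second apart, with 100 ticks per second.
before = parse_cpuacct_stat("user 1000\nsystem 500\n")
after = parse_cpuacct_stat("user 1150\nsystem 550\n")
usage = cpu_usage(before, after, elapsed_seconds=1.0, ticks_per_second=100)
```

On a real Linux host the tick rate would normally be read with os.sysconf("SC_CLK_TCK") rather than hard-coded, and the file would be read from the container's cpuacct cgroup directory.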

- Python
Published by wjsi over 6 years ago

https://github.com/mars-project/mars - v0.3.0a2

This is the release notes of v0.3.0a2. See here for the complete list of solved issues and merged PRs.

New Features

  • Add to_gpu and to_cpu support for both tensor and DataFrame (#630)
  • Access column using __getattr__ syntax for DataFrame (#712)

Enhancements

  • Move related files to optimizes module (#640)
  • Add option for plasma path (#699)
  • Wait for graph to finish instead of querying with fixed intervals (#701)
  • Submit initial operands together in one RPC call (#711)
  • Add lock free option for workers (#716)
  • Implement a more flexible tileable.cix[] (#731)
  • Submit metas obtained from schedulers (#727)
  • Spawn promise to utilize async network libs (#725)
  • Simplify data transfer protocol (#736)
  • Fuse some operations in cholesky's tile (#742)

Bug fixes

  • Separate flags for initials and terminals for operands (#703)
  • Remove redundant RPC calls for schedulers (#705)
  • Fix incorrect chunk shape in QR decomposition (#719)
  • __setitem__ on a view should still be a view (#733)
  • Process index and columns separately (and correctly) in from_tensor (#723)
  • Add a config to use cpuacct.stat to calculate cpu usage (#740)
  • Fix race condition when starting tasks and adding callbacks (#755)

- Python
Published by wjsi over 6 years ago