Recent Releases of https://github.com/mars-project/mars
https://github.com/mars-project/mars - v0.10.0
What's Changed
- Optimize tile of DataFrame.setitem by reducing time of generating chunk meta by @qinxuye in https://github.com/mars-project/mars/pull/3140
- Increase the default value of alru cache max size by @zhongchun in https://github.com/mars-project/mars/pull/3146
- Support scipy special function with tuple output by @RandomY-2 in https://github.com/mars-project/mars/pull/3139
- Fix `DAG.to_dot` when reducers have multiple outputs by @chaokunyang in https://github.com/mars-project/mars/pull/3150
- Fix deserializing RandomStateField when its value is None by @chaokunyang in https://github.com/mars-project/mars/pull/3149
- Patch pandas magic functions to allow reverse operands by @wjsi in https://github.com/mars-project/mars/pull/3155
- Run flaky test `test_load_third_party_modules` separately by @chaokunyang in https://github.com/mars-project/mars/pull/3162
- Manually install cri-dockerd before installing kubernetes by @wjsi in https://github.com/mars-project/mars/pull/3166
- [Shuffle] Add `n_mappers` and `n_reducers` to `ShuffleProxy` by @chaokunyang in https://github.com/mars-project/mars/pull/3160
- [Ray] task based shuffle for ray by @chaokunyang in https://github.com/mars-project/mars/pull/3040
- Add support for `{DataFrame,Series}.align` by @wjsi in https://github.com/mars-project/mars/pull/3147
- Integrate remaining error functions and fresnel integrals except `fresnel_zeros` by @RandomY-2 in https://github.com/mars-project/mars/pull/3172
- Improve numexpr fusion by @fyrestone in https://github.com/mars-project/mars/pull/3177
- Ensure key is a valid Python identifier by @fyrestone in https://github.com/mars-project/mars/pull/3190
- Bump terser from 5.7.1 to 5.14.2 in web component by @dependabot in https://github.com/mars-project/mars/pull/3194
- Implement airy functions (except the `ai_zeros` and `bi_zeros` functions) by @shantam-8 in https://github.com/mars-project/mars/pull/3195
- Disable version updates for dependabot by @wjsi in https://github.com/mars-project/mars/pull/3203
- [Ray] Fix ray memory leak by @fyrestone in https://github.com/mars-project/mars/pull/3184
- [Ray] Support reducers whose inputs are not mappers by @chaokunyang in https://github.com/mars-project/mars/pull/3206
- Refine UT and logs by @fyrestone in https://github.com/mars-project/mars/pull/3204
- Release actor lock when set_subtask_result by @chaokunyang in https://github.com/mars-project/mars/pull/3210
- Refine apply key generation by @chaokunyang in https://github.com/mars-project/mars/pull/3208
- fix remove mapper data by @chaokunyang in https://github.com/mars-project/mars/pull/3214
- [Ray] Configurable subtask num_cpus by @fyrestone in https://github.com/mars-project/mars/pull/3207
- Fix versioneer compatibility with PEP 600 by @chaokunyang in https://github.com/mars-project/mars/pull/3223
- Support get mappers data without index/mapperids by @chaokunyang in https://github.com/mars-project/mars/pull/3222
- [Ray] RayExecutionContext.get_chunk_meta from meta service by @fyrestone in https://github.com/mars-project/mars/pull/3212
- [Ray] Share RayTaskState across tasks by @fyrestone in https://github.com/mars-project/mars/pull/3219
- [Shuffle] Support shuffle operands mapper whose outputs aren't mapper blocks by @chaokunyang in https://github.com/mars-project/mars/pull/3228
- Apply Operand Closure clean up by @vcfgv in https://github.com/mars-project/mars/pull/3205
- Fix dataframe sort_values with multiple ascendings bug in pandas < 1.4 by @fyrestone in https://github.com/mars-project/mars/pull/3234
- Lifecycle gc task service by @fyrestone in https://github.com/mars-project/mars/pull/3230
- Fix dataframe loc with slice returns incorrect results by @fyrestone in https://github.com/mars-project/mars/pull/3241
- Fix dataframe setitem bugs when partial indexes exist in target dataframe by @fyrestone in https://github.com/mars-project/mars/pull/3240
- [Shuffle] Isolate mappers in different subtasks for fetch_by_index mode by @chaokunyang in https://github.com/mars-project/mars/pull/3239
- TypeDispatcher support one type multiple serializers by @fyrestone in https://github.com/mars-project/mars/pull/3242
- [Shuffle] Skip store shuffle object refs to reduce meta overhead by @chaokunyang in https://github.com/mars-project/mars/pull/3209
- [ray] Support scheduling ray tasks in Ray oscar deploy backend by @chaokunyang in https://github.com/mars-project/mars/pull/3165
- Dump subtask graph for all backends by @fyrestone in https://github.com/mars-project/mars/pull/3245
- [Metrics] Fix metrics and docs by @zhongchun in https://github.com/mars-project/mars/pull/3233
- Remove storage service from supervisor by @vcfgv in https://github.com/mars-project/mars/pull/3254
- Fix optimization rule memory leak by @fyrestone in https://github.com/mars-project/mars/pull/3246
- fsspec integration by @hekaisheng in https://github.com/mars-project/mars/pull/3253
- [Ray] Enable CI of mars/dataframe for Ray DAG by @fyrestone in https://github.com/mars-project/mars/pull/3250
- Fix minikube installation by @hekaisheng in https://github.com/mars-project/mars/pull/3244
- Implements scipy.stats.rankdata by @shantam-8 in https://github.com/mars-project/mars/pull/3218
- Add S3 support by @fyrestone in https://github.com/mars-project/mars/pull/3258
- Fix tensor frexp by @fyrestone in https://github.com/mars-project/mars/pull/3259
- Optimize the display of task process bar by @zhongchun in https://github.com/mars-project/mars/pull/3264
- [Ray] Optimize ray executor submit subtask by @fyrestone in https://github.com/mars-project/mars/pull/3271
- [Ray] Enable CI of mars/learn for Ray DAG by @fyrestone in https://github.com/mars-project/mars/pull/3261
- [Ray] Enable CI of mars/tensor for Ray DAG by @fyrestone in https://github.com/mars-project/mars/pull/3275
- Compatible with pandas 1.5.0 by @hekaisheng in https://github.com/mars-project/mars/pull/3276
- Remove skip_ray_dag mark for raydataset tests by @vcfgv in https://github.com/mars-project/mars/pull/3255
- MapChunk Operand Closure and Callable cleanup by @vcfgv in https://github.com/mars-project/mars/pull/3238
- [Ray] Spread scheduling subtasks with empty dependencies by @fyrestone in https://github.com/mars-project/mars/pull/3281
- Speedup mars deserialization by __new__ by @chaokunyang in https://github.com/mars-project/mars/pull/3283
- A cython-based ordered_set to speedup `discard` operation by @chaokunyang in https://github.com/mars-project/mars/pull/3277
- Optimize concat by @fyrestone in https://github.com/mars-project/mars/pull/3286
- Fix `md.concat` error when there are same fetch chunk data by @zhongchun in https://github.com/mars-project/mars/pull/3285
- [Ray] Improve Ray executor GC by @fyrestone in https://github.com/mars-project/mars/pull/3287
- Fix some CI issues by @hekaisheng in https://github.com/mars-project/mars/pull/3296
- [Ray] Implement Ray executor subtask GC by @fyrestone in https://github.com/mars-project/mars/pull/3294
- [Ray] Add metrics for Ray executor by @fyrestone in https://github.com/mars-project/mars/pull/3295
- Bump up required vineyard version to address the CI failure. by @sighingnow in https://github.com/mars-project/mars/pull/3298
- [Operand] support loc setitem by @chaokunyang in https://github.com/mars-project/mars/pull/3291
- [Ray] Support worker_mem for ray executor by @fyrestone in https://github.com/mars-project/mars/pull/3300
- Fix duplicate execution by @fyrestone in https://github.com/mars-project/mars/pull/3301
- Fix CI by @hekaisheng in https://github.com/mars-project/mars/pull/3306
- [Ray] Basic slow subtask detection by @fyrestone in https://github.com/mars-project/mars/pull/3305
- Fix stats tests and pin sphinx version by @hekaisheng in https://github.com/mars-project/mars/pull/3313
- Fix s3 client kwargs by @fyrestone in https://github.com/mars-project/mars/pull/3316
- Update Mars on Ray doc by @fyrestone in https://github.com/mars-project/mars/pull/3311
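The ordered_set entry above (#3277) is about making `discard` cheap. As a rough pure-Python illustration of why a hash-backed ordered set gives constant-time average removal, here is a minimal sketch. It is not Mars's Cython implementation, and the class name `OrderedSet` is hypothetical.

```python
class OrderedSet:
    """Hypothetical insertion-ordered set backed by a dict (not Mars's class)."""

    def __init__(self, items=()):
        # dicts preserve insertion order (Python 3.7+) and offer
        # O(1) average-case membership tests and removals.
        self._items = dict.fromkeys(items)

    def add(self, item):
        self._items[item] = None

    def discard(self, item):
        # Like set.discard: removing a missing item is a no-op.
        self._items.pop(item, None)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


s = OrderedSet(["a", "b", "c"])
s.discard("b")        # removes an existing element
s.discard("missing")  # silently ignored
s.add("d")
print(list(s))        # insertion order of the remaining elements is kept
```

A list-based ordered collection would need a linear scan for each removal, which is the cost a hash-backed structure avoids.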
Full Changelog: https://github.com/mars-project/mars/compare/v0.10.0a1...v0.10.0
- Python
Published by fyrestone over 3 years ago
https://github.com/mars-project/mars - v0.9.0
This is the release notes of v0.9.0. See here for the complete list of solved issues and merged PRs.
This release note only covers the difference from v0.9.0rc3; for all highlights and changes, please refer to the release notes of the pre-releases:
alpha1 alpha2 beta1 beta2 rc1 rc2 rc3
Changes that break compatibility
Starting with v0.9, Python 3.6 is no longer supported.
Highlights
- Performance has been optimized throughout this version; feedback is welcome.
New Features
- Oscar
- Stop importing main module when starting Mars local cluster (#3113)
- Tensor
- Integrate special error functions (#3062)
- Integrate part of scipy elliptic functions and integrals (#3112)
- DataFrame
- Support sort=True for Groupby (#3063, thanks @sak2002!)
Enhancements
- Dump remote tracebacks to make local ones more friendly (#3030)
- Optimize import speed for Mars package (#3035)
- [Ray] Implement ray task executor progress (#3065)
- Shuffle both sides at the same time for `md.merge` (#3066)
- Refine ThreadedServiceContext.get_chunks_meta usage (#3067)
- Do not aggressively choose tree method in tile of groupby for distributed setting (#3070)
- Disable bloom filter in merge for now (#3071)
- [Ray] Implements get_chunks_result for Ray execution context (#3072)
- Use tell when remove mapper data after execution (#3073)
- Assign reducer ops in task assigner to make them more balanced across cluster (#3075)
- [Ray] Destroy Ray executor when the task finishes (#3074)
- Combine tree and shuffle methods in `DataFrameGroupBy.agg` tile (#3077)
- [Ray] Implements get_chunks_meta for Ray execution context (#3076)
- Use OS-designated ports instead of random ports to create sub pools (#3087)
- Call immutable web API only once when previous call blocks (#3088)
- Unify DataFrameGroupByAgg's tile logic for auto method (#3094)
- [Ray] Support basic subtask retry and lineage reconstruction (#3097)
- Simplify argument passing in actor batch calls (#3100)
- [Ray] Implements get_total_n_cpu for Ray execution context (#3104)
- Optimize performance of transfer (#3105)
- Add `n_reducers` and `reducer_ordinal` to shuffle operands (#3107)
- [Ray] Implement cancel method on Ray task executor (#3093)
- [Ray] Create RayTaskState actor as needed by default (#3114)
- [Ray] Implement gc for ray task executor context (#3116)
- Optimize serializable memory (#3126)
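The entry on OS-designated ports (#3087) refers to a general technique: binding to port 0 lets the operating system pick a free port, rather than guessing a random number that may already be in use. A minimal sketch of the idea follows. The helper name `bind_to_os_assigned_port` is hypothetical and this is not Mars's sub pool code.

```python
import socket


def bind_to_os_assigned_port(host="127.0.0.1"):
    """Bind a TCP socket to port 0 so the OS chooses a free port.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))            # port 0 means "let the OS decide"
    port = sock.getsockname()[1]    # read back the port that was assigned
    return sock, port


sock, port = bind_to_os_assigned_port()
try:
    print(f"OS assigned port {port}")
finally:
    sock.close()
```

Keeping the socket open until the service takes it over avoids a window in which another process could grab the same port.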
Bug fixes
- Patch pandas to make pickle compatible between 1.2 and 1.3 (#3050)
- Fix errors when deleting mapper data (#3064)
- Fix chunk index error in auto_merge_chunks (#3068)
- Fix recursive_tile so that it no longer causes duplicated tiling of one tileable (#3069)
- [Ray] Fix ray worker failover (#3115)
- [Ray] Fix pandas schema parsing when reading Ray dataset (#3117)
- [Ray] fix auto scale-in hang (#3125)
- [Metric] Fix prometheus metric backend (#3127)
- Fix mt.{cumsum, cumprod} when the first chunk is empty (#3136)
Tests
- Check initialization of serializables on CI (#3013)
- [Ray] Optimize Ray CI execution time and stability (#3121)
- Update pytest imports for test_special.py (#3131)
- [Ray] Fix flaky test test_optional_supervisor_node (#3135)
Others
- Build web code before CIBW when deploying to PyPI (#3016)
Published by qinxuye about 4 years ago
https://github.com/mars-project/mars - v0.10.0a1
This is the release notes of v0.10.0a1. See here for the complete list of solved issues and merged PRs.
New Features
- Oscar
- Stop importing main module when starting Mars local cluster (#3110)
- Tensor
- Integrate special error functions (#3060)
- Integrate part of scipy elliptic functions and integrals (#3111)
- DataFrame
- Support `sort=True` for Groupby (#2959, thanks @sak2002!)
Enhancements
- Disable bloom filter in merge for now (#2967)
- [Ray] Implement ray task executor progress (#3008)
- Dump remote tracebacks to make local ones more friendly (#3028)
- Use tell when remove mapper data after execution (#3027)
- Optimize import speed for Mars package (#3022)
- Do not aggressively choose tree method in tile of groupby for distributed setting (#3032)
- [Ray] Implements get_chunks_result for Ray execution context (#3023)
- Refine ThreadedServiceContext.get_chunks_meta usage (#3037)
- Shuffle both sides at the same time for `md.merge` (#3041)
- Assign reducer ops in task assigner to make them more balanced across cluster (#3048)
- [Ray] Destroy Ray executor when the task finishes (#3049)
- [Ray] Implements get_chunks_meta for Ray execution context (#3052)
- [Ray] Support basic subtask retry and lineage reconstruction (#2969)
- Combine tree and shuffle methods in `DataFrameGroupBy.agg` tile (#3051)
- [Ray] Implements get_total_n_cpu for Ray execution context (#3059)
- [Ray] Implement cancel method on Ray task executor (#3044)
- Use OS-designated ports instead of random ports to create sub pools (#3053)
- Unify DataFrameGroupByAgg's tile logic for auto method (#3084)
- Simplify router clean up when pools or clusters end (#3086)
- Call immutable web API only once when previous call blocks (#3085)
- [Ray] Create RayTaskState actor as needed by default (#3081)
- [Ray] Implement gc for ray task executor context (#3061)
- Simplify argument passing in actor batch calls (#3098)
- Optimize performance of transfer (#3091)
- Add `n_reducers` and `reducer_ordinal` to shuffle operands (#3055)
- Optimize serializable memory (#3120)
Bug fixes
- Fix errors when deleting mapper data (#3018)
- Fix recursive_tile so that it no longer causes duplicated tiling of one tileable (#3021)
- Fix error message when sparse data format not supported (#3046)
- Patch pandas to make pickle compatible between 1.2 and 1.3 (#3047)
- Fix chunk index error in auto_merge_chunks (#3057)
- [Ray] Fix ray worker failover (#3080)
- [Metric] Fix prometheus metric backend (#3124)
- Fix mt.{cumsum, cumprod} when the first chunk is empty (#3134)
Tests
- Check initialization of serializables on CI (#3007)
- Use @pytest_asyncio.fixture instead of @pytest.fixture for async fixtures (#3025)
- Change code owners to Mars PMC maintainers (#3031)
- [Ray] Fix ray executor progress test (#3033)
- [Ray] Optimize Ray CI execution time and stability (#3102)
- Make test_session_set_progress more stable under Ray tests (#3103)
- Update pytest imports for test_special.py (#3129)
- [Ray] Fix flaky test `test_optional_supervisor_node` (#3133)
Others
- Build web code before CIBW when deploying to PyPI (#3014)
- Make PyPI user name configurable (#3130)
Published by qinxuye about 4 years ago
https://github.com/mars-project/mars - v0.8.7
This is the release notes of v0.8.7.
Bug fixes
- Fixes missing web packages in Linux wheels (#3014)
Published by wjsi about 4 years ago
https://github.com/mars-project/mars - v0.8.6
This is the release notes of v0.8.6. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Implementing Ellipsoidal Harmonics Functions (#2927, thanks @shantam-8!)
Enhancements
- Add support for `dask.persist` (#2990, thanks @loopyme!)
- Optimize gen subtask graph (#3006)
- Ignore broadcaster's locality when assign subtasks (#2994)
Bug fixes
- Fix task hang when error object cannot be pickled (#2913)
- Fix potential KeyError in actor_ref calls when running with multiple processes (#2962)
- Wrap errors in operand execution to protect scheduling service (#2971)
- Fix dtype of series result for `DataFrame.apply` (#2979)
- Fix default config to ensure storage backends configured (#2989)
- Fix potential empty chunks when creating DataFrame from pandas (#2991)
- Fix incorrect result for `df.sort_values` when specifying multiple ascending (#3006)
- Fix missing extra_params when constructing operands (#3006)
Tests
- Fix version mismatch between kubernetes and minikube (#2988)
Published by qinxuye about 4 years ago
https://github.com/mars-project/mars - v0.9.0rc3
This is the release notes of v0.9.0rc3. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Implementing Ellipsoidal Harmonics Functions (#2891, thanks @shantam-8!)
- Services
- Support worker meta service (#2909)
- Basic Ray execution backend (#2921)
Enhancements
- Add execution API to enable customization of Mars Task Service (#2894)
- Optimize serialization performance (#2914)
- Skip adding band in meta when fetch shuffle data (#2922)
- Store complete meta on worker and update supervisor meta via fetching from workers (#2912)
- Use cython to accelerate core serialization (#2924)
- Refine lifecycle api to support incref or decref with ref counts (#2926)
- Ignore fetch operands when assign initial nodes (#2929)
- Use cython to accelerate message serialization (#2932)
- Ignore broadcaster's locality when assign subtasks (#2943)
- Allow spawning serialization to threads for large objects (#2944)
- Add metrics and event report for Ray channels (#2936)
- Add more logs about execution info (#2940)
- Add support for `dask.persist` (#2953, thanks @loopyme!)
- Remove `should_be_monotonic` property (#2949)
- Add metrics on operand and subtask executions (#2947, thanks @zhongchun!)
- [Ray] optimize ray fetcher by query in remote node (#2957)
- Improve deploy backend (#2958)
- Support reporting tile progress (#2954)
- Add logic key for tileable graph (#2961, thanks @zhongchun!)
- [Ray] Loads the subtask inputs from meta (#2976)
- New ExecutionConfig API (#2968)
- Fix speculative execution compatibility with coloring (#2995)
- Make functions that may take long run in thread for lifecycle tracker (#2992)
- Optimize metric configs (#2996, thanks @zhongchun!)
- Expand the ability of resource evaluator (#2997, thanks @zhongchun!)
- Optimize gen subtask graph (#3004)
- [Ray] Ray execution state (#3002)
Bug fixes
- Fix parameter issue of worker actor pool (#2911, thanks @zhongchun!)
- Fix default config to ensure storage backends configured (#2935)
- Wrap errors in operand execution to protect scheduling service (#2964)
- Fix dtype of series result for `DataFrame.apply` (#2978)
- Fix potential data leak for shuffle tasks (#2975)
- Fix potential empty chunks when creating DataFrame from pandas (#2987)
- [Ray] Support new ray cluster through ray client (#2981)
- Fix missing extra_params when constructing operands (#2999)
- Fix `msg_to_simple_str` in Ray backend and add tests (#3003)
- Fix incorrect result for `df.sort_values` when specifying multiple ascending (#2984)
Documentation
- Add development documents for metrics (#2955, thanks @zhongchun!)
Tests
- Add TPC-H benchmarks (#2937)
- Fix Ray cases (#2983)
- Fix version mismatch between kubernetes and minikube (#2986)
- Allow selecting TPC queries (#3005)
Published by qinxuye about 4 years ago
https://github.com/mars-project/mars - v0.8.5
This is the release notes of v0.8.5. See here for the complete list of solved issues and merged PRs.
New Features
- Web
- Add stack display page on Mars Web (#2881)
Enhancements
- Avoid printing too many messages in Oscar (#2880)
- [Ray] Use main pool as owner when autoscale disabled (#2903)
Bug fixes
- Fix XGBoost when some workers do not have `evals` data (#2863)
- Raise ActorNotExist when no supervisors available (#2869)
- Fix dtype infer in DataFrame arithmetic on datetime consts (#2880)
- Fix duplicate node iteration in GraphAssigner (#2880)
- Fix timeout for `wait_task` (#2890)
- Make sure errors can be raised in `Actor.__pre_destroy__` (#2892)
Tests
- Upgrade azure-pipelines to Python 3.9 (#2886)
- Adapt to official cancel of Github Actions (#2903)
Published by qinxuye about 4 years ago
https://github.com/mars-project/mars - v0.9.0rc2
This is the release notes of v0.9.0rc2. See here for the complete list of solved issues and merged PRs.
New Features
- Web
- Add stack display page on Mars Web (#2876)
Enhancements
- Avoid printing too many messages in Oscar (#2871)
- Expand slot scheduler to resource scheduler (#2846, thanks @zhongchun!)
- Optimized iterative tiling by pruning unrelated chunks (#2874)
- Optimize `DataFrameIsin`'s tile (#2864)
- Add benchmark for serialization (#2901)
- [Ray] Ray client channel get recv when first complied (#2740, thanks @Catch-Bull!)
- Use bloom filter to optimize df.merge execution (#2895)
- Stop recording all mapper meta (#2900)
- [Ray] Use main pool as owner when autoscale disabled (#2878)
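The bloom filter entry (#2895) does not describe its implementation in these notes. As a general illustration of the technique, a bloom filter built from one side's join keys can cheaply discard rows on the other side that cannot match, before any expensive shuffle or join. The sketch below is a hypothetical pure-Python version, not Mars's code; the class `BloomFilter` and the sample data are assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Hypothetical minimal bloom filter (illustrative, not Mars's code)."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive n_hashes bit positions by salting the key with a seed.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{key!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


left_keys = [1, 2, 3]
right_rows = [(2, "b"), (3, "c"), (99, "z")]

bf = BloomFilter()
for k in left_keys:
    bf.add(k)

# Keep only the right-side rows whose key may appear on the left side.
candidates = [row for row in right_rows if bf.might_contain(row[0])]
print(candidates)
```

Because a bloom filter can report false positives but never false negatives, the pre-filter may keep a few unmatched rows but never drops a row that would have joined; the real join still decides the final result.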
Bug fixes
- Fix XGBoost when some workers do not have `evals` data (#2861)
- Fix duplicate node iteration in GraphAssigner (#2857)
- Raise ActorNotExist when no supervisors available (#2859)
- Fix dtype infer in DataFrame arithmetic on datetime consts (#2879)
- Fix timeout for `wait_task` (#2883)
- Make sure error can be raised in `Actor.__pre_destroy__` (#2887)
Tests
- Upgrade azure-pipelines to Python 3.9 (#2862)
- Adapt to official cancel of Github Actions (#2902)
Published by qinxuye about 4 years ago
https://github.com/mars-project/mars - v0.9.0rc1
This is the release notes of v0.9.0rc1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Implements `mars.tensor.setdiff1d` (#2823)
- Learn
- Added support for `mars.learn.metrics.roc_auc_score` (#2832)
- Services
- A speculative execution based task scheduler (#2576)
- Metric
- [ray] Add metric for ray object store (#2776, thanks @Catch-Bull!)
- Others
- Use versioneer to manage release versions (#2806)
Enhancements
- Support generating a DOT file for subtask graph (#2803)
- Support generating dtypes, index_value etc lazily for DataFrame chunks (#2756)
- [ray] Default enable fault tolerance for ray (#2801)
- Improve subtask details in logs (#2836)
- Accurate resource management for global slot manager (#2732)
- Configure nthread of XGBoost jobs (#2844)
- Improved performance of `mars.learn.metrics.{roc_curve, roc_auc_score}` (#2838)
- Bump minimist and nanoid in Mars UI due to security alerts (#2849)
- Fix store duplicate chunk and meta per subtask (#2845)
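The DOT export entry (#2803) refers to the Graphviz DOT text format. The notes do not show Mars's API for it, so the following is only a sketch of what emitting DOT for a small dependency graph can look like; the function `graph_to_dot` and the node names are hypothetical.

```python
def graph_to_dot(edges, name="subtask_graph"):
    """Render directed edges as Graphviz DOT text (hypothetical helper)."""
    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        # Quote node names so arbitrary labels remain valid DOT identifiers.
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical three-node pipeline used only for illustration.
edges = [("read_csv", "map"), ("map", "reduce")]
dot = graph_to_dot(edges)
print(dot)
```

The resulting text can be saved to a `.dot` file and rendered with the Graphviz `dot` command-line tool.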
Bug fixes
- Fix default value of `gpu` property for some operands (#2811)
- Fix the failure on Vineyard CI by ensuring the input tensor chunk is a NumPy ndarray (#2817)
- Fix race condition of `set_subtask_result` (#2784)
- Fix duplicate subtask submit (#2815)
- Change `StorageHandlerActor` to stateful (#2824)
- Fix running xgboost on Ray cluster (#2826)
- Fix `FileSystem.ls` for OSS (#2837)
- Stop fetching data when pure dependencies specified (#2840)
- Fix dirty version number caused by versioneer when building with cibuildwheel (#2855)
Tests
- [Ray] Refine ray tests (#2793)
- Build docker images cronically (#2804)
- Introduce asv benchmark (#2798)
Published by wjsi about 4 years ago
https://github.com/mars-project/mars - v0.8.4
This is the release notes of v0.8.4. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Implements `mars.tensor.setdiff1d` (#2829)
- Learn
- Added support for `mars.learn.metrics.roc_auc_score` (#2841)
- Others
- Use versioneer to manage release versions (#2807)
- Use cibuildwheel to release wheels (#2854)
Enhancements
- Support generating a DOT file for subtask graph (#2818)
- Enhance subtask details in logs (#2842)
- Configure cores of XGBoost jobs (#2847)
- Improved performance of `mars.learn.metrics.{roc_curve, roc_auc_score}` (#2850)
- Fix store duplicate chunk and meta per subtask (#2851)
- Bump minimist and nanoid in Mars UI due to security alerts (#2851)
Bug fixes
- Fix race condition of set_subtask_result (#2819)
- Fix duplicate subtask submit (#2819)
- Fix the failure on Vineyard CI by ensuring the input tensor chunk is a NumPy ndarray (#2819)
- Fix default value of gpu property for some operands (#2820)
- Fix running xgboost on Ray cluster (#2830)
- Change StorageHandlerActor to stateful (#2830)
- Fix `FileSystem.ls` for OSS (#2842)
- Stop fetching data when pure dependencies specified (#2843)
Tests
- [Ray] Refine ray tests (#2810)
- Build docker images cronically (#2807)
Published by wjsi about 4 years ago
https://github.com/mars-project/mars - v0.8.3
This is the release notes of v0.8.3. See here for the complete list of solved issues and merged PRs.
Enhancements
- Stop inferring outputs when args provided (#2761)
- Remove deprecate warnings when import mars.tensor (#2790)
- [Ray] New ray actor creation model (#2794)
Bug fixes
- Fix long exception of asyncio.gather (#2753)
- Fix wrong result of `df.merge` (#2777)
- Fix DataFrame initializer when Mars object exists in list (#2778)
- Fix duplicate dec object ref (#2789, thanks @Catch-Bull!)
- [Ray] Support Ray client mode (#2796)
Tests
- Increase test stability for command-line tests (#2786)
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.9.0b2
This is the release notes of v0.9.0b2. See here for the complete list of solved issues and merged PRs.
New Features
- Metric
- Add metric framework (#2742, thanks @zhongchun!)
- Add prometheus metric implementation (#2752, thanks @zhongchun!)
- Add ray metrics implementation (#2749, thanks @zhongchun!)
- Add common metrics (#2760, thanks @zhongchun!)
Enhancements
- Simplify rechunk implementation (#2745)
- Stop inferring outputs when args provided (#2759)
- Add broadcast merge support for DataFrame (#2772)
- Remove deprecate warnings when import mars.tensor (#2788)
- Optimize in-process actor calls (#2763)
- [ray] New ray actor creation model (#2783)
Bug fixes
- Fix duplicate dec object ref (#2741, thanks @Catch-Bull!)
- Fix long exception of asyncio.gather (#2748)
- Fix NameError: name 'pq' is not defined if pyarrow is not installed (#2751)
- Fix profiling band_subtasks and most_calls are empty if the slow duration is large (#2755)
- Fix the wrong result of df.merge (#2774)
- Fix DataFrame initializer when Mars object exists in list (#2770)
- [ray] support ray client mode (#2773)
Tests
- Increase test stability for command-line tests (#2779)
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.8.2
This is the release notes of v0.8.2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Support `inclusive` argument for `pd.date_range` (#2721)
Enhancements
- Optimize eval-setitem expressions as single eval expressions (#2699)
- [Ray] Refine raydataset integration (#2712)
- [Ray] refine ray dataset integration (#2726)
- Add support for reading partitioned parquet for fastparquet (#2729)
- Fix duplicate exceptions in log (#2736)
Bug fixes
- Fix `sort_values` for empty DataFrame or Series (#2686)
- Eliminate redundant eval node in optimization (#2688)
- Avoid iterative tiling for `df.loc[:, fields]` (#2689)
- Fix `use_arrow_dtype` parameter for `read_parquet` (#2702)
- Fix error on dependent DataFrame setitems (#2703)
- Fix `estimate_pandas_size` on `pd.MultiIndex` (#2710)
- Import vineyard.data.pickle to make members available (#2716)
- Fix shuffle when ndim of input tensors are different (#2728)
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.9.0b1
This is the release notes of v0.9.0b1. See here for the complete list of solved issues and merged PRs.
Highlights
- A new coloring-based fusion algorithm is introduced in #2719. Performance is expected to improve significantly compared to previous releases, but some unexpected situations may occur; feel free to reach out to us if you find any.
New Features
- DataFrame
- Support `inclusive` argument for `pd.date_range` (#2718)
- Others
- Add cibuildwheel with Linux AArch64 wheel build support (#2672, thanks @odidev!)
Enhancements
- Refine failure recovery log and exception (#2633)
- Optimize eval-setitem expressions as single eval expressions (#2695)
- Auto merge small chunks when `df.groupby().apply(func)` is doing aggregation (#2708)
- Optimize GroupBy's aggregation algorithm (#2696)
- [Ray] refine ray dataset integration (#2705)
- Improve profiling (#2629)
- Add support for reading partitioned parquet for fastparquet (#2724)
- Introduce coloring based fusion algorithm (#2719)
- Fix duplicate exceptions in log (#2723)
Bug fixes
- Fix `sort_values` for empty DataFrame or Series (#2681)
- Eliminate redundant eval node in optimization (#2683)
- Avoid iterative tiling for `df.loc[:, fields]` (#2685)
- [hotfix][ray] fix ray dataset compatibility (#2693)
- Fix `use_arrow_dtype` parameter for `read_parquet` (#2698)
- Fix error on dependent DataFrame setitems (#2701)
- Fix `estimate_pandas_size` for `pd.MultiIndex` (#2707)
- Import vineyard.data.pickle to make members available. (#2714)
- Fix shuffle when ndim of input tensors are different (#2727)
Documentation
- Add Slack invite link (#2704, thanks @yuyiming!)
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.8.1
This is the release notes of v0.8.1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Add support for GroupBy.{ffill, bfill, fillna} (#2657, thanks @Marascax!)
- Add `nunique` support for DataFrameGroupBy (#2667)
Enhancements
- Add support for HTTP request rewriter (#2665)
- Add merging small files support for `md.{read_parquet, read_csv}` (#2669)
- Optimize filtering DataFrame with its fields (#2668)
Bug fixes
- Allow specifying multiple supervisor processes (#2625)
- Fix backward compatibility for pandas 1.0 (#2630)
- Fix `NotImplementedError` for `mo.batch` when single call not implemented (#2637)
- Fix compatibility for pandas 1.4 (#2652)
- Fix `IndexError` raised by aggregation of DataFrameGroupBy (#2653)
- Fix df.loc[:] to make sure same index_value key generated (#2654)
- Fix aggregation with comparison (#2655)
- Fix the wrong index_value generated by df.loc:
- Fix `as_index` when calling groupby-agg (#2678)
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.9.0a2
This is the release notes of v0.9.0a2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Add support for GroupBy.{ffill, bfill, fillna} (#2639, thanks @Marascax!)
- Add `nunique` support for DataFrameGroupBy (#2662)
- Others
- Add wheel support for Python 3.10 and drop Python 3.6 (#2622)
Enhancements
- Added merging small files support for `md.{read_parquet, read_csv}` (#2661)
- Add support for HTTP request rewriter (#2664)
- Optimize filtering DataFrame with its fields (#2571)
- Add pyproject.toml to config build packages (#2674)
Bug fixes
- Fix backward compatibility for pandas 1.1 and 1.2 (#2624)
- Fix backward compatibility for pandas 1.0 (#2628)
- Fix `NotImplementedError` for `mo.batch` when single call not implemented (#2635)
- Fix `IndexError` raised by aggregation of DataFrameGroupBy (#2641)
- Fix compatibility for pandas 1.4 (#2650)
- Fix df.loc[:] to make sure same index_value key generated (#2643)
- Fix aggregation with comparison (#2647)
- Fix the wrong index_value generated by df.loc:
- Fix optimizing DataFrame query with timestamp in conditions (#2671)
- Fix `as_index` when calling agg on SeriesGroupBy (#2676)
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.8.0
This is the release notes of v0.8.0. See here for the complete list of solved issues and merged PRs.
This release note only covers the difference from v0.8.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:
alpha1 alpha2 alpha3 beta1 beta2 rc1
New Features
- Tensor
- Implements `mt.bincount` (#2552)
- DataFrame
- Support `Series.median` (#2570, thanks @perfumescent!)
- Learn
- Add `mars.learn.metrics.multilabel_confusion_matrix` and derivative metrics (#2568)
Enhancements
- Implement web API of `get_infos` (#2564)
- Reduce time cost of cpu_percent() calls (#2572)
- Stop calling user funcs when dtypes is specified (#2596)
- Supports adding Mars extensions via setup entrypoints (#2598)
- [Ray] Refine mars on ray usability (#2606)
- Reduce estimation time cost (#2607)
- Skip details of shuffled chunks in meta (#2609)
- Reduce the time cost of fetching tileable data (#2616)
- Reduce RPC cost of oscar by removing unnecessary tasks (#2613)
- Use batched request to apply for slots (#2615)
Bug fixes
- Fix index series.apply when result index unchanged (#2563)
- Fix DataFrame getitem when exists duplicate columns (#2582)
- Upgrade required version of vineyard (#2593)
- Fix progress always is 0 or 100% (#2595)
- Fix None dtype for some unary tensor functions (#2604)
- Make Proxima work with latest Mars (#2605, thanks @yuyiming!)
- Fix tests for cudf 21.10 (#2608)
- Fix duplicate decref of subtask input chunk (#2614, thanks @Catch-Bull!)
- Python
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.9.0a1
This is the release notes of v0.9.0a1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Implements `mt.bincount` (#2548)
- DataFrame
- Support Series.median() (#2566, thanks @perfumescent!)
- Learn
- Add `mars.learn.metrics.multilabel_confusion_matrix` and derivative metrics (#2554)
- Services
- Add basic profiling support for supervisor (#2586)
Enhancements
- Add `app_queue` in `new_cluster` (#2550, thanks @xxxxsk!)
- Implement web API of `get_infos` (#2558)
- Reduce time cost of `cpu_percent()` calls (#2567)
- Reduce estimation time cost (#2577)
- [ray] refine mars on ray usability (#2580)
- [ray] Refine raydataset integration (#2579)
- Optimize tileable graph construction (#2583)
- Stop calling user funcs when dtypes is specified (#2587)
- Supports adding Mars extensions via setup entrypoints (#2589)
- Skip details of shuffled chunks in meta (#2600)
- Reduce the time cost of fetching tileable data (#2594)
- Use batched request to apply for slots (#2601)
- Reduce RPC cost of oscar by removing unnecessary tasks (#2597)
Bug fixes
- Fix index `series.apply` when result index unchanged (#2557)
- Stop using asdict to handle dataclasses (#2561)
- Fix tests under cudf 21.10 (#2608)
- Fix DataFrame getitem when exists duplicate columns (#2581)
- Upgrade required version of vineyard. (#2588)
- Fix progress always is 0 or 100% (#2591)
- Make Proxima work with latest Mars (#2599, thanks @yuyiming!)
- Fix None dtype for some unary tensor functions (#2603)
- Fix duplicate decref of subtask input chunk (#2611, thanks @Catch-Bull!)
Documentation
- Add a document about how to implement a Mars operand (#2562)
- Python
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.7.5
This is the release notes of v0.7.5. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Add preliminary implementations for ufunc methods (#2513)
- Add partial support for setitem with fancy indexing (#2544)
- DataFrame
- Implements `md.get_dummies` (#2534, thanks @hoarjour!)
- Learn
- Add `make_regression` support for learn module (#2517)
- Implements `mars.learn.preprocessor.LabelEncoder` (#2545)
- Services
- Add web API for scheduling (#2535)
- Web
- Display tileable properties on web (#2539, thanks @RandomY-2!)
- Others
- Add experimental support for CUDA under WSL for Windows 11 (#2543)
Enhancements
- Reduce indentation of frontend code (#2541)
Bug fixes
- Fix output of `df.groupby(as_index=False).size()` (#2508)
- Fix reduction result on empty series (#2522)
- Fix `df.loc` when df is empty (#2526)
- [Ray] Fix serializing lambdas in web (#2529)
- Fix `df.loc` when providing empty list (#2532)
Documentation
- Add doc for reading csv in oss (#2530, thanks @Catch-Bull!)
- Python
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.8.0rc1
This is the release notes of v0.8.0rc1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Add preliminary implementations for ufunc methods (#2510)
- Add partial support for setitem with fancy indexing (#2453)
- DataFrame
- Support `md.get_dummies()` (#2323, thanks @hoarjour!)
- Learn
- Add `make_regression` support for learn module (#2515)
- Implements fit and predict methods for bagging (#2516)
- Implements `mars.learn.ensemble.IsolationForest` (#2531)
- Implements `mars.learn.preprocessor.LabelEncoder` (#2542)
- Services
- Add web API for scheduling (#2533)
- Web
- Display tileable properties on web (#2525, thanks @RandomY-2!)
- Others
- Support mutable tensor on oscar (#2432, thanks @Coco58323!)
- Add experimental support for CUDA under WSL for Windows 11 (#2538)
Enhancements
- Use black to enforce code style (#2492)
- Reduce indentation of frontend code (#2540)
Bug fixes
- Fix output of `df.groupby(as_index=False).size()` (#2507)
- [Ray] Fix web serialize lambda (#2512)
- Fix reduction result on empty series (#2520)
- Fix `DataFrame.loc` when df is empty (#2524)
- Fix `df.loc` when providing empty list (#2528)
Documentation
- Add doc for reading csv in oss (#2514, thanks @Catch-Bull!)
- Python
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.7.4
This is the release notes of v0.7.4. See here for the complete list of solved issues and merged PRs.
New Features
- Web
- Split tileable information and subtask graph into two tabs (#2482, thanks @RandomY-2!)
- Include tileable property in detail api (#2499, thanks @RandomY-2!)
Enhancements
- Support specified vineyard socket and skip the launching vineyardd process (#2500)
- Refine MarsDMatrix & support more parameters for XGB classifier and regressor (#2501)
Bug fixes
- Compatible with scikit-learn 1.0 (#2487)
- Fix bug that failed to execute query when there are multiple arguments (#2491, thanks @perfumescent!)
- Python
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.8.0b2
This is the release notes of v0.8.0b2. See here for the complete list of solved issues and merged PRs.
New Features
- Learn
- Implements `glm.LogisticRegression` (#2466, thanks @Fernadoo!)
- Implements bagging sampling (#2496)
- Services
- Basic reschedule subtask (#2467)
- Web
- Split tileable information and subtask graph into two tabs (#2480, thanks @RandomY-2!)
- Include tileable property in detail api (#2493, thanks @RandomY-2!)
Enhancements
- Support specified vineyard socket and skip the launching vineyardd process (#2481)
- Refine MarsDMatrix & support more parameters for XGB classifier and regressor (#2498)
Bug fixes
- Compatible with scikit-learn 1.0 (#2486)
- Fix bug that failed to execute query when there are multiple arguments (#2490, thanks @perfumescent!)
Documentation
- Fix wrong translation in cluster deployment. (#2489, thanks @perfumescent!)
Tests
- Fix version of statsmodels to pass CI (#2497)
- Python
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.7.3
This is the release notes of v0.7.3. See here for the complete list of solved issues and merged PRs.
New Features
- Learn
- Add `_binary_roc_auc_score` method (#2477, thanks @Divyanshu-Singh-Chauhan!)
- Web
- Support visualizing subtask graphs on Mars Web (#2471, thanks @RandomY-2!)
- Others
- Revisit `{from,to}_vineyard` for tensors and dataframes (#2436)
- Add nightly builds for docker images (#2462)
- Make cmdline support third party modules (#2472)
Bug fixes
- Fix `df/series.{apply, map_chunk}` when some chunk output empty data (#2437)
- Fix missing DAGs when initializing with empty API results (#2442, thanks @RandomY-2!)
- Fix `skew` and `kurt` errors under MacOS (#2445)
- Fix usage of kubernetes image (#2448)
- Fix timeout error when waiting for a submitted task (#2461)
- Fix misuse of `name` parameter in DataFrame align (#2473, thanks @qxzhou1010!)
- Fix hang when start sub pool fails (#2476)
Tests
- Fix coverage for Azure pipeline (#2475)
- Python
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.8.0b1
This is the release notes of v0.8.0b1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Integrate Mars DataFrame with Ray Dataset (#2393, thanks @vcfgv!)
- Learn
- Add `_binary_roc_auc_score` method (#2403, thanks @Divyanshu-Singh-Chauhan!)
- Web
- Support visualizing subtask graphs on Mars Web (#2426, thanks @RandomY-2!)
- Others
- Revisit `{from,to}_vineyard` for tensors and dataframes. (#2419)
- [Ray] Reconstruct worker (#2413)
- Make cmdline support third party modules (#2454)
- Add nightly builds for docker images (#2456)
Enhancements
- Refine and unify subtask detail APIs (#2465, thanks @RandomY-2!)
Bug fixes
- Fix `df/series.{apply, map_chunk}` when some chunk output empty data (#2434)
- Fix missing DAGs when initializing with empty API results (#2439, thanks @RandomY-2!)
- Fix `skew` and `kurt` errors under MacOS (#2443)
- Add tests for public kubernetes image (#2446)
- Fix timeout error when waiting for a submitted task (#2457)
- Print the error message when error happens in TaskProcessor (#2458)
- Fix misuse of `name` parameter in DataFrame align (#2469, thanks @qxzhou1010!)
- Fix hang when start sub pool fails (#2468)
Installation
- Build and upload docker images in continuous deployment (#244)
Tests
- Fix coverage for Azure pipeline (#2474)
- Python
Published by qinxuye over 4 years ago
https://github.com/mars-project/mars - v0.7.2post1
This release is a hotfix of v0.7.2 in order to fix the public docker image.
Bug fixes
- Fix usage of kubernetes image for v0.7.2 (#2447)
- Python
Published by wjsi almost 5 years ago
https://github.com/mars-project/mars - v0.7.2
This is the release notes of v0.7.2. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Implements hypergeometric functions (#2408, thanks @Alfa-Shashank!)
- Implements mars.tensor.append (#2422)
- DataFrame
- Implements `Series.between` (#2382, thanks @gowrijsuria!)
- Implements `DataFrame.transpose()` (#2423, thanks @hoarjour!)
- Learn
- Add `mars.learn.ensemble.{BlockwiseVotingClassifier, BlockwiseVotingRegressor}` (#2391)
- Add TensorFlow dataset (#2409, thanks @yuanchongtt!)
- Implements `linear_model.LinearRegression` (#2411, thanks @Fernadoo!)
- Implements `mars.learn.preprocessing.{LabelBinarizer, label_binarize}` (#2421)
- Implements `mars.learn.metrics.log_loss` (#2424)
- Implements `mars.learn.wrappers.ParallelPostFit` (#2427)
- Web
- API for subtask DAG structure (#2410, thanks @RandomY-2!)
Bug fixes
- Fix raising wrong error for an operand when post_execute implemented and error occurs in `execute` (#2396)
Tests
- Improve case stability (#2387)
- Change all tests to use relative import (#2412)
- Python
Published by qinxuye almost 5 years ago
https://github.com/mars-project/mars - v0.8.0a3
This is the release notes of v0.8.0a3. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Implemented hypergeometric functions (#2397, thanks @Alfa-Shashank!)
- Implements `mars.tensor.append` (#2417)
- DataFrame
- Implements `Series.between` (#2368, thanks @gowrijsuria!)
- Integrate Mars DataFrame with Ray MLDataset (#2294, thanks @vcfgv!)
- Support `DataFrame.transpose()` (#2327, thanks @hoarjour!)
- Learn
- Add `mars.learn.ensemble.{BlockwiseVotingClassifier, BlockwiseVotingRegressor}` (#2390)
- Implements `linear_model.LinearRegression` (#2260, thanks @Fernadoo!)
- Add TensorFlow dataset (#2383, thanks @yuanchongtt!)
- Implements `mars.learn.preprocessing.{LabelBinarizer,label_binarize}` (#2415)
- Implements `mars.learn.metrics.log_loss` (#2418)
- Implements `mars.learn.wrappers.ParallelPostFit` (#2425)
- Services
- Initially support auto scaling (#2210)
- Web
- API for subtask DAG structure (#2389, thanks @RandomY-2!)
Bug fixes
- Fix raising wrong error for an operand when post_execute implemented and error occurs in `execute` (#2395)
- [Ray] Fix occasionally failed unittest `test_ownership_when_scale_in` (#2401)
- [Oscar] Fix possible ActorCaller.call hang (#2404, thanks @Catch-Bull!)
Documentation
- Highlight dask-on-mars in doc (#2399)
Tests
- Improve case stability (#2381)
- Upgrade vineyard to v0.2.7 (#2193)
- Add checks for file mode changes and absolute imports (#2398)
- [Ray] Fix ray version (#2406)
- Change all tests to use relative import (#2407)
- Python
Published by qinxuye almost 5 years ago
https://github.com/mars-project/mars - v0.7.1
This is the release notes of v0.7.1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Support `md.to_numeric` (#2334, thanks @hoarjour!)
- Gives an error when input DataFrame has unknown dtypes (#2359)
- Implements `DataFrame.assign` (#2369, thanks @hxri!)
- Support reading csv file from oss (#2374, thanks @zebivy!)
- Tensor
- Implements `mars.tensor.stats.ks_2samp` (#2332)
- Implements `mt.stats.ks_1samp` (#2341)
- Learn
- Support PyTorch Dataset for oscar (#2364, thanks @yuanchongtt!)
- Add KFold support (#2365)
- Services
- Add API to retrieve progress and status of tileables (#2358)
- Web
- Add visualization page for tileable graphs (#2319, thanks @RandomY-2!)
- Add storage infos in web (#2333)
- Display tileable progress, status and dependency link type on task detail page (#2377, thanks @RandomY-2!)
Enhancements
- Support setting multiple columns in DataFrame (#2313)
- Create service classes to manage service and session operations (#2331)
- Remove bokeh from package requirements (#2344)
- Optimize scheduling service on supervisors (#2347)
- Improve `wait_actor_pool_recovered` (#2350, thanks @keyile!)
Bug fixes
- Fix the error when multiple subtasks fetch the same data (#2340)
- Fix KeyError when remote function returns None (#2375)
- Fix DataFrame comparison when data type is period (#2376)
Documentation
- Fix untranslated strings in doc (#2349)
- Python
Published by qinxuye almost 5 years ago
https://github.com/mars-project/mars - v0.8.0a2
This is the release notes of v0.8.0a2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Support initializing Mars objects from CUDA (#2308)
- Support `md.to_numeric` (#2290, thanks @hoarjour!)
- Gives an error when input DataFrame has unknown dtypes (#2355)
- Added assign to DataFrame (#2362, thanks @hxri!)
- Support reading csv file from oss (#2292, thanks @zebivy!)
- Tensor
- Implements `mars.tensor.stats.ks_2samp` (#2324)
- Implements `mars.tensor.stats.ks_1samp` (#2335)
- Learn
- Support PyTorch Dataset for oscar (#2246, thanks @yuanchongtt!)
- Add KFold support (#2363)
- Services
- Add API to retrieve progress and status of tileables (#2357)
- Web
- Add visualization page for tileable graphs (#2282, thanks @RandomY-2!)
- Add storage infos in web (#2317)
- Display tileable progress, status and dependency link type on task detail page (#2360, thanks @RandomY-2!)
- Others
- [Ray] Rerun subtask for ray backend (#2288, thanks @keyile!)
- Add experimental Dask-on-Mars support (#2289, thanks @loopyme!)
Enhancements
- Support setting multiple columns in DataFrame (#2303)
- Refactor tileable visualization classes (#2318)
- Create service classes to manage service and session operations (#2326)
- Improve `wait_actor_pool_recovered` (#2328, thanks @keyile!)
- Remove bokeh from package requirements (#2339)
- Optimize mars supervisor scheduling (#2325)
Bug fixes
- Fix hangs when worker main pool has failures. (#2286)
- Fix the error when multiple subtasks fetch the same data (#2322)
- [Ray] Fix ray ci (#2343, thanks @keyile!)
- Fix error in Dask-on-Mars when compute multiple objects (#2348, thanks @loopyme!)
- Fix KeyError when remote function returns None (#2371)
- Fix DataFrame comparison when data type is period (#2373)
Documentation
- Fix untranslated strings in doc (#2346)
- Fix docs of `DataFrame.assign` (#2367)
- Python
Published by qinxuye almost 5 years ago
https://github.com/mars-project/mars - v0.7.0
This is the release notes of v0.7.0. See here for the complete list of solved issues and merged PRs.
This release note only covers the difference from v0.7.0rc2; for all highlights and changes, please refer to the release notes of the pre-releases:
alpha1 alpha2 alpha3 alpha4 alpha5 alpha6 alpha7 alpha8 beta1 beta2 rc1 rc2
Changes that break compatibility
v0.7.0 unifies the local and distributed execution layers. Local thread-based scheduling has been removed; the unified runtime instead uses multiprocess-based scheduling, which avoids the infamous GIL problem.
Thus, for local usage, please create a default local session via:
```python
import mars

mars.new_session()  # create a default local session
```
Otherwise, a session is initialized once in the background. Keep in mind, however, that initializing multiprocess scheduling takes longer than initializing the old multithreaded one.
We tried our best to preserve compatibility elsewhere; if you find an incompatibility, please open an issue to let us know.
Highlights
v0.7.0 implements a unified execution layer: all deployments, including bare metal, Kubernetes, Ray, and Yarn, share the same fundamental components. Compared to the old one, this unified execution layer improves many aspects, including:
- Better serialization based on the pickle5 protocol, which is 5-7x faster than the old version.
- A completely rewritten execution layer with better performance, 20%-50% faster than the old version even on a laptop.
- Multiprocess-based scheduling, which avoids the infamous GIL issue.
- Mars on Ray works much better because Ray actors are used to build the Ray backend of Oscar, the lightweight actor framework underlying the entire execution layer.
- GPUs are better supported by the new architecture.
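To illustrate why pickle protocol 5 helps with serialization speed, here is a minimal sketch using only the Python standard library (Python 3.8+). It is not Mars's serialization code; the `message` dict, its keys, and the buffer size are assumptions made up for the example. Protocol 5 lets large buffers travel "out-of-band", so the pickle stream carries only metadata and the payload itself is not copied into it.

```python
import pickle

# Hypothetical payload: one megabyte of raw data plus a little metadata.
data = bytearray(b"x" * 1_000_000)
message = {"shape": (1000, 1000), "payload": pickle.PickleBuffer(data)}

# In-band: the whole buffer is copied into the pickle stream.
in_band = pickle.dumps(message, protocol=5)

# Out-of-band: buffers are handed to the callback instead of being copied,
# so the stream itself stays small and the buffers can be sent separately.
buffers = []
out_of_band = pickle.dumps(message, protocol=5, buffer_callback=buffers.append)

# The receiver supplies the buffers again when loading.
restored = pickle.loads(out_of_band, buffers=buffers)

print(len(in_band), len(out_of_band), len(buffers))
```

The out-of-band stream is a small fraction of the in-band one, since the megabyte of data lives only in `buffers`. A transport layer can then ship those buffers without an intermediate copy.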
New Features
- Tensor
- Add partial support of bessel functions (#2274, thanks @JuntaoMa!)
- Implements mars.tensor.in1d (#2301)
- Learn
- Implements mars.learn.utils.multiclass.unique_label (#2300)
- Services
- Add `get_storage_level_info` API (#2242)
- Add API to fetch tileable graph as JSON (#2271, thanks @RandomY-2!)
- Enable running on GPU for oscar (#2306)
- Others
- Add support for seek method in memory cases (#2264)
Enhancements
- Add support for stateless actors (#2220)
- Add status filters for Cluster service (#2221)
- Pass logging config file name into sub pools (#2225)
- Support choosing aggregation algorithm at runtime (#2226)
- Add method to session to get web endpoint (#2238)
- Use Kubernetes Service to discover Mars Supervisors (#2240)
- Ensure range index incremental for data source op like `md.read_csv` (#2244)
- Record mapper meta for shuffle task (#2255)
- Support data dependency for run_script (#2256)
- Refine oscar debugging (#2261)
- Support fetch_log for web session (#2262)
- Allow turning off actor killing (#2277)
- Use batch method to reduce transferring cost for shuffle tasks (#2279)
- Assign bands given devices of subtasks (#2278)
- Add bind method to facilitate extracting batch args (#2281)
- Reduce memory estimation for specific operands (#2285)
Bug fixes
- Fix NoDataToSpill when multiple storage quota requests happen simultaneously (#2223)
- Stop using thread local to store default session (#2243)
- Fix service errors in Windows (#2247)
Documentation
- Doc refinement for Oscar (#2291)
- Add docs for batch methods (#2298)
Installation
- Merge default & distributed requirements (#2270)
Tests
- Add separate check pipeline (#2302)
- Python
Published by qinxuye almost 5 years ago
https://github.com/mars-project/mars - v0.8.0a1
This is the release notes of v0.8.0a1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Add partial support of bessel functions (#2258, thanks @JuntaoMa!)
- Implements mars.tensor.in1d (#2297)
- Learn
- Implements mars.learn.utils.multiclass.unique_label (#2295)
- Services
- Add `get_storage_level_info` API (#2228)
- Basic rerun subtask (#2198)
- Add API to fetch tileable graph as JSON (#2253, thanks @RandomY-2!)
- Enable running on GPU for oscar (#2284)
- Others
- Add support for seek method in memory cases (#2250)
Enhancements
- Support choosing aggregation algorithm at runtime (#2213)
- Add support for stateless actors (#2218)
- Add status filters for Cluster service (#2214)
- Reassign subtasks and filter nodes with status (#2159, thanks @vcfgv!)
- Add methods to sessions to get web endpoint (#2236)
- Ensure range index incremental for data source op like `md.read_csv` (#2232)
- Use Kubernetes Service to discover Mars Supervisors (#2227)
- Record mapper meta for shuffle task (#2248)
- Support data dependency for `run_script` (#2251)
- Refine oscar debugging (#2252)
- Support fetch_log for web session (#2257)
- Use batch method to reduce transferring cost for shuffle tasks (#2233)
- Allow turning off actor killing (#2273)
- Assign bands given devices of subtasks (#2276)
- Add bind method to facilitate extracting batch args (#2280)
- Reduce memory estimation for specific operands (#2283)
Bug fixes
- Fix `NoDataToSpill` when multiple storage quota requests happen simultaneously (#2203)
- Pass logging config file name into sub pools (#2222)
- Stop using thread local to store default session. (#2217)
- Fix possible CI failure when destroying remote object for incremental index (#2239)
- Fix service errors in Windows (#2237)
Documentation
- Doc refinement for Oscar (#2234)
- Add docs for batch methods (#2293)
Installation
- Merge default & distributed requirements (#2263)
Tests
- Add separate check pipeline (#2299)
- Fix delocate version to 0.8.2 to avoid deploy error (#2305)
- Python
Published by wjsi almost 5 years ago
https://github.com/mars-project/mars - v0.6.11
This is the release notes of v0.6.11. See here for the complete list of solved issues and merged PRs.
Bug fixes
- Fix unexpected NaNs in groupby-agg (#2178)
- Fix groupby on indexes with duplicate items (#2187)
- Fix compatibility issue for pandas 1.3 (#2202)
- Fix `merge_index_value` when index_values come from multi range indexes (#2208)
- Python
Published by qinxuye almost 5 years ago
https://github.com/mars-project/mars - v0.7.0rc2
This is the release notes of v0.7.0rc2. See here for the complete list of solved issues and merged PRs.
New Features
- Services
- Support setting task progress in context (#2192)
- Web
- Implement essential APIs for Web (#2181)
- Use React framework to rewrite Mars UI (#2135)
- Deploy
- Cluster config support third party modules (#2171)
Enhancements
- Use `aiohttp` to handle web requests (#2183)
- Add stop methods for all services (#2194)
- Move `fetch` method from `StorageManagerActor` to `StorageHandlerActor` (#2196)
- Isolate client and cluster in a separated event loop and thread (#2168)
Bug fixes
- Fix unexpected NaNs in groupby-agg (#2177)
- Fix groupby on indexes with duplicate items (#2186)
- Fix starting multiple workers in shared file system (#2189)
- Fix import error in master branch (#2190)
- [Ray] ray two way hang-detectable channel (#2170)
- Fix compatibility issue for pandas 1.3 (#2197)
- Fix deserializing task errors on web clients (#2199)
- Make window function under old pandas versions work (#2204)
- Fix concatenating row chunks with MultiIndex (#2205)
- Fix `merge_index_value` when index_values come from multi range indexes (#2207)
- Python
Published by qinxuye almost 5 years ago
https://github.com/mars-project/mars - v0.7.0rc1
This is the release notes of v0.7.0rc1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implement `{DataFrame, Series}.empty` (#2155, thanks @Fernadoo!)
- Services
- Add spill support for oscar (#2160)
Enhancements
- Migrate YARN support for Oscar (#2152)
- Add version API on cluster (#2153)
- Make sure slot usages updated when some task ends (#2161)
- Add debug options for oscar (#2164)
- Add fault injection subtask processor for tests (#2163)
- Collects available ports before running tensorflow scripts (#2169)
- Support stopping tasks for interactive execution (#2154)
- Add cycle detection for oscar debug mode (#2167)
- Update year of license (#2172)
- Support configuring overriding fields (#2173)
Bug fixes
- Fix race condition when creating actors with IdleLabel strategy in a parallel way (#2147)
- Fix shuffle task on oscar (#2146)
Tests
- Use a setup.cfg to gather configurations (#2150)
- Python
Published by qinxuye almost 5 years ago
https://github.com/mars-project/mars - v0.6.10
This is the release notes of v0.6.10. See here for the complete list of solved issues and merged PRs.
New Features
- Implement `{DataFrame, Series}.empty` (#2174, thanks @Fernadoo!)
Bug fixes
- Fix `_get_ports_from_netstat` hang (#2174)
- Python
Published by qinxuye almost 5 years ago
https://github.com/mars-project/mars - v0.7.0b2
This is the release notes of v0.7.0b2. See here for the complete list of solved issues and merged PRs.
Changes that break compatibility
- From v0.7.0b2 on, the stale threading-based scheduler and the distributed scheduler based on Mars actor 1.0 have been removed, so clients on older versions are completely incompatible.
Highlights
- Unified scheduling based on Oscar, which is Mars actor 2.0, is ready for testing.
New Features
- DataFrame
- Add `add_prefix` support (#2132, thanks @aeinrw!)
- Services
- Services web handler and api (#2102)
- Implements lifecycle service (#2117)
- Add initial implementation of scheduling service (#2111)
- Deploy
- Add command line support for Oscar deployment (#2131)
- Ray
- [Ray] ray oscar deploy (#2089)
Enhancements
- Hold data ref in DataManager (#2090)
- Enabling iterative tiling etc support for task service (#2097)
- Enable optimization for task service (#2098)
- Implement `last_idle_time` API (#2099)
- Integrate mars object check for session (#2103)
- Make Mars pools compatible with Python 3.6 (#2110)
- [ray] optimize ray deploy speed (#2118)
- Implement RESTful web API (#2120)
- [ray] Support supervisor exclusive node option (#2121)
- Add asyncio task timeout debugger (#2127)
- Add transfer support for storage service (#2100)
- [ray] supervisor support sub pool (#2128)
- Configure azure pipelines job timeout (#2139)
- Allow overriding service config files (#2140)
Bug fixes
- Fix distributed `make_blobs` and column pruning in `read_sql` (#2092)
- Fix result error when yield after exceptions (#2096)
- Wrap sync method in session so that they will be running in threads (#2109)
- Fix pool cases and shared memory cases in Windows (#2114)
- Fix `_get_ports_from_netstat` hang (#2116)
- Fix unpickle mars config error (#2130)
- Filter pipeline jobs by branch (#2138)
Tests
- Fix coverage result on `SubActorPool` (#2095)
- [Ray] Support ray subprocess coverage (#2101)
- Run operand tests in Azure Pipelines (#2137)
- Migrate tensor/dataframe/learn tests to oscar (#2106)
- Python
Published by qinxuye about 5 years ago
https://github.com/mars-project/mars - v0.6.9
This is the release notes of v0.6.9. See here for the complete list of solved issues and merged PRs.
New Features
- Add add_prefix support (#2133, thanks @aeinrw!)
Bug fixes
- Fix distributed `make_blobs` and column pruning in `read_sql` (#2093)
Tests
- Remove tokens for codecov on 0.6 (#2141)
- Python
Published by qinxuye about 5 years ago
https://github.com/mars-project/mars - v0.7.0b1
This is the release notes of v0.7.0b1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Implements `mt.block` (#2069, thanks @fyrestone!)
- Project Galois
- Initial task service support (#2084)
Enhancements
- Apply new serialization to operand and Mars objects (#2075)
- Make import of LightGBM and XGBoost lazy (#2083)
- Use chunk index as default shuffle key (#2086)
- Project Galois
- Oscar actor pool on ray (#2063, thanks @chaokunyang!)
Bug fixes
- Use a ref count to make delete work when multiple workers connect to the same vineyardd (#2077)
Tests
- Allow cancelling flows automatically (#2082)
- Python
Published by qinxuye about 5 years ago
https://github.com/mars-project/mars - v0.6.8
This is the release notes of v0.6.8. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Implements `mt.block` (#2080, thanks @fyrestone!)
Enhancements
- Make import of LightGBM and XGBoost lazy (#2085)
Bug fixes
- Fix `mt.unique` on empty arrays (#2061)
- Use a ref count to make delete work when multiple workers connect to the same vineyard (#2078)
- Backport of bug fixes discovered in Galois serialization refactor (#2081)
Tests
- Fix batch get / delete object id and stop launching plasma when working on vineyard (#2071)
- Python
Published by qinxuye about 5 years ago
https://github.com/mars-project/mars - v0.7.0a8
This is the release notes of v0.7.0a8. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Implement `mt.insert` and `mt.delete` (#2039)
Project Galois
- Oscar
- [Ray backend] setup_cluster use placement group (#2041)
- Fix cancelling actor promises (#2032)
- Fix `__pre_destroy__` not called when actors are in main pool (#2045)
- [Ray] Fix ray cluster_utils import (#2054)
- Fix main pool that creating servers only when sub pools finished creation (#2064)
- Services
- Initialize meta service with mock support (#2034)
- API definition for services (#2040)
- Allow sync actor methods to be extensible (#2050)
- Implements initial version of cluster service (#2049)
- Enhance meta service (#2062)
- Initial implementation for storage service (#2056)
- Add bands to chunk meta (#2065)
- Enrich usage experience for extensible function (#2067)
- Add API interface to get data info (#2068)
- Serialization
- Support cuda buffer serializations in actor communication (#2031)
- Allow serializer to serialize recursive objects (#2058)
- Implements serializables based on new serialization mechanism (#2051)
Enhancements
- Support reusing kubedl cluster by job name (#2035)
- Quarantine asyncio tests when measuring Cython coverage (#2070)
Bug fixes
- Fix wrong results of `mt.insert` (#2046)
- Fix for `mt.insert` when insert values is a mars tensor (#2052)
- Fix batch get / delete object id and stop launching plasma when working on vineyard (#2072)
- Python
Published by qinxuye about 5 years ago
https://github.com/mars-project/mars - v0.6.7
This is the release notes of v0.6.7. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implement `mt.insert` and `mt.delete` (#2042)
Enhancements
- Support reusing kubedl cluster by job name (#2036)
Bug fixes
- Re-enable the from/to vineyard test cases, and set meta for tensor/dataframe properly (#2030)
- Fix wrong results of `mt.insert` (#2048)
- Fix for `mt.insert` when insert values is a mars tensor (#2053)
- Python
Published by qinxuye about 5 years ago
https://github.com/mars-project/mars - v0.7.0a7
This is the release notes of v0.7.0a7. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `{DataFrame, Series}.pct_change` (#2014)
- Tensor
- Implements tree arithmetic for tensor add and multiplication (#2024)
Project Galois
- Oscar
- Add support for batch interfaces for actors (#2013)
- [oscar] Add cancel support, optimize error handling, add `kill_actor` API (#2027)
- Service
- Add initial service implementations (#2010)
Enhancements
- Use mmap files to reduce memory usage in proxima builder (#1866)
- Support setting column with different index for DataFrame (#2020)
Bug fixes
- Fix errors when calling where() on reshape results (#2011)
- Fix log error when yielding to another remote (#2022)
- Python
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.6
This is the release notes of v0.6.6. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `{DataFrame, Series}.pct_change` (#2015)
- Tensor
- Implements tree arithmetic for tensor add and multiplication (#2028)
Enhancements
- Use mmap files to reduce memory usage in proxima builder (#2016)
- Support setting column with different index for DataFrame (#2025)
Bug fixes
- Fix IndexError in `Series.sort_values` when some chunk is empty (#2001)
- Fix mars crashes on ray >= 1.2.0 (#2003, thanks @fyrestone!)
- Add `errors` argument for `groupby.sample` to ignore errors when group size less than `n` (#2007)
- Fix errors when calling `where()` on reshape results (#2012)
- Fix log error when yielding to another remote (#2026)
- Python
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.7.0a6
This is the release notes of v0.7.0a6. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `Index.__getitem__` (#1971)
- Implements `{DataFrame,Series}.sample` (#1983)
- Implements `DataFrameGroupBy.sample` (#1994)
- Tensor
- Implements `stats.chisquare` (#1974)
- Implement ttests and gamma functions (#1986)
Project Galois
- Oscar
- [oscar] Fix actor promise & add tests (#1958)
- [oscar] Add communication layer for Mars backend (#1989)
- [oscar] Implements Mars backend for oscar (#1996)
- Storage
- [storage][vineyard] Implement storage lib of vineyard backend (#1952, thanks @acezen!)
- [storage][sharedmemory] Add storage backend of `multiprocessing.shared_memory` (#1969)
- [storage][cuda] Add cuda backend storage implementation (#1981)
- [storage][ray] Implements Ray storage (#1992, thanks @fyrestone!)
Enhancements
- Allow wrapping existing models with Mars class constructors (#1956)
- Optimize performance of `DataFrame.describe()` (#1961)
- Initialize `filesystem` and `aio` libs (#1980)
Bug fixes
- Fix `MarsDMatrix` when input tensor has unknown chunk shape (#1966)
- Fix tensor sorting with empty chunks (#1968)
- Re-enable the from/to vineyard test cases, and set meta for tensor/dataframe properly. (#1967)
- Fix ValueError when reducing tensors with empty chunks (#1978)
- Fix job hang when error message can't be pickled (#1990)
- Fix IndexError in `Series.sort_values` when some chunk is empty (#1999)
- Fix mars crashes on ray >= 1.2.0 (#1998, thanks @fyrestone!)
- Add `errors` argument for groupby.sample to ignore errors when group size less than `n` (#2002)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.5
This is the release notes of v0.6.5. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `Index.__getitem__` (#1975)
- Implements `{DataFrame,Series}.sample` (#1987)
- Implements `DataFrameGroupBy.sample` (#1995)
- Tensor
- Implements `stats.chisquare` (#1976)
- Implement ttests and gamma functions (#1988)
Enhancements
- Allow wrapping existing models with Mars class constructors (#1957)
- Optimize performance of `DataFrame.describe()` (#1962)
- Initialize `filesystem` libs (#1982)
Bug fixes
- Fix tensor sorting with empty chunks (#1973)
- Fix `MarsDMatrix` when input tensor has unknown chunk shape (#1970)
- Fix ValueError when reducing tensors with empty chunks (#1979)
- Fix job hang when error message can't be pickled (#1993)
Tests
- Add tests and releases for Python 3.9 (#1955)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.7.0a5
This is the release notes of v0.7.0a5. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `DataFrame.{eval,query}` (#1898)
- Implements `{DataFrame, Series}.duplicated()` (#1907)
- Implements is_monotonic properties (#1939)
- Implements `{DataFrame,Series}.set_axis` (#1950)
Project Galois
- [oscar] Add actor driver & structure adjustment (#1925)
- [oscar][ray backend] Actor creation (#1916, thanks @fyrestone!)
- Add new serializer implementation (#1937)
- Implement storage lib of Arrow plasma as well as disk (#1904)
Enhancements
- Allow set verify_ssl to False for kubernetes configuration (#1911)
- Optimize generating mock DataFrames (#1913)
- Move opcodes out of protobuf definition (#1944)
Bug fixes
- To vineyard: avoid copy when chunks are already in vineyard (vineyard is the backend). (#1899)
- Fix rechunk when input tileable has unknown shape (#1912)
- Fix KeyError when comparing series (#1920)
- Fix rechunk when chunks have different dtypes that cannot compare (#1922)
- Collect available ports before running LightGBM task (#1927)
- Fix KeyError when column pruning is applied (#1929)
- Fix shuffling data in mars.learn module (#1931)
- Fix memory estimation of StartTracker for XGBoost (#1934)
- Fix `accuracy_score` for distributed execution (#1945)
Tests
- Add tests and releases for Python 3.9 (#1954)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.4
This is the release notes of v0.6.4. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `DataFrame.{eval,query}` (#1900)
- Implements `{DataFrame, Series}.duplicated()` (#1909)
- Implements `is_monotonic` properties (#1946)
- Implements `{DataFrame,Series}.set_axis` (#1951)
Enhancements
- Optimize generating mock DataFrames (#1915)
- Move opcodes out of protobuf definition (#1947)
Bug fixes
- Fix rechunk when input tileable has unknown shape (#1914)
- Fix KeyError when comparing series (#1921)
- Fix rechunk when chunks have different dtypes that cannot compare (#1926)
- Collect available ports before running LightGBM task (#1927)
- Fix KeyError when column pruning is applied (#1933)
- Fix error when shuffling data in `mars.learn` module (#1936)
- Fix memory estimation of StartTracker for XGBoost (#1936)
- Fix `accuracy_score` for distributed execution (#1948)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.3
This is the release notes of v0.6.3. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Add more functionalities for `md.Index` (#1864)
- Implements `{DataFrame,Series}.rename_axis` (#1870)
Enhancements
- Allow internal serialization to use JSON (#1882)
- Optimize performance of `{md.read_csv(), md.read_parquet()}.head()` (#1883)
- Optimize performance of `df.sort_values().head()` (#1888)
- Support column pruning for groupby().agg() on data sources (#1889)
- Improve `named_{dataframe, series, tensor}` so that it can get more meta (#1897)
Bug fixes
- Fix wrongly raised error: Tileable object must be executed first before being fetched (#1875)
- Support unknown shape for `mt.reshape`, `mt.histogram` and `md.DataFrame` (#1876)
- Fix stuck of threaded actor operations in gevent==20.12.0 (#1881)
- Fix sorting string columns with None value & sorting with empty chunks (#1893)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.7.0a4
This is the release notes of v0.7.0a4. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Add more functionalities for `md.Index` (#1860)
- Implements `{DataFrame,Series}.rename_axis` (#1867)
Enhancements
- Allow internal serialization to use JSON (#1880)
- Optimize performance of `{md.read_csv(), md.read_parquet()}.head()` (#1878)
- Optimize performance of `df.sort_values().head()` (#1884)
- Support column pruning for `groupby().agg()` on data sources (#1886)
- Improve `named_{dataframe, series, tensor}` so that it can get more meta (#1896)
Bug fixes
- Support unknown shape for `mt.reshape`, `mt.histogram` and `md.DataFrame` (#1869)
- Fix wrongly raised error: Tileable object must be executed first before being fetched (#1872)
- Fix reshape when input tensor has unknown shape and 1 chunk (#1874)
- Fix stuck of threaded actor operations in gevent==20.12.0 (#1879)
- Fix sorting string columns with None value & sorting with empty chunks (#1891)
- Adapt `vineyardhandler.py` to latest vineyard. (#1887)
Documentation
- LFAI & Data: Add required documents (#1865)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.2
This is the release notes of v0.6.2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `head()` on groupby objects (#1851)
- Learn
- Implements `mars.learn.preprocessing.{MinMaxScaler, minmax_scale}` (#1858)
Enhancements
- Improve Proxima `recall_by_id` computation method (#1807, thanks @rg070836rg!)
- Revise to/from vineyard, of Tensor and DataFrame. (#1806)
- Add memory estimation for `read_parquet` as well as `read_csv` (#1815)
- Support using compound agg function in lambda (#1819)
- Add `incremental_index` argument to `reset_index` which by default is False (#1842)
- Support `to_pandas` in a batch way for DataFrame and Series (#1859)
- Support specifying memory scale in kubernetes (#1861)
Bug fixes
- Fix compatibility for scikit-learn 0.24.0 (#1820)
- Remove unnecessary iterative tiling when predicting via XGBoost and data from/to parquet (#1821)
- Resolve KeyError when calling `delete_keys` for ray backend (#1854)
- Fix compatibility for pandas 1.2.0 (#1862)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.7.0a3
This is the release notes of v0.7.0a3. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `head()` on groupby objects (#1849)
- Learn
- Implements `mars.learn.preprocessing.{MinMaxScaler, minmax_scale}` (#1844)
Enhancements
- Improve Proxima `recall_by_id` computation method (#1805, thanks @rg070836rg!)
- Revise to/from vineyard, for Tensor and DataFrame. (#1790)
- Add memory estimation for `read_parquet` as well as `read_csv` (#1811)
- Support using compound agg function in lambda (#1810)
- Add `incremental_index` argument to `reset_index` which by default is False (#1823)
- Refine kubedl cluster-api. (#1827, thanks @SimonCqk!)
- Enhancements for Mars on kubedl (#1848)
- Support `to_pandas` in a batch way for DataFrame and Series (#1853)
- Support specifying memory limit scale in kubernetes (#1856)
- Set marsjob worker cache by `memoryTuningPolicy`. (#1857, thanks @SimonCqk!)
Bug fixes
- Fix compatibility for sklearn 0.24.0 (#1817)
- Remove unnecessary iterative tiling when predicting via XGBoost and data from/to parquet (#1818)
- Resolve KeyError when calling `delete_keys` for ray backend (#1846)
- Fix compatibility for pandas 1.2.0 (#1847)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.1
This is the release notes of v0.6.1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Support `missing` argument for `tensor.tosparse()` and `fill_value` argument for `sparse_tensor.todense()` (#1802)
- DataFrame
- Implements `{DataFrame,Series}.replace` (#1765)
- Add `{DataFrame, Series}.cartesian_chunk` support (#1777)
- Integrate `str.cat` into reduction and groupby-aggregation (#1781)
- Implements reduction with `level` argument (#1784)
Bug fixes
- Spawn serialization of executable graphs (#1770)
- Fix getitem on DataFrames with unknown index (#1778)
- Fix reading partitioned parquet files in HDFS (#1783)
- Fix creating Mars Series from empty pandas Series (#1788)
- Fix bug that explicit execute may be required for to_parquet and XGB predict (#1800)
- Support md.concat on DataFrame and Series (#1801)
- Fix TypeError when timeout argument is absent when starting Mars cluster in YARN (#1804, thanks @smartguo!)
Documentation
- Fill docs for apply and transform (#1767)
Tests
- Create different test workflows & fix accessor docs (#1804)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.7.0a2
This is the release notes of v0.7.0a2. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- Support `missing` argument for `tensor.tosparse()` and `fill_value` argument for `sparse_tensor.todense()` (#1797)
- DataFrame
- Implements `{DataFrame,Series}.replace` (#1762)
- Add `{DataFrame, Series}.cartesian_chunk` support (#1774)
- Integrate `str.cat` into reduction and groupby-aggregation (#1776)
- Implements reduction with `level` argument (#1779)
Bug fixes
- Spawn serialization of executable graphs (#1769)
- Fix getitem on DataFrames with unknown index (#1772)
- Fix reading partitioned parquet files in HDFS (#1782)
- Fix creating Mars Series from empty pandas Series (#1787)
- Support md.concat on DataFrame and Series (#1798)
- Fix bug that explicit execute may be required for to_parquet and XGB predict (#1794)
- Fix TypeError when timeout argument is absent when starting Mars cluster in YARN (#1803, thanks @smartguo!)
Documentation
- Fill docs for `apply` and `transform` (#1764)
Tests
- Create different test workflows & fix accessor docs (#1799)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.7.0a1
This is the release notes of v0.7.0a1. See here for the complete list of solved issues and merged PRs.
Changes that break compatibility
- Aggregations and Groupby with aggregations have been rewritten in v0.6.0; older clients may raise errors when connecting to a cluster with the new version installed.
Highlights
- Statsmodels as well as joblib are preliminarily supported.
New Features
- DataFrame
- Support `num_partitions` argument for DataFrame initializers (#1729)
- Add support for named aggregations (#1747)
- Tensor
- Add rebalance method for tensors (#1731)
- Learn
- Add preliminary statsmodels support (#1735)
- Add preliminary joblib support (#1757)
Bug fixes
- Fix `md.read_csv` when `names` and `usecols` specified (#1737)
- Make PSRS chunks more balanced (#1742)
- Support string dtype for tensor reductions (#1745)
- Fix xgboost and lightgbm on DataFrames (#1750)
- Fix repeated execution of same code in distributed mode (#1749)
- Support setting scalar which is a tensor for DataFrame (#1755)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.0
This is the release notes of v0.6.0. See here for the complete list of solved issues and merged PRs.
This release note only covers the difference from v0.6.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:
alpha1 alpha2 alpha3 beta1 beta2 rc1
Changes that break compatibility
- Aggregations and Groupby with aggregations have been rewritten in v0.6.0; older clients may raise errors when connecting to a cluster with the new version installed.
New Features
- DataFrame
- Support `num_partitions` argument for DataFrame initializers (#1733)
- Add support for named aggregations (#1748)
Enhancements
- Unify `groupby.agg()` using ReductionCompiler (#1739)
Bug fixes
- Fix `md.read_csv` when `names` and `usecols` specified (#1738)
- Support string dtype for tensor reductions & balance PSRS chunks (#1746)
- Fix XGBoost and LightGBM on DataFrames (#1751)
- Fix repeated execution of same code in distributed mode (#1753)
- Support setting scalar which is a tensor for DataFrame (#1758)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.0rc1
This is the release notes of v0.6.0rc1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `{DataFrame,Series}.explode` (#1714)
- Learn
- Support predicting on local LGBM models (#1716)
Enhancements
- Add configuration page on Mars Web (#1697)
- Add shared limit option for Mars worker (#1702)
- Remount the shm directory in `entrypoint.sh` (#1700)
- Add pure-dependent option for operands (#1706)
- Remove `prepare_inputs` property on operands (#1709)
- Use ReductionCompiler to support function aggregation in `mars.dataframe.reduction` (#1705)
- Write into and read from merged files when data sizes are small (#1708)
- Refactor builder and searcher of Proxima (#1710, thanks @rg070836rg!)
Bug fixes
- Fix mars not working on ray cluster (#1712, thanks @fyrestone!)
- Fix inferring dtype for `series.map` (#1722)
- Fix sort functions of DataFrames on CUDA (#1723)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.5.5
This is the release notes of v0.5.5. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `{DataFrame,Series}.explode` (#1715)
Enhancements
- Add configuration page on Mars Web (#1701)
- Remount `/dev/shm` directory in `entrypoint.sh` in Kubernetes and limit plasma size to avoid SIGBUS (#1703)
- Add pure-dependent option for operands (#1707)
Bug fixes
- Fix the KeyError in `estimate_fuse_size` (#1699)
- Fix inferring dtype for `series.map` (#1724)
- Fix sort functions in CUDA (#1725)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.5.4
This is the release notes of v0.5.4. See here for the complete list of solved issues and merged PRs.
Enhancements
- Support `inplace` parameter in `reset_index` method (#1663)
- Add a threshold for `DataFrame.head` optimization (#1679)
Bug fixes
- Check unknown shape chunks in tile of `md.concat` (#1656)
- Fix hang for rerun `DataFrame.groupby` in distributed mode (#1669)
- Create Fetch operands given output types (#1668)
- Modify `df.copy()` so that it generates the identical key (#1678)
- Fix `IndexError` when binary op on Series whose type is datetime (#1680)
- Mount /dev/shm on host to pods when starting Mars workers in Kubernetes (#1681)
- Fix DataFrame reduction that output type consistent for map and combine phase (#1686)
- Fix wrong dtypes of DataFrame `setitem` chunks (#1691)
- Fix assigning operands with expected workers (#1693)
- Add timeout for SharedHolderActor creation (#1692)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.0b2
This is the release notes of v0.6.0b2. See here for the complete list of solved issues and merged PRs.
New Features
- Learn
- Support recall computation for proxima (#1657, thanks @rg070836rg!)
Enhancements
- Support `inplace` parameter in `reset_index` method (#1662)
- Add a threshold for `DataFrame.head` optimization (#1673)
Bug fixes
- Check unknown shape chunks in tile of `md.concat` (#1655)
- Create Fetch operands given output types (#1666)
- Fix hang for rerun `DataFrame.groupby` in distributed mode (#1667)
- Modify `df.copy()` so that it generates the identical key (#1671)
- Fix IndexError when binary op on Series whose type is datetime (#1675)
- Mount /dev/shm on host to pods when starting Mars workers in Kubernetes (#1677)
- Fix Series reduction that output type consistent for map and combine phase (#1685)
- Fix wrong dtypes of DataFrame `setitem` chunks (#1690)
- Add timeout for `SharedHolderActor` creation (#1684)
- Fix assigning operands with expected workers (#1689)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.5.3
This is the release notes of v0.5.3. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Add `DataFrame.to_parquet` support (#1653)
Enhancements
- Optimize memory usage for brute-force algorithm in `NearestNeighbors` (#1648)
Bug fixes
- Fix the wrong dtypes of `DataFrameSetitem`'s inputs (#1627)
- Fix issue that `output_type` does not take effect for `df.apply` (#1628)
- Fix registration for `DataFrameSetLabel` operand (#1633)
- Eliminate TimeoutError when there are running nodes (#1639)
- Fix issue that serialization of transpose failed when input has unknown shape (#1638)
- Fix PSRS error when chunks has fewer rows than partition number (#1644)
- Fix `md.concat` which may occupy huge amount of memory on client when all of DataFrames own large `RangeIndex` (#1651)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.0b1
This is the release notes of v0.6.0b1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Add `DataFrame.to_parquet` support (#1652)
Enhancements
- Optimize memory usage for brute-force algorithm in `NearestNeighbors` (#1640)
- Structural adjustment for proxima (#1624, thanks @rg070836rg!)
Bug fixes
- Fix the wrong dtypes of `DataFrameSetitem`'s inputs (#1623)
- Fix issue that `output_type` does not take effect for `df.apply` (#1626)
- Fix registration for `DataFrameSetLabel` operand (#1631)
- Fix issue that serialization of transpose failed when input has unknown shape (#1632)
- Eliminate TimeoutError when there are running nodes (#1637)
- Fix PSRS error when chunks has fewer rows than partition number (#1642)
- Add `flush` method to `_LogWrapper` (#1646)
- Fix `md.concat` which may occupy huge amount of memory on client when all of DataFrames own large `RangeIndex` (#1649)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.6.0a3
This is the release notes of v0.6.0a3. See here for the complete list of solved issues and merged PRs.
Highlights
- Brand-new API `fetch_log` is implemented so that, in a distributed environment, users can fetch the logs output by custom functions without extra effort on the client side. For more details, refer to #1564.
New Features
- DataFrame
- Implements `df.rebalance()` (#1572)
- Add support for `{DataFrame,Series}.{where,mask}` (#1577)
- Add `read_parquet` support (#1576)
- Added `DataFrame.isin` support (#1584)
- Implements `DataFrame.stack` (#1591)
- Implements `{DataFrame,Series,GroupBy}.{all,any}` (#1600)
- Add support for pearson coefficients (`corr`, `corrwith` and `autocorr`) (#1587)
- Learn
- Integrate with `pyproxima2` (#1618, thanks @rg070836rg!)
- Deployment
- Support rescaling worker numbers in Kubernetes (#1571)
- Others
- Implements `fetch_log` API (#1574)
Bug fixes
- Fix the failure when fetching the result of `Series.sum` (#1583)
- Fix the failure of DataFrame reduction operators (#1589)
- Fix error on fitting `LGBMModel` twice (#1598)
- Fix `train_test_split` when some input is Series (#1610)
- Fix `build_faiss_index` when some index type cannot be merged (#1609)
- Allow LightGBM wrapper to use numpy arrays (#1607)
- Add an extra sort key in PSRS to make distinct pivot (#1612)
- Fixes `md.read_csv` when `dtypes` is not inferred correctly (#1606)
- Fix Ray 1.0 compatibility (#1620)
Documentation
- Add docs about reading data from HDFS (#1619)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.5.2
This is the release notes of v0.5.2. See here for the complete list of solved issues and merged PRs.
Highlights
- Brand-new API `fetch_log` is implemented so that, in a distributed environment, users can fetch the logs output by custom functions without extra effort on the client side. For more details, refer to #1564.
New Features
- DataFrame
- Implements `df.rebalance()` (#1573)
- Add support for `{DataFrame,Series}.{where,mask}` (#1579)
- Add `read_parquet` support (#1581)
- Add `DataFrame.isin` support (#1592)
- Implements `DataFrame.stack` (#1594)
- Add support for `{DataFrame,Series,GroupBy}.{all,any}` (#1601)
- Add support for pearson coefficients (`corr`, `corrwith` and `autocorr`) (#1616)
- Others
- Implements `fetch_log` API (#1582)
Bug fixes
- Fix the failure when fetching the result of Series.sum (#1585)
- Fix failures of DataFrame reduction operators (#1595)
- Fix error on fitting `LGBMModel` twice (#1599)
- Add extra sort key in PSRS to make distinct pivot (#1613)
- Fix `build_faiss_index` when some index type cannot be merged (#1614)
- Fix `train_test_split` when some input is Series (#1615)
- Fixes `md.read_csv` when `dtypes` is not inferred correctly (#1617)
Published by qinxuye over 5 years ago
https://github.com/mars-project/mars - v0.5.1
This is the release notes of v0.5.1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Allow submitting Mars jobs in custom functions (#1560)
- Add `df.select_dtypes` support (#1568, thanks @lipengsh!)
- Add `df.map_chunk` support (#1570)
Enhancements
- Add configurable label options in Kubernetes cluster (#1550)
Bug fixes
- Use relative paths to avoid web rendering issues under backward proxies (#1545)
- Allow returning None when using `groupby.apply` (#1548)
- Fix bug that cannot pass numpy array to `mt.swapaxes` (#1561, thanks @YoshieraHuang!)
- Fix pandas 1.1.2 compatibility (#1563)
- Fix compatibility for tsfresh 0.17.0 (#1567)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.6.0a2
This is the release notes of v0.6.0a2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Allow submitting Mars jobs in custom functions (#1559)
- Add `df.select_dtypes` support (#1565, thanks @lipengsh!)
- Add `df.map_chunk` support (#1569)
- Others
- Support running Mars under KubeDL (#1549)
Enhancements
- Add configurable label options in Kubernetes cluster (#1547)
Bug fixes
- Fix `md.read_csv` for Ray executor (#1541)
- Allow returning None when using `groupby.apply` (#1544)
- Use relative paths to avoid web rendering issues under backward proxies (#1540)
- Fix bug that cannot pass numpy array to `mt.swapaxes` (#1553, thanks @YoshieraHuang!)
- Fix pandas 1.1.2 compatibility (#1562)
- Fix compatibility for tsfresh 0.17.0 (#1566)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.5.0
This is the release notes of v0.5.0. See here for the complete list of solved issues and merged PRs.
This release note only covers the difference from v0.5.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:
New Features
- DataFrame
- Support `use_arrow_dtype` for `md.read_csv` and `md.read_sql` (#1495)
- Others
- Store and query graph information in batch (#1504)
Enhancements
- Fix compatibility for gevent>=20.5.1 (#1493)
- Add an option to control writing shuffle data into disk (#1516)
- Unify logic of modes including `eager`, `kernel` and `build` (#1530)
- Optimize `mars.learn.cluster.KMeans` when `n_clusters` is relatively large (#1536)
Bug fixes
- Update `mt.split` to support list and tuple (#1509, thanks @YoshieraHuang!)
- Fix pandas 1.1 compatibility (#1515)
- Fix `mt.isclose` when some of the arguments is scalar (#1518)
- Fix `mt.linalg.norm` when axis is negative (#1519, thanks @YoshieraHuang!)
- Fix `arctan2` when arguments contains scalar (#1520)
- Unregister scheduler observer when destroying actors (#1526)
- Fix creating Mars DataFrame from an empty pandas DataFrame (#1531)
- Support `df.groupby().count()` for arrow dtype with and without pyarrow installed (#1532)
- Fix DataFrame reduction on GPU (#1535)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.4.7
This is the release notes of v0.4.7. See here for the complete list of solved issues and merged PRs.
New Features
- Store and query graph info in batch (#1503)
Enhancements
- Fix compatibility for gevent>=20.5.1 (#1494)
- Add an option to control writing shuffle data into disk (#1527)
Bug fixes
- Fix `arrow_array_to_objects` when input is a Series whose index is not RangeIndex(n) (#1496)
Tests
- Fixed statsmodel version (#1537)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.6.0a1
This is the release notes of v0.6.0a1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Support `use_arrow_dtype` for `md.read_csv` and `md.read_sql` (#1491)
- Others
- Store and query graph information in batch (#1501)
- Integrate with Ray (#1508)
Enhancements
- Fix compatibility for gevent>=20.5.1 (#1490)
- Add an option to control writing shuffle data into disk (#1513)
- Unify logics of modes including `eager`, `kernel` and `build` (#1528)
- Optimize `mars.learn.cluster.KMeans` when `n_clusters` is relatively large (#1511)
Bug fixes
- Fix `mt.linalg.norm` when axis is negative (#1499, thanks @YoshieraHuang!)
- Fix `mt.isclose` when some of the arguments is scalar (#1498)
- Fix `mt.arctan2` when arguments contain scalar (#1502)
- Update `mt.split` to support list and tuple (#1507, thanks @YoshieraHuang!)
- Fix pandas 1.1 compatibility (#1437)
- Fix creating Mars DataFrame from an empty pandas DataFrame (#1522)
- Unregister scheduler observer when destroying actors (#1525)
- Support `df.groupby().count()` for arrow dtype with and without pyarrow installed (#1523)
- Fix DataFrame reduction on GPU (#1534)
Documentation
- Fix URL of contribution guide (#1505, thanks @StevenJokes!)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.4.6
This is the release notes of v0.4.6. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `md.qcut` (#1473)
- Implements `{DataFrame, Series}.reindex` (#1483)
- Add support for `ArrowListDtype` as well as `ArrowListArray` (#1487)
Enhancements
- Serialize results in worker before storing into shared storages (#1474)
- Raise timeout when assigning failed for a long time (#1477)
- Fix pickling arrow types & allow specifying parallel number in IO runners (#1482)
Bug fixes
- Support `ExtensionDtype` in `df.astype` and complex serialization (#1464)
- Fix incorrect `index_value` in `df.drop()` (#1488)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.5.0rc1
This is the release notes of v0.5.0rc1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements `md.qcut` (#1468)
- Implements `{DataFrame, Series}.reindex` (#1481)
- Add support for `ArrowListDtype` as well as `ArrowListArray` (#1486)
Enhancements
- Serialize results in worker before storing into shared storages (#1470)
- Raise timeout when assigning failed for a long time (#1475)
- Use f-string to replace most of string formattings (#1484)
Bug fixes
- Fix reference cycle in `promise.all_` (#1452)
- Support `ArrowStringDtype` for `DataFrame.sort_values()` (#1455)
- Support serializing complex scalars (#1459)
- Support `ExtensionDtype` in `df.astype` (#1462)
- Fix incorrect `index_value` in `df.drop()` (#1466)
- Fix pickling arrow types & allow specifying parallel number in IO runners (#1480)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.4.5
This is the release notes of v0.4.5. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Add support for arrow-based string dtype (#1440)
- Add support for memory usage (#1447)
Bug fixes
- Fix failure when serializing `LearnShuffle` operand. (#1449)
- Fix reference cycle in `promise.all_` (#1456)
- Fix kmeans hang for local cluster (#1446)
- Support `ArrowStringDtype` for `DataFrame.sort_values()` (#1457)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.5.0b3
This is the release notes of v0.5.0b3. See here for the complete list of solved issues and merged PRs.
Announcements
- From v0.5.0b3 on, the v0.5.x series no longer supports Python 3.5. Python 3.5 users should use the 0.4.x series.
New Features
- DataFrame:
- Add support for arrow-based string dtype (#1438)
- Support `memory_usage` on DataFrame objects (#1217)
Bug fixes
- Fix crash when storing data inside Docker containers (#1429)
- Fix kmeans hang for local cluster (#1445)
- Fix failure when serializing `LearnShuffle` operand. (#1442)
Installation
- Drop support for python 3.5 (#1435)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.4.4
This is the release notes of v0.4.4. See here for the complete list of solved issues and merged PRs.
New Features
- Learn
- Add `mars.learn.cluster.KMeans` support (#1428)
Enhancements
- Optimize `to_pandas`, `to_numpy`, etc. to fetch first and, if that fails, call `execute().fetch()` instead (#1410)
- Create backup `CalcActor` when spawning a new graph in `mars.remote` (#1412)
- Skip rechunk when DataFrame has unknown shape in `sort_values` (#1420)
Bug fixes
- Fix worker assign when no evaluation sets specified in LGBM training (#1408)
- Fix query alias & add estimation for object types (#1417)
- Fix the dtype of LightGBM model's predicted results (#1421)
- Fix the error raised when inferring dtype in DataFrame.transform (#1427)
- Fix crash when storing data inside Docker containers (#1432)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.5.0b2
This is the release notes of v0.5.0b2. See here for the complete list of solved issues and merged PRs.
New Features
- Learn
- Add `mars.learn.cluster.KMeans` support (#1426)
Enhancements
- Optimize `to_pandas`, `to_numpy`, etc. to fetch first and, if that fails, call `execute().fetch()` instead (#1409)
- Create backup `CalcActor` when spawning a new graph in `mars.remote` (#1411)
- Skip rechunk when DataFrame has unknown shape in `sort_values`. (#1414)
Bug fixes
- Fix worker assign when no evaluation sets specified in LGBM training (#1405)
- Fix query alias & add estimation for object types (#1416)
- Fix the dtype of LightGBM model's predicted results. (#1419)
- Fix the error raised when inferring dtype in `DataFrame.transform` (#1424)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.4.3
This is the release notes of v0.4.3. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
  - Implements `mars.tensor.stats.entropy` (#1378)
- DataFrame
  - Implements `{DataFrame,Series,Index}.rename` (#1366)
  - Implement `DataFrame.insert` (#1392)
- Learn
  - Implements `mars.learn.model_selection.train_test_split` (#1355)
- Remote
  - Add `run_script` support (#1397)
Enhancements
- Optimize `DataFrame.{head, tail}` when DataFrame has unknown chunk shape (#1360)
- Make creation of Kubernetes clusters modular (#1373)
- Optimize `read_sql` + `head` (#1379)
- Optimize `read_csv` if followed by `DataFrame.getitem` (#1398)
Bug fixes
- Remove reliance on `WHERE 1=0` in `read_sql` (#1353)
- Fix hang for distributed `roc_curve` (#1367, #1387)
- Fix `read_sql` when no data selected & refine error when no worker attached (#1374)
- Fix progress display for bokeh 2.1.x (#1383)
- Fix serialization failure when FetchDataFrame's object_type is a list (#1386)
- Make local filesystem work when PyArrow not installed (#1391)
- Fix serialization issue when remote function has executed tileable arguments (#1400)
- Fix LightGBM when input tileables have unknown shape (#1399)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.5.0b1
This is the release notes of v0.5.0b1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
  - Implements `mars.tensor.stats.entropy` (#1376)
- DataFrame
  - Implements `DataFrame.rename` (#1359)
  - Implements `{Series,Index}.rename` (#1361)
  - Implement `DataFrame.insert` (#1389)
- Remote:
  - Add `run_script` support (#1299)
Enhancements
- Set output type when calling new_xxx methods on DataFrames (#1212)
- Optimize `DataFrame.{head, tail}` when DataFrame has unknown chunk shape (#1328)
- Make creation of Kubernetes clusters modular (#1369)
- Optimize `read_sql` + `head` (#1377)
- Use subgraph to represent fused nodes instead of a list (#1388)
- Optimize `read_csv` if followed by `DataFrame.getitem` (#1390)
Bug fixes
- Fix hang for distributed `roc_curve` (#1362, #1380)
- Fix `read_sql` when no data selected & refine error when no worker attached (#1371)
- Fix progress display for bokeh 2.1.x (#1382)
- Fix serialization issue when remote function has argument which is an executed tileable (#1394)
- Fix LightGBM when input tileables have unknown shape (#1396)
Installation
- Specifying encoding for long_description (#1402)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.5.0a3
This is the release notes of v0.5.0a3. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Add support for `{DataFrame,Series,Index}.drop` (#1263)
  - Add `{DataFrame,Series}.to_sql()` and `Series.to_csv()` (#1264)
  - Implements `{DataFrame,Series,Index}.drop_duplicates` (#1285)
  - Implements `DataFrame.melt` (#1284)
  - Implements `md.read_sql_query` (#1297)
  - Implements `{Series,Index}.to_frame()` and `Index.to_series()` (#1317)
  - Support setting columns for DataFrame (#1326)
- Learn
  - Add MarsDistributor for `tsfresh` library (#1277)
  - Implements `mars.learn.model_selection.train_test_split` (#1352)
- Remote
  - Support tileables as arguments for spawned functions (#1296)
Enhancements
- Allow client-side to use pickle to serialize / deserialize tensor data (#1289)
- Support create session from environment variables (#1265)
Bug fixes
- Fix `NearestNeighbors` that failed to run in cluster mode (#1262)
- Fix graph hang on tile failure and execution failure (#1272)
- Fix failure when executing None-result spawn functions (#1276)
- Fix shape calculation in TensorIndex for `tensor.__setitem__` (#1283)
- Support fuse for Mars Remote (#1287)
- Fix `mt.linalg.norm` when chunk shape on axis > 1 (#1302)
- Fix error in `calc_data_size()` for `GroupByWrapper` (#1307)
- Trigger execution in `check_consistent_length` when arrays have unknown shape (#1321)
- Fix wrong columns value in `reset_index` (#1320)
- Fix `build_df` when input DataFrame has duplicate columns (#1319)
- Remove reliance on `WHERE 1=0` in `read_sql` (#1335)
- Make local filesystem work when PyArrow not installed (#1356)
Documentation
- Add docs for remote API, getting started as well as GPU integration (#1266)
- Use pydata-sphinx-theme for documentation (#1304)
Others
- Use latest pandas wheel for Python 3.8 (#1333)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.4.2
This is the release notes of v0.4.2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Add support for `{DataFrame,Series,Index}.drop` (#1268)
  - Add `{DataFrame,Series}.to_sql()` and `Series.to_csv()` (#1267)
  - Implements `{DataFrame,Series,Index}.drop_duplicates` (#1292)
  - Implement `DataFrame.melt` (#1295)
  - Implements `md.read_sql_query` (#1300)
  - Implements `{Series,Index}.to_frame()` and `Index.to_series()` (#1323)
  - Support setting columns for DataFrame (#1327)
- Learn
  - Add MarsDistributor for `tsfresh` library (#1281)
- Remote
  - Support tileables as arguments for spawned functions (#1298)
Enhancements
- Allow client-side to use pickle to serialize / deserialize tensor data (#1291)
- Support create session from environment variables (#1322)
Bug fixes
- Fix `NearestNeighbors` that failed to run in cluster mode (#1273)
- Fix graph hang on tile failure and execution failure (#1275)
- Fix failure for None-result spawn functions (#1280)
- Fix shape calculation in TensorIndex for `tensor.__setitem__` (#1293)
- Support fuse for Mars Remote (#1294)
- Fix `mt.linalg.norm` when chunk shape on axis > 1 (#1303)
- Trigger execution in `check_consistent_length` when arrays have unknown shape (#1325)
- Fix `build_df` when input DataFrame has duplicate columns (#1324)
- Fix error in `calc_data_size()` for GroupByWrapper (#1329)
- Fix wrong columns value in `reset_index` (#1330)
Documentation
- Add docs for remote API, getting started as well as GPU integration (#1274)
Others
- Use latest pandas wheel for Python 3.8 (#1332)
Published by qinxuye almost 6 years ago
https://github.com/mars-project/mars - v0.4.1
This is the release notes of v0.4.1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Add `size` function for dataframes and groupbys (#1253)
  - Implements `DataFrame.{iterrows, itertuples}` (#1258)
- Learn
  - Add support for LightGBM in Mars (#1254)
- Remote
  - Support running tileables inside functions spawned via `mr.spawn` (#1257)
Bug fixes
- Fix `.fetch()` that may cause some ops to be executed again (#1255)
- Fix `df.describe()` that failed when df has unknown shape and chunk size > 1 (#1256)
Tests
- Add checks for data consistency in learn module (#1259)
Published by qinxuye about 6 years ago
https://github.com/mars-project/mars - v0.5.0a2
This is the release notes of v0.5.0a2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Add `size` function for dataframes and groupbys (#1250)
  - Implements `DataFrame.{iterrows, itertuples}` (#1252)
- Learn
  - Add support for LightGBM in Mars (#1244)
- Remote
  - Support running tileables inside functions spawned via `mr.spawn` (#1248)
Bug fixes
- Fix `.fetch()` that may cause some ops to be executed again (#1243)
- Fix `df.describe()` that failed when df has unknown shape and chunk size > 1 (#1249)
Tests
- Add checks for data consistency in learn module (#1246)
Published by qinxuye about 6 years ago
https://github.com/mars-project/mars - v0.4.0
This is the release notes of v0.4.0. See here for the complete list of solved issues and merged PRs.
This release note only covers the difference from v0.4.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:
Changes that break compatibility
- Calling `.execute()` no longer returns a numpy ndarray, pandas DataFrame and so forth; it returns the Mars tensor or DataFrame itself. Only corner data is fetched for display purposes. To convert explicitly, call `.to_numpy()` to get a numpy ndarray or `.to_pandas()` to get a pandas DataFrame. For more details, please refer to #1201.
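The behavior described above can be pictured with a small stand-in class. This is only a sketch: `LazyArray` and its methods are hypothetical pure-Python illustrations, not Mars classes. It mirrors the described pattern, where executing returns the object itself, the repr shows only corner data, and getting a concrete local object is a separate explicit step.

```python
class LazyArray:
    """Hypothetical stand-in for a lazily evaluated array (not a Mars class)."""

    def __init__(self, func):
        self._func = func   # deferred computation
        self._data = None   # filled in by execute()

    def execute(self):
        # Run the computation once, keep the result internally, return self.
        if self._data is None:
            self._data = list(self._func())
        return self

    def fetch(self):
        # Explicit conversion to a concrete local object (here, a list).
        return list(self.execute()._data)

    def __repr__(self):
        if self._data is None:
            return "LazyArray(<not executed>)"
        # Show only the "corner" elements, as a display would.
        if len(self._data) <= 4:
            shown = ", ".join(map(str, self._data))
        else:
            head = ", ".join(map(str, self._data[:2]))
            tail = ", ".join(map(str, self._data[-2:]))
            shown = f"{head}, ..., {tail}"
        return f"LazyArray([{shown}])"


arr = LazyArray(lambda: range(10))
result = arr.execute()   # same object, now executed
values = arr.fetch()     # explicit conversion step
```

In Mars itself the explicit conversion step corresponds to the `.to_numpy()` and `.to_pandas()` calls named in the note above.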
Highlights
- Remote API is introduced and preliminarily supported in #1239; for more details, refer to proposal #1227.
New Features
- Tensor
  - Implements `mt.trapz` (#1223)
- DataFrame
  - Add support of `{DataFrame,Series}.ewm` (#1198)
  - Add `dataframe.unique` support (#1225)
  - Implements `md.to_datetime`, support `__setitem__` for DataFrame as well (#1226)
  - Add support for `Series.astype` and `DataFrame.astype` (#1237)
- Learn
  - Support Mars Series in PyTorch Dataset (#1194)
  - Implements `mars.learn.metrics.{roc_curve, auc}` (#1233)
- Others
  - Add preliminary remote function support (#1239)
Enhancements
- `Tileable.execute()` now returns the Tileable itself, and repr acts correctly (#1202)
- Rename `LocalClusterSession` to `ClusterSession` (#1236)
Bug fixes
- Fix serialization for `mars.learn.utils.shuffle` (#1193)
- Fix error in starting local cluster with IPython & latest gevent version (#1234)
- Fix wrong result of column pruning (#1235)
Published by qinxuye about 6 years ago
https://github.com/mars-project/mars - v0.5.0a1
This is the release notes of v0.5.0a1. See here for the complete list of solved issues and merged PRs.
Changes that break compatibility
- Calling `.execute()` no longer returns a numpy ndarray, pandas DataFrame and so forth; it returns the Mars tensor or DataFrame itself. Only corner data is fetched for display purposes. To convert explicitly, call `.to_numpy()` to get a numpy ndarray or `.to_pandas()` to get a pandas DataFrame. For more details, please refer to #1201.
Highlights
- Remote API is introduced and preliminarily supported in #1238; for more details, refer to proposal #1227.
- Running on Yarn is preliminarily supported in #1210.
New Features
- Tensor
  - Implements `mt.trapz` (#1205)
- DataFrame
  - Add support of `{DataFrame,Series}.ewm` (#1164)
  - Add `dataframe.unique` support (#1208)
  - Implements `md.to_datetime`, support `__setitem__` for DataFrame as well (#1207)
  - Add support for `Series.astype` and `DataFrame.astype` (#1224)
- Learn
  - Support Mars Series in PyTorch Dataset (#1190)
  - Implements `mars.learn.metrics.{roc_curve, auc}` (#1220)
- Others
  - Add preliminary support for Yarn (#1210)
  - Add preliminary remote function support (#1238)
Enhancements
- Make `Tileable.execute()` return tileable itself, fetching corner data only for correct `repr` (#1201)
- Allow some operands to fail fast (#1229)
- Rename `LocalClusterSession` to `ClusterSession` (#1230)
Bug fixes
- Fix serialization for `mars.learn.utils.shuffle` (#1192)
- Fix wrong result of column pruning (#1215)
- Fix error in starting local cluster with IPython (#1232)
Documentation
- Add learn docs (#1182)
- Add translation for learn docs (#1183)
- Add documentation for DataFrame arithmetic operands (#1191)
- Add logo in readme and docs (#1213)
Tests
- Workaround for upgraded tiledb (#1195)
Published by qinxuye about 6 years ago
https://github.com/mars-project/mars - v0.4.0rc1
This is the release notes of v0.4.0rc1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Add support for `isna`, `notna` and `__dir__` (#1125)
  - Add support for `md.dropna` (#1129)
  - Support `groupby.__getitem__` and group by level (#1136)
  - Implement DataFrame `nunique` (#1137)
  - Implements `md.cut` (#1139)
  - Add `plot` and related functions for DataFrame and Series (#1143)
  - Implements `{DataFrame, Series}.{shift, tshift}` (#1157)
  - Add support of `md.expanding` (#1160)
  - Implements `{DataFrame,Series}.diff` (#1174)
  - Support modulo operand for DataFrame (#1176)
  - Add `Series.value_counts()` support (#1181)
- Tensor
  - Add support for `mt.union1d` (#1147)
  - Support `Tensor.__setitem__` with bool indexing (#1159)
- Learn
  - Add support for `NearestNeighbors.kneighbors_graph` (#1152)
  - Add support for `mars.learn.metrics.accuracy_score` (#1150)
  - Implements `mars.learn.metrics.pairwise.rbf_kernel` (#1158)
  - Implements `mars.learn.semi_supervised.LabelPropagation` (#1163)
Enhancements
- Refactor GroupBy objects (#1127)
Bug fixes
- Support `md.merge` when `on` column is in df.index (#1132)
- Fix tokenizing partial function (#1149)
- Allow retrieving shape of a groupby object (#1155)
Documentation
- Add DataFrame docs (#1130)
- Fix requirements for doc (#1135)
- Fix rendering numpy-style documentations (#1179)
- Fix some mistakes in the doc. (#1161, thanks @ueshin!)
Tests
- Check if `tileable.nsplits` and `chunk.shape` are consistent (#1108)
- Add meta checks for groupby (#1144)
- Allow using pyarrow==0.17.0 (#1172)
Published by qinxuye about 6 years ago
https://github.com/mars-project/mars - v0.3.4
This is the release notes of v0.3.4. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Add support for `isna`, `notna` and `__dir__` (#1126)
  - Add support for `{DataFrame,Series}.agg` (#1128)
  - Add support for `md.dropna` (#1131)
  - Implements `{DataFrame, Series}.{shift, tshift}` (#1168)
  - Add `plot` and related functions for DataFrame and Series (#1166)
  - Implement DataFrame `nunique` (#1170)
  - Implements `{DataFrame,Series}.diff` (#1177)
  - Support modulo operand for DataFrame (#1180)
- Tensor
  - Add support for `mt.union1d` (#1167)
  - Support `Tensor.__setitem__` with bool indexing (#1169)
Bug fixes
- Support `md.merge` when `on` column is in df.index (#1165)
Tests
- Check if `tileable.nsplits` and `chunk.shape` are consistent (#1133)
- Allow pyarrow to use 0.17.0 (#1173)
Published by qinxuye about 6 years ago
https://github.com/mars-project/mars - v0.3.3
New Features
- Implements `at` and `iat` for DataFrame (#1105)
- Implements Series.isin (for Series type). (#1106)
Enhancements
- Optimize performance of executor when running ops less than number of parallelism (#1099)
Bug fixes
- Fix validate_axis when input tileable has unknown shape (#1092)
- Support creating DataFrame from dict in which scalar exists (#1104)
- Support slice that can be integer or other types on non-int64 index (#1109)
Tests
- Check metadata consistency for output chunks and tileables (#1094)
Published by qinxuye about 6 years ago
https://github.com/mars-project/mars - v0.4.0b2
This is the release notes of v0.4.0b2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Support calling `df.agg()` with lists or dicts for transform (#1093)
  - Implements `at` and `iat` for DataFrame (#1101)
  - Implements Series.isin (for Series type). (#1058)
Enhancements
- Optimize performance of executor when running ops less than number of parallelism (#1096)
Bug fixes
- Fix `validate_axis` when input tileable has unknown shape (#1091)
- Support creating DataFrame from dict in which scalar exists (#1098)
- Support slice that can be integer or other types on non-int64 index (#1103)
Tests
- Check metadata consistency for output chunks and tileables (#1071)
Published by qinxuye about 6 years ago
https://github.com/mars-project/mars - v0.3.2
This is the release notes of v0.3.2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Implement `md.{cummax, cummin, cumprod, cumsum}` (#1022)
  - Add support for `md.fillna` (#1031)
  - Add `DataFrame.loc` support (#1060)
  - Add `DataFrame.rolling` support (#1061)
  - Add support for `GroupBy.{cumcount, cummin, cummax, cumprod, cumsum}` (#1072)
  - Support string and datetime methods via `Series.str` and `Series.dt` accessor (#1074)
  - Implement dataframe `append` (#1075)
  - Implement `DataFrame.concat` and `Series.concat` (#1078)
  - Add support for DataFrame.sort_values (#1081)
  - Support `sort_index` for DataFrame and Series (#1082)
  - Add `md.date_range` support (#1086)
  - Logical operators on DataFrame and Series. (#1088)
  - Implements `head`/`tail` based on `iloc`, and fixes bug in `getitem`. (#1089)
Enhancements
- Use `mapjoin` to optimize df.merge (#1023)
- Refactor tiling of `DataFrame.iloc` with `index_lib` (#1043)
- Add `sort_range_index` parameter in read_csv (#1067)
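For readers unfamiliar with the term, a map join (in the general sense; how Mars implements it in #1023 is not described in these notes) avoids shuffling both inputs by sending the smaller table whole to every chunk of the larger one. The sketch below is an illustration under that assumption, using plain dictionaries; `map_join` and the sample rows are hypothetical names, not Mars APIs.

```python
def map_join(big_chunks, small_table, key):
    """Inner-join each chunk of a large table against a small table.

    The small table is indexed once and reused for every chunk, so no
    chunk ever needs rows from another chunk (no shuffle step).
    """
    # Build a lookup from join key to matching rows of the small table.
    index = {}
    for row in small_table:
        index.setdefault(row[key], []).append(row)

    joined = []
    for chunk in big_chunks:
        out = []
        for row in chunk:
            for match in index.get(row[key], []):
                # Merge the two rows; the join key is shared.
                out.append({**match, **row})
        joined.append(out)
    return joined


big_chunks = [
    [{"id": 1, "x": 10}, {"id": 2, "x": 20}],
    [{"id": 1, "x": 30}, {"id": 3, "x": 40}],
]
small_table = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

result = map_join(big_chunks, small_table, key="id")
```

Each output chunk depends only on its own input chunk plus the small table, which is what makes the approach attractive when one side of the merge is small.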
Bug fixes
- Standardize RangeIndex for unknown shape DataFrame (#1066)
- Fix failed cases in distributed mode (#1079)
- Fix wrong dtypes in df.rechunk (#1083)
- Fix consistency between tensor metadata and real outputs (#1087)
Tests
- Fix tests under Python 3.6 as VS2015 is preinstalled (#1015)
Published by qinxuye about 6 years ago
https://github.com/mars-project/mars - v0.4.0b1
This is the release notes of v0.4.0b1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Implement `md.{cummax, cummin, cumprod, cumsum}` (#1019)
  - Implement dataframe `append` (#1026)
  - Add support for `md.fillna` (#1029)
  - Implement `DataFrame.concat` and `Series.concat` (#1040)
  - Support `groupby.agg` with list of functions (#1030)
  - Implement `md.{DataFrame,Series,GroupBy}.apply` (#1038)
  - Add support for `DataFrame.sort_values` (#1046)
  - Add `DataFrame.loc` support (#1042)
  - Add `DataFrame.rolling` support (#1045)
  - Add support for `{DataFrame,Series}.agg` (#1054)
  - Support string and datetime methods via `Series.str` and `Series.dt` accessor (#1063)
  - Add support for `GroupBy.{cumcount, cummin, cummax, cumprod, cumsum}` (#1069)
  - Support `sort_index` for DataFrame and Series (#1053)
  - Add `md.date_range` support (#1073)
  - Logical operators on DataFrame and Series. (#1056)
  - Implements `head`/`tail` based on `iloc`, and fixes bug in `getitem`. (#1057)
- Others
  - Add support for function serialization (#1048)
Enhancements
- Use `mapjoin` to optimize `df.merge` (#1021)
- Add `sort_range_index` parameter in `read_csv` (#1024)
- Refactor tiling of `DataFrame.iloc` with `index_lib` (#1016)
Bug fixes
- Fix KNN so that it can accept input with unknown shape (#1033)
- Support serializing `pd.Timestamp` and `pd.Timedelta` (#1065)
- Fix failed cases in distributed mode (#1062)
- Fix wrong dtypes in `df.rechunk` (#1080)
- Fix failed fit method selection for KNN when input has unknown shape (#1050)
- Fix consistency between tensor metadata and real outputs (#1085)
Tests
- Fix tests under Python 3.6 as VS2015 is preinstalled (#1014)
Published by qinxuye about 6 years ago
https://github.com/mars-project/mars - v0.3.1
This is the release notes of v0.3.1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
  - Implements `mt.{topk, argsort, argpartition, argtopk}` (#991)
  - Implement `imread` to read from images (#997)
- DataFrame
  - Support ufunc for Mars DataFrame (#967)
  - Implements `DataFrame.to_csv` (#992)
  - Implements DataFrame `dot`, `mul` and `pow` (#994)
  - Implement dataframe `var` and `std` (#996)
  - Implements `describe` for DataFrame (#998)
Enhancements
- Refactor tensor indexing (#1012)
Bug fixes
- Stop detecting GPU when no cuda devices are configured (#975)
- Fix wrong behavior of `choice` (#993)
- Make sure all kwargs are numpy types when inferring dtypes (#995)
- Fix wrong result of `count_nonzero` (#1003)
- Add `dtype` property for `TensorImread` (#1005)
- Fix error when no device detected by CUDA driver (#1008)
Tests
- Fix failures in Windows tests (#939)
- Fix failed unittests due to release of pandas 1.0 (#965)
Published by qinxuye over 6 years ago
https://github.com/mars-project/mars - v0.4.0a2
This is the release notes of v0.4.0a2. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor:
  - Add ability to read and write HDF5 file for tensor (#962)
  - Implements `mt.{topk, argsort, argpartition, argtopk}` (#946)
  - Support reading and writing in zarr format (#963)
  - Implement `imread` to read from images (#988)
- DataFrame
  - Support ufunc for Mars DataFrame (#957)
  - Implements `DataFrame.to_csv` (#966)
  - Implement dataframe `var` and `std` (#977)
  - Implements `Series.map` (#979)
  - Implements DataFrame `dot`, `mul` and `pow` (#980)
  - Implements `describe` for DataFrame (#981)
  - Implements `md.read_sql_table` (#986)
- Learn
  - Implement PyTorch sampler to improve dataset performance (#970)
  - Support `mars.learn.neighbors.NearestNeighbors` (#961)
  - Leverage faiss to accelerate k-nearest neighbors calculation (#984)
  - Implement pytorch sampler for local training (#1010)
Enhancements
- Refactor tensor indexing (#1011)
Bug fixes
- Fix tile in `nonzero` that tensor instead of tensor data should be used during the process (#954)
- Fixes `cdist(x, y)` that creates tensor with wrong nsplits (#960)
- Fix the wrong `RangeIndex` in `read_csv` (#930)
- Stop detecting GPU when no cuda devices are configured (#973)
- Fix wrong behavior of `mt.random.choice` (#976)
- Make sure all kwargs are numpy types when inferring dtypes (#987)
- Fix error when `chunk_size` not provided for `md.read_sql_table` (#990)
- Fix wrong result of `count_nonzero` (#1002)
- Add `dtype` property for `TensorImread` (#1004)
- Fix error when no device detected by CUDA driver (#1007)
Tests
- Fix failed unittests due to release of pandas 1.0 (#964)
- Hotfix opcodes that conflict (#968)
Published by qinxuye over 6 years ago
https://github.com/mars-project/mars - v0.4.0a1
This is the release notes of v0.4.0a1. See here for the complete list of solved issues and merged PRs.
Announcements
Due to the end-of-life (EOL) of Python 2 on January 1, 2020, the v0.4.x series no longer supports Python 2 from v0.4.0a1 on. Python 2.7 users should use the 0.3.x series.
Changes that break compatibility
- Operands now support stages (#934). Reduction operands, as well as operands whose tiled chunks contain map or reduce phases, cannot be serialized between this and former versions.
New Features
- Tensor
  - Implements `mt.histogram` and `mt.histogram_bin_edges` (#876)
  - Add `mt.partition` support (#889)
  - Implements `mt.{percentile, quantile, median}` (#898)
  - Support Einstein summation convention (#888)
  - Add `mt.fill_diagonal` support (#918)
  - Support `mars.tensor.spatial.distance.{pdist, cdist, squareform}` (#894)
- DataFrame
  - Support creating DataFrame from dict whose values are tensors (#903)
  - Support DataFrame and Series count (#900)
  - Implement `mean` operator for DataFrame and Series (#907)
  - Implements `DataFrame.quantile` and `Series.quantile` (#911)
  - Add comparison functions for DataFrame (#921)
  - Support `df.reset_index` and `series.reset_index` (#915)
- Learn
  - Add pairwise distances support for learn (#926)
  - Implement `MarsDataset` to integrate with PyTorch (#937)
- Others
  - Add function objects implementation for tokenizer (#893)
Enhancements
- Use default args for super() (#878)
- Skip preparing specified chunks when preparing for execution (#891)
- Accelerate `LU` when input has one chunk (#905)
- Add support for AnyReference in serialization (#874)
- Merge operands representing multiple stages of one single operand (#934)
Tests
- Add `TestExecutor` that serializes and deserializes the graph on every execution to ensure all operands work well with serialization (#880)
- Fix possible failure of `testIterativeTilingWithoutEtcd` for Python 3.5 in CI (#896)
- Switch coverage service to codecov (#909)
- Remove *_pb2.py to reduce chances of code conflict (#913)
- Fix failures in Windows tests (#938)
Others
- Drop support for Python 2 (#872)
- Further remove py27-related imports (#875)
Published by qinxuye over 6 years ago
https://github.com/mars-project/mars - v0.3.0
This is the release notes of v0.3.0. See here for the complete list of solved issues and merged PRs.
This release note only covers the difference from v0.3.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:
Announcements
From v0.3.0 on, v0.3.x will be the last series that supports Python 2 until the release of v0.4.0.
Changes that break compatibility
- Operands now support stages (#935). Reduction operands, as well as operands whose tiled chunks contain map or reduce phases, cannot be serialized between this and former versions.
New Features
- Tensor
  - Implements `mt.histogram` and `mt.histogram_bin_edges` (#914)
  - Add `mt.partition` support (#916)
  - Implements `mt.{percentile, quantile, median}` (#919)
  - Support Einstein summation convention (#925)
  - Add `mt.fill_diagonal` support (#931)
- DataFrame
  - Support creating DataFrame from dict whose values are tensors (#922)
  - Implements `DataFrame.quantile` and `Series.quantile` (#924)
  - Support DataFrame and Series count (#923)
  - Implement `mean` operator for DataFrame and Series (#927)
  - Add comparison operands for DataFrame (#929)
  - Support `df.reset_index` and `series.reset_index` (#933)
Enhancements
- Add public base class for entity data (#879)
- Merge operands representing multiple stages of one single operand (#935)
Bug fixes
- Fix sparse behavior for `tensor.min` and `tensor.max` (#936)
Tests
- Add `TestExecutor` that serializes and deserializes the graph on every execution to ensure all operands work well with serialization (#881)
- Fix possible failure of `testIterativeTilingWithoutEtcd` for Python 3 in CI (#906)
- Remove `*_pb2.py` to reduce chances of code conflict (#920)
Published by qinxuye over 6 years ago
https://github.com/mars-project/mars - v0.3.0rc1
This is the release notes of 0.3.0rc1. See here for the complete list of solved issues and merged PRs.
Highlights
- Mars now can handle more cases that failed due to tensors with unknown chunk shapes via iterative tiling support introduced in #834.
- Python 3.8 wheels are supported in this release.
New Features
- Support iterative tiling (#834)
- Add experimental column pruning rules for tileable graph optimization (#865)
- Tensor
  - Add `mt.sort` support (#827)
- DataFrame
  - Support DataFrame rechunk (#839)
  - Support Series's setitem and getitem by iloc operation (#843)
  - Add tree reduction method for DataFrame groupby aggregations (#850)
- Learn
  - Add `mars.learn.datasets.samples_generator.make_blobs` and update README (#845)
  - Support running PyTorch in Mars cluster via `run_pytorch_script` (#861)
Enhancements
- Add ReceiverStatusActor to help listening at receiver end (#833)
- Assign enqueued operands immediately when no descendants are ready (#854)
- Support transferring multiple chunks at one time (#841)
Bug fixes
- Fix incorrect behavior of dataframe arithmetic (#838)
- Mark resource as processing once allocated (#848)
- Fix `read_csv` execution on GPU (#859)
- Kill process tree when terminating a worker process (#864)
Tests
- Add separate environment to test HDFS (#829)
- Add CI/CD for Python 3.8 (#857)
- Fix distribute error under Py38 (#871)
Published by qinxuye over 6 years ago
https://github.com/mars-project/mars - v0.2.4
This is the release notes of v0.2.4. See here for the complete list of solved issues and merged PRs.
New Features
- Add `mt.sort` support (#862)
- Support DataFrame rechunk (#866)
- Support Series's setitem and getitem by iloc operation (#868)
- Add tree reduction method for DataFrame groupby aggregations (#869)
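As background for the last item, a tree reduction aggregates each chunk on its own and then combines the partial results in rounds, so no single step has to touch every chunk at once. The sketch below shows the general technique for a grouped sum in plain Python; it is an illustration under that assumption, not the Mars implementation, and `partial_sum`, `combine` and `tree_reduce` are hypothetical names.

```python
def partial_sum(rows):
    """Map stage: aggregate one chunk of (key, value) rows into {key: sum}."""
    out = {}
    for key, value in rows:
        out[key] = out.get(key, 0) + value
    return out


def combine(a, b):
    """Merge two partial aggregates into a new one."""
    merged = dict(a)
    for key, value in b.items():
        merged[key] = merged.get(key, 0) + value
    return merged


def tree_reduce(partials, fan_in=2):
    """Combine partial results level by level until one remains.

    fan_in is the number of partials merged per node and must be >= 2.
    """
    level = list(partials)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), fan_in):
            group = level[i:i + fan_in]
            acc = group[0]
            for part in group[1:]:
                acc = combine(acc, part)
            next_level.append(acc)
        level = next_level
    return level[0] if level else {}


chunks = [
    [("a", 1), ("b", 2)],
    [("a", 3)],
    [("b", 4), ("c", 5)],
]
totals = tree_reduce([partial_sum(chunk) for chunk in chunks])
```

This only works directly for aggregations whose partial results can be merged, such as sum, count, min and max; a mean, for example, would need to carry a sum and a count through the tree.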
Enhancements
- Backport CUDA-related changes in utils (#846)
- Resolve compatibility issue for Python 3.8 (#858)
Bug fixes
- Fix incorrect behavior of dataframe arithmetic (#840)
- Mark resource as processing once allocated (#851)
- Kill process tree when terminating a worker process (#867)
- Fix `read_csv` execution on GPU (#870)
Tests
- Add separate environment to test HDFS (#835)
Published by qinxuye over 6 years ago
https://github.com/mars-project/mars - v0.2.3
This is the release notes of v0.2.3. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor:
  - Add `mt.unique` support for tensor (#798)
- DataFrame
  - Support DataFrame subtract operator (#800)
  - Support conversion between series and tensor (#806)
  - Refactor of DataFrame reduction and support more reduction operands (#816)
  - Support DataFrame `read_csv` (#826)
Enhancements
- Simplify tiles logic to improve its performance (#801)
- Return execution exception info properly to session client. (#821)
- Support `axis` argument for `permutation` and `shuffle` (#822)
- Support `__iadd__` etc by wrapping `add` with `out` argument (#824)
Bug fixes
- Correct type checking for DataFrame arithmetic (#819)
- Fix stuck issue of GeventThreadPoolExecutor (#823)
Tests
- Switch CI service to Github Actions (#794)
- Move tests in Appveyor into Github Actions (#797)
- Fix etcd cases under macOS Catalina (#811)
Published by qinxuye over 6 years ago
https://github.com/mars-project/mars - v0.3.0b2
This is the release notes of v0.3.0b2. See here for the complete list of solved issues and merged PRs.
Highlights
- Interoperability with XGBoost and TensorFlow is introduced:
  - `mars.learn.contrib.xgboost.XGBClassifier` and `mars.learn.contrib.xgboost.XGBRegressor` can be used for distributed classification and regression.
  - `mars.learn.contrib.tensorflow.run_tensorflow_script` supports running distributed TensorFlow 2.0 training in a Mars cluster.
New Features
- Tensor
  - Add `mt.unique` support for tensor (#783)
- DataFrame
  - Support DataFrame subtract operator (#787)
  - Support conversion between series and tensor (#791)
  - Refactor of DataFrame reduction and support more reduction operands (#789)
  - Support DataFrame `read_csv` (#807)
- Learn
  - Add XGBoost support (#769)
  - Add ObjectData and ObjectChunk to represent data beyond ndarray, dataframe etc (#805)
  - Add `mars.learn.utils.shuffle` to support shuffling multiple tileable objects in a consistent way (#808)
  - Support running distributed TensorFlow 2.0 via `run_tensorflow_script` (#820)
Enhancements
- Return execution exception info properly to session client (#770)
- Simplify tiles logic to improve its performance (#792)
- Support `axis` argument for `permutation` and `shuffle` (#803)
- Support `__iadd__` etc by wrapping `add` with `out` argument (#813)
- Handle worker storage in batches (#818)
Bug fixes
- Correct type checking for DataFrame arithmetic (#815)
Tests
- Switch CI service to Github Actions (#793)
- Move tests in Appveyor into Github Actions (#795)
Others
- Bump copyright year to 2020 (#809)
Published by qinxuye over 6 years ago
https://github.com/mars-project/mars - v0.2.2
This is the release notes of v0.2.2. See here for the complete list of solved issues and merged PRs.
New Features
- Add multiple GPU support for local execution (#781)
- Implements `numpy.random.shuffle` and `numpy.random.permutation` for tensor (#780)
- Support DataFrame `groupby.agg` (#782)
Enhancements
- Overhaul dataframe/series index alignment (#778)
Bug fixes
- Fix execution of arithmetic on GPU (#777)
Published by qinxuye over 6 years ago
https://github.com/mars-project/mars - v0.3.0b1
This is the release notes of v0.3.0b1. See here for the complete list of solved issues and merged PRs.
New Features
- Implements `numpy.random.shuffle` and `numpy.random.permutation` for tensor (#762)
- Add preliminary support for distributed execution with CUDA (#776)
- Add multiple GPU support for local execution (#779)
- Support DataFrame `groupby.agg` (#767)
Enhancements
- Overhaul dataframe/series index alignment. (#737)
- Add support for controlling data copy across processes (#766)
Bug fixes
- Fix relocation of plasma error objects (#771)
- Fix execution of arithmetic on GPU (#775)
Published by qinxuye over 6 years ago
https://github.com/mars-project/mars - v0.2.1
This is the release notes of v0.2.1. See here for the complete list of solved issues and merged PRs.
New Features
- Add `to_gpu` and `to_cpu` support for both tensor and DataFrame (#706)
- Access column using `__getattr__` syntax for DataFrame (#746)
Enhancements
- Wait for graph to finish instead of querying with fixed intervals (#707)
- Spawn promise to utilize async network libs (#735)
- Submit metas obtained from schedulers (#741)
- Submit initial operands together in one RPC call (#745)
- Fuse some operations in cholesky's tile (#749)
- Simplify data transfer protocol (#744)
Bug fixes
- Separate flags for initials and terminals (#708)
- Remove redundant RPC calls for schedulers (#709)
- Fix incorrect chunk shape in QR (#722)
- Use cpuacct.stat to calculate cpu usage in Docker containers (#743)
- Processing `index` and `columns` separately (and correctly) in `from_tensor` (#747)
- `__setitem__` on a view should be still a view (#748)
Published by wjsi over 6 years ago
https://github.com/mars-project/mars - v0.3.0a2
This is the release notes of v0.3.0a2. See here for the complete list of solved issues and merged PRs.
New Features
- Add `to_gpu` and `to_cpu` support for both tensor and DataFrame (#630)
- Access column using `__getattr__` syntax for DataFrame (#712)
Enhancements
- Move related files to optimizes module (#640)
- Add option for plasma path (#699)
- Wait for graph to finish instead of querying with fixed intervals (#701)
- Submit initial operands together in one RPC call (#711)
- Add lock free option for workers (#716)
- Implements more flexible `tileable.cix[]` (#731)
- Submit metas obtained from schedulers (#727)
- Spawn promise to utilize async network libs (#725)
- Simplify data transfer protocol (#736)
- Fuse some operations in cholesky's tile (#742)
Bug fixes
- Separate flags for initials and terminals for operands (#703)
- Remove redundant RPC calls for schedulers (#705)
- Fix incorrect chunk shape in QR decomposition (#719)
- `__setitem__` on a view should be still a view (#733)
- Processing `index` and `columns` separately (and correctly) in `from_tensor` (#723)
- Add a config to use cpuacct.stat to calculate cpu usage (#740)
- Fix race condition when starting tasks and adding callbacks (#755)
Published by wjsi over 6 years ago