Recent Releases of https://github.com/bsc-wdc/dislib

https://github.com/bsc-wdc/dislib - v0.9.0

New features

  • New RandomSVD algorithm
  • New LanczosSVD algorithm
  • New distributed versions of Random Forest Classifier and Random Forest Regressor
  • New nested versions of Random Forest Classifier and Random Forest Regressor
  • Included a version of TeraSort algorithm

Changed

  • New documentation for SVD algorithm, RF and TeraSort

Fixed

  • Fix bugs & tests

- Python
Published by cTatu over 2 years ago

https://github.com/bsc-wdc/dislib - v0.8.0

New features

  • save and load methods for all models
  • Adding Multiclass CSVM
  • Adding TS-QR (Tall Skinny QR)
  • New in-place operations for ds-arrays: add, iadd, isub
  • Matrix-Subtraction and Matrix-Addition
  • Concatenating two ds-arrays by columns
  • Save ds-array to npy file
  • Load ds-array from several npy files
  • Create ds-arrays from blocks
  • GridSearch for simulations & improvements
  • Inverse transformation in Scalers
  • Train-Test-Split functionality
  • Add KNN Classifier
  • Better SVD columns pairing
  • GPU Support using CUDA/CuPy for algorithms: Kmeans, KNN, SVD, PCA, Matmul, Addition, Subtraction, QR, Kronecker

Changed

  • New documentation for GPU, RandomForest, Scalers

Fixed

  • Fix bug in Scalers & tests

- Python
Published by cTatu over 3 years ago

https://github.com/bsc-wdc/dislib - v0.7.1

What's Changed

0.7.0 + documentation fix

Full Changelog: https://github.com/bsc-wdc/dislib/compare/v0.7.0...v0.7.1

- Python
Published by cTatu over 4 years ago

https://github.com/bsc-wdc/dislib - v0.7.0

New features

  • QR decomposition
  • Random Forest regressor
  • MinMax scaler
  • Matrix multiplication with transposed arguments
  • Several utility functions to pad matrices or to remove the last rows/columns
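
As a rough illustration of what a MinMax scaler computes, the sketch below rescales each column of a small in-memory table to the range [0, 1]. The function names (min_max_fit, min_max_transform) are hypothetical and written in plain Python; this is not dislib's implementation or API, which operates on distributed ds-arrays.

```python
def min_max_fit(rows):
    """Return per-column (min, max) pairs for a list of equal-length rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def min_max_transform(rows, ranges, feature_range=(0.0, 1.0)):
    """Scale each column linearly into feature_range."""
    low, high = feature_range
    scaled = []
    for row in rows:
        new_row = []
        for value, (col_min, col_max) in zip(row, ranges):
            span = col_max - col_min
            # A constant column has zero span; map it to the lower bound.
            unit = 0.0 if span == 0 else (value - col_min) / span
            new_row.append(low + unit * (high - low))
        scaled.append(new_row)
    return scaled


data = [[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]]
ranges = min_max_fit(data)
scaled = min_max_transform(data, ranges)
```

Keeping the fitted ranges separate from the transform step means the same ranges can be reused on new data.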

Improvements

  • improved performance of SVD
  • computing units for each task

- Python
Published by michal-choinski over 4 years ago

https://github.com/bsc-wdc/dislib - v0.6.4

Dependencies

  • PyCOMPSs >= 2.7
  • Scikit-learn >= 0.19.2
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0
  • cvxpy>=1.1.5

Improvements

  • SVD doc example fixed.
  • LR example fixed.
  • Warn when cvxpy dependency missing (for mn4 installation).
  • Added link to Contributing guide in docs

- Python
Published by salvisolamartinell over 5 years ago

https://github.com/bsc-wdc/dislib - v0.6.3

Dependencies

  • PyCOMPSs >= 2.7
  • Scikit-learn >= 0.19.2
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0
  • cvxpy>=1.1.5

Improvements

  • PyPI long_description shortened.

- Python
Published by salvisolamartinell over 5 years ago

https://github.com/bsc-wdc/dislib - v0.6.2

Dependencies

  • PyCOMPSs >= 2.7
  • Scikit-learn >= 0.19.2
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0
  • cvxpy>=1.1.5

Improvements

  • Added extra info for PyPI

- Python
Published by salvisolamartinell over 5 years ago

https://github.com/bsc-wdc/dislib - v0.6.1

Dependencies

  • PyCOMPSs >= 2.7
  • Scikit-learn >= 0.19.2
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0
  • cvxpy>=1.1.5

Improvements

  • Documentation fixes.

- Python
Published by salvisolamartinell over 5 years ago

https://github.com/bsc-wdc/dislib - v0.6.0

Dependencies

  • PyCOMPSs >= 2.7
  • Scikit-learn >= 0.19.2
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0
  • cvxpy>=1.1.5

Upgrade Steps

If using docker, just use the new image.

If you have a local installation, upgrade to COMPSs 2.7 (see COMPSs doc) before upgrading to dislib 0.6.0. Also, install the Python cvxpy module in order to use the regression algorithms: pip install cvxpy.
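
Before running the regression algorithms on a local installation, it can help to confirm that cvxpy is importable. The following is a minimal sketch using only the Python standard library; the helper name module_available is an assumption for illustration and is not part of dislib.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


if not module_available("cvxpy"):
    print("cvxpy is not installed; run: pip install cvxpy")
```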

Breaking Changes

  • ds-array doesn't accept a chunk_size bigger than the array.
  • Moved data loading routines to a different file as array.py was getting too big.
  • apply_along_axis for sparse data now returns sparse ds-arrays.
  • Some PyCOMPSs log messages have changed.
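
The first breaking change above amounts to a shape check when an array is created. A minimal sketch of such a check is shown below, assuming a hypothetical helper validate_chunk_size; the actual check, exception type and message inside dislib may differ.

```python
def validate_chunk_size(array_shape, chunk_size):
    """Raise ValueError if any chunk dimension exceeds the array dimension."""
    if len(array_shape) != len(chunk_size):
        raise ValueError("chunk_size must have one entry per array dimension")
    for axis, (dim, chunk) in enumerate(zip(array_shape, chunk_size)):
        if chunk < 1:
            raise ValueError(f"chunk_size[{axis}] must be positive")
        if chunk > dim:
            raise ValueError(
                f"chunk_size[{axis}]={chunk} exceeds array dimension {dim}"
            )


validate_chunk_size((100, 20), (25, 20))  # accepted: fits in both dimensions
```

Code that previously passed an oversized chunk_size would need to clamp it to the array shape before creating the ds-array.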

New Features

  • User guide and glossary
  • Method to read from npy files
  • Support for one-dimensional data in ds-array
  • Parametrized ds-array tests
  • identity, full and zeros methods that generate ds-arrays filled with a value
  • ds-array operators: subtraction, division, conjugate, transpose, item setting, etc.
  • matmul, kronecker product and rechunk methods for ds-arrays
  • Automatic deletion of ds-arrays when the GC is called.
  • Multivariate linear regression.
  • SVD (Singular Value Decomposition)
  • PCA using SVD
  • ADMM Lasso algorithm
  • Daura clustering algorithm

Bug Fixes

  • Some bugs in the ds-array
  • Internal inconsistencies in transformed_array of PCA

Improvements

  • Improved performance testing scripts and added new tests
  • Allow executing applications with params using dislib exec
  • Extended and improved the tutorial notebook
  • Updated dislib-base docker image
  • Replaced COLLECTION_INOUT parameters with COLLECTION_OUT when possible for improving performance

- Python
Published by salvisolamartinell over 5 years ago

https://github.com/bsc-wdc/dislib - v0.5.0

Dependencies

  • PyCOMPSs == 2.5
  • Scikit-learn >= 0.19.2
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0

New Features

  • Added grid search and randomized search with cross-validation
  • Added K-fold splitter
  • dislib command line can now run jupyter notebooks
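
To illustrate the idea behind a K-fold splitter, the sketch below generates train and validation index lists for each fold in plain Python. The function k_fold_indices is a hypothetical helper for illustration only; dislib's splitter works on ds-arrays and its interface is not reproduced here.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train, validation) index lists for each of n_splits folds."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


folds = list(k_fold_indices(n_samples=10, n_splits=3))
```

Each sample lands in exactly one validation fold, which is the property that grid search and randomized search with cross-validation rely on.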

Bug Fixes

  • Fixed various bugs in fancy indexing of ds-arrays
  • dislib command line now works on MacOS
  • Fixed "source" links in the documentation to point to the appropriate version of the source code
  • dislib command line now works even if PyCOMPSs is not installed

Improvements

  • Added a new notebook and improved the existing one
  • PCA now supports sparse data
  • Estimators now extend scikit-learn's base estimator for greater integration

- Python
Published by javicid over 6 years ago

https://github.com/bsc-wdc/dislib - v0.4.3

Dependencies

  • PyCOMPSs == 2.5
  • Scikit-learn >= 0.19.2
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0

Improvements

  • Installing dislib via pip now automatically places the dislib executable in the PATH.

- Python
Published by javicid over 6 years ago

https://github.com/bsc-wdc/dislib - v0.4.0

Dependencies

  • PyCOMPSs == 2.5
  • Scikit-learn >= 0.19.2
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0

Breaking Changes

  • Most estimator methods, such as fit and predict, now expect one or two ds-arrays instead of a Dataset.

New Features

  • This release introduces the distributed array as the main data structure in dislib. All estimators have been modified to accept ds-arrays instead of Datasets. The Dataset and Subset classes have been removed.
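
Conceptually, a distributed array is a matrix partitioned into rectangular blocks that can be processed independently. The sketch below shows that partitioning on a plain list-of-lists matrix; split_into_blocks is a hypothetical helper and does not reflect how dislib stores or schedules blocks.

```python
def split_into_blocks(matrix, block_shape):
    """Partition a list-of-lists matrix into a grid of rectangular blocks."""
    block_rows, block_cols = block_shape
    n_rows, n_cols = len(matrix), len(matrix[0])
    grid = []
    for r in range(0, n_rows, block_rows):
        grid_row = []
        for c in range(0, n_cols, block_cols):
            # Edge blocks are smaller when the shape is not an exact multiple.
            block = [row[c:c + block_cols] for row in matrix[r:r + block_rows]]
            grid_row.append(block)
        grid.append(grid_row)
    return grid


matrix = [[10 * i + j for j in range(5)] for i in range(4)]
blocks = split_into_blocks(matrix, block_shape=(2, 2))
```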

Bug Fixes

  • Minor bug fixes in RandomForestClassifier and K-means

Improvements

  • The performance of various algorithms has been improved by using PyCOMPSs COLLECTIONS.
  • K-means now accepts an 'init' parameter.

- Python
Published by javicid over 6 years ago

https://github.com/bsc-wdc/dislib - v0.3.0

Dependencies

  • PyCOMPSs == 2.5
  • Scikit-learn >= 0.19.2
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0

New Features

  • GaussianMixture now supports covariance types 'tied', 'diag', and 'spherical' apart from 'full'.
  • dislib now provides PCA and LinearRegression models.

Bug Fixes

  • Fixed DBSCAN so that it can detect clusters with fewer than min_samples samples, as well as clusters that lie in the intersection of two regions.

Improvements

  • The GaussianMixture documentation has been improved.
  • Extra tests for GaussianMixture, C-SVM and DBSCAN have been added.
  • The performance of K-means, DBSCAN and GaussianMixtures has been significantly improved.
  • The performance of utils.shuffle has been improved by using PyCOMPSs collections.
  • The performance of Dataset has been improved by removing the tracking of duplicates.

- Python
Published by javicid almost 7 years ago

https://github.com/bsc-wdc/dislib - v0.2.1

Dependencies

  • PyCOMPSs >= 2.4-rc1902
  • Scikit-learn >= 0.19.1
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0

Bug Fixes

  • DBSCAN now detects clusters with fewer than min_samples samples in certain situations

Improvements

  • The performance of DBSCAN has been improved

- Python
Published by javicid almost 7 years ago

https://github.com/bsc-wdc/dislib -

Dependencies

  • PyCOMPSs == 2.4-rc1902
  • Scikit-learn >= 0.19.1
  • NumPy >= 1.15.4
  • Scipy >= 1.0.0

Breaking Changes

  • predict and fit_predict methods in K-means, DBSCAN and C-SVM now take a Dataset as argument and do not return anything

New Features

  • The following new algorithms have been implemented:

    • Gaussian mixtures
    • Nearest neighbors
    • Alternating least squares
    • Standard scaler
  • Added the following utility methods:

    • resample
    • shuffle
    • as_grid

Bug Fixes

  • Numerous bug fixes in DBSCAN.
  • Fixed the reproducibility of results in C-SVM and random forests
  • Several other minor bug fixes

Improvements

  • Completely unified the interface of the different algorithms
  • Improved the documentation
  • Added a way to easily access Dataset samples and labels
  • Implemented Dataset's transpose
  • Implemented Dataset's apply function

- Python
Published by javicid over 7 years ago

https://github.com/bsc-wdc/dislib -

This release has been tested with COMPSs version rc1902.

- Python
Published by javicid over 7 years ago

https://github.com/bsc-wdc/dislib - Initial Release

- Python
Published by kafkasl over 7 years ago