Recent Releases of https://github.com/awslabs/amazon-denseclus
https://github.com/awslabs/amazon-denseclus - v0.2.2
- Updated evaluate helper function to do both DBCV and Calinski-Harabasz
- Added new Notebook for exploring clustering on SageMaker Jumpstart
- Dependency version bumps
- Jupyter Notebook
Published by momonga-ml over 2 years ago
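For readers unfamiliar with the Calinski-Harabasz index named in the v0.2.2 notes, here is a minimal standard-library sketch of the textbook formula: between-cluster dispersion over within-cluster dispersion, each divided by its degrees of freedom. The function name and the handling of perfectly tight clusters are assumptions of this sketch and are not taken from the DenseClus code.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion (higher is better)."""
    n = len(points)
    dims = len(points[0])

    # Group the points by their cluster label.
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")

    def centroid(rows):
        return [sum(row[d] for row in rows) / len(rows) for d in range(dims)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    between = 0.0
    within = 0.0
    for rows in groups.values():
        center = centroid(rows)
        between += len(rows) * sq_dist(center, overall)
        within += sum(sq_dist(row, center) for row in rows)

    if within == 0:
        # Choice made for this sketch: zero within-cluster spread scores as infinity.
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels = [0, 0, 1, 1]
score = calinski_harabasz(points, labels)
```

With HDBSCAN-style output, points labeled as noise would probably need to be filtered out before scoring; whether the library does so is not stated in the notes.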
https://github.com/awslabs/amazon-denseclus - v0.2.1
- splitting up modules - numerical and categorical have their own files now for future enhancements
- changed score method to evaluate; now scores via DBCV and coverage, and returns labels
- GPU settings consolidated; now just set use_gpu to False or True
- add version file for automated setup
Published by momonga-ml over 2 years ago
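The coverage score mentioned in the v0.2.1 notes can be illustrated with a short sketch. It assumes coverage means the share of points assigned to some cluster rather than marked as noise, and that noise uses the HDBSCAN convention of label -1. The function name and signature are hypothetical, not the library's API.

```python
def coverage(labels, noise_label=-1):
    """Fraction of points assigned to a cluster rather than marked as noise."""
    if not labels:
        raise ValueError("labels must not be empty")
    clustered = sum(1 for label in labels if label != noise_label)
    return clustered / len(labels)


# Three of four points are clustered; one is noise.
share = coverage([0, 0, 1, -1])
```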
https://github.com/awslabs/amazon-denseclus - v0.2.0
Summary
Add predict method based on the combine method for ensemble.
When ensemble is selected, DenseClus does not combine the UMAPs; instead it fits a clusterer for each UMAP.
When predict is called, it uses approximate_predict in HDBSCAN and then votes on the cluster assignment.
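The voting step described above might look roughly like the sketch below. It assumes that each clusterer has already produced one label per point and that labels are comparable across clusterers, which the notes do not confirm. The function name and the rule of sending ties to noise (-1) are choices made for this illustration only.

```python
from collections import Counter


def vote(label_sets, noise_label=-1):
    """Pick the most common label per point across several clusterers."""
    voted = []
    # zip(*label_sets) yields, for each point, the labels proposed by every clusterer.
    for labels_for_point in zip(*label_sets):
        counts = Counter(labels_for_point)
        top_label, top_count = counts.most_common(1)[0]
        tied = [label for label, count in counts.items() if count == top_count]
        # Assumption of this sketch: an unresolved tie is treated as noise.
        voted.append(top_label if len(tied) == 1 else noise_label)
    return voted


# Three hypothetical clusterers, three points.
result = vote([[0, 1, 1], [0, 1, 2], [0, 2, 2]])
```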
Other changes
- Change default method from 'contrast' to 'intersection'
- Change default distance metric for categoricals to jaccard for later rapids integration
- Increase overall test coverage
- prediction_data=False for combined UMAPs, True for ensemble
- Update examples to reflect changes
Published by momonga-ml over 2 years ago
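As background for the jaccard metric that v0.2.0 makes the default for categoricals, the sketch below computes the Jaccard distance between two binary vectors, such as one-hot encoded categorical rows. It is a plain illustration of the metric, not the code path UMAP or DenseClus uses, and returning 0.0 for two all-zero vectors is a convention chosen here.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length binary (0/1) vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    either = sum(1 for x, y in zip(a, b) if x or y)   # positions set in at least one vector
    both = sum(1 for x, y in zip(a, b) if x and y)    # positions set in both vectors
    if either == 0:
        # Convention for this sketch: two all-zero vectors are treated as identical.
        return 0.0
    return 1.0 - both / either


# One shared position out of three active positions.
d = jaccard_distance([1, 1, 0, 0], [1, 0, 1, 0])
```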
https://github.com/awslabs/amazon-denseclus - v0.1.2
A few minor tweaks to the library primarily to help with maintenance.
1) Adding Continuous Deployment (CD) workflow to directly publish to PyPI when merged into main
2) Fixed __repr__ and __str__ methods so they don't return the whole fitted dataframe
3) Fixed coverage runs and made tox a single call
Published by momonga-ml over 2 years ago
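The __repr__ and __str__ fix in v0.1.2 follows a common pattern: summarize the object instead of printing the data it holds. The class below is a hypothetical stand-in, not the DenseClus class, and its attribute names are invented for the illustration.

```python
class FittedModel:
    """Hypothetical stand-in for an estimator that keeps its training data."""

    def __init__(self, rows, n_components=5):
        self.rows = rows                  # potentially large fitted data
        self.n_components = n_components

    def __repr__(self):
        # Report settings and data size only, never the data itself.
        return (
            f"{type(self).__name__}"
            f"(n_components={self.n_components}, n_rows={len(self.rows)})"
        )

    def __str__(self):
        return repr(self)


model = FittedModel([[1, 2]] * 1000, n_components=3)
summary = repr(model)
```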
https://github.com/awslabs/amazon-denseclus - v0.1.1
Adding feature to auto-impute. Will call simple imputation under the hood for both categorical and numerical features. The user can configure these to non-defaults with keyword arguments.
In addition, updated HDBSCAN so that parameter search comes first, as DenseClus converges to the optimal solution for DBCV. I don't know why.
PS: Really should be semantic version 2 but I am going this route instead.
https://github.com/awslabs/amazon-denseclus/issues/23
Published by momonga-ml over 2 years ago
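The auto-imputation added in v0.1.1 is described only as "simple imputation", so the sketch below makes an assumption about strategy: the mean for numerical features and the most frequent value for categorical ones. The function names are hypothetical, and missing values are represented as None for the sake of a standard-library example.

```python
from collections import Counter
from statistics import mean


def impute_numeric(values):
    """Replace None with the mean of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def impute_categorical(values):
    """Replace None with the most frequent observed value (assumed strategy)."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


ages = impute_numeric([1.0, None, 3.0])
colors = impute_categorical(["a", None, "b", "a"])
```

The release notes say users can override the defaults with keyword arguments; the exact argument names are not given there, so none are shown here.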
https://github.com/awslabs/amazon-denseclus - v0.1.0
Description of changes:
**New Feature: Configure underlying algorithms**
Update: Now supported for Python 3.11 (and only Python 3.11)
Other Updates
* Move to using Ruff for linting
* Address some bugs and user warnings in the package code
* Update and lint notebooks
* Refactor unit tests with fixtures
* Update tox, pre-commit, etc. to run on latest Python
* Refactor of Makefile to support all above
* Better error handling
* Update workflows in GHA to remove redundancy
* Better issue tracking templates
Published by momonga-ml over 2 years ago