HiPart
HiPart: Hierarchical Divisive Clustering Toolbox - Published in JOSS (2023)
Science Score: 93.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 7 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: links to joss.theoj.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: published in Journal of Open Source Software
Repository
Hierarchical divisive clustering algorithm execution, visualization, and interactive visualization.
Basic Info
- Host: GitHub
- Owner: panagiotisanagnostou
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://hipart.readthedocs.io/
- Size: 151 MB
Statistics
- Stars: 52
- Watchers: 8
- Forks: 8
- Open Issues: 1
- Releases: 19
Metadata Files
README.md
HiPart: Hierarchical divisive clustering toolbox
This repository presents the HiPart package, an open-source native Python library that provides efficient and interpretable implementations of divisive hierarchical clustering algorithms. HiPart supports interactive visualizations that let users manipulate the execution steps and intervene directly in the clustering outcome. The package is well suited to Big Data applications because its clustering methods are implemented with a focus on computational efficiency. Its dependencies are either Python built-in packages or well-maintained, stable external packages. The software is provided under the MIT license.
Installation
Installing the package requires only Python 3.8 or higher and the following command.
```bash
pip install HiPart
```
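Before or after installing, it can help to confirm that the interpreter meets the stated Python 3.8 requirement and that the package is importable. The following is a minimal sketch using only the standard library; the helper name `environment_ready` is hypothetical and not part of HiPart.

```python
import importlib.util
import sys


def environment_ready(min_version=(3, 8), package="HiPart"):
    """Return (python_ok, package_ok) for the current interpreter.

    `environment_ready` is a hypothetical helper for illustration only.
    """
    # Compare the running interpreter against the minimum version.
    python_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level package cannot be located.
    package_ok = importlib.util.find_spec(package) is not None
    return python_ok, package_ok


python_ok, package_ok = environment_ready()
print(f"Python >= 3.8: {python_ok}, HiPart importable: {package_ok}")
```

If the second value is `False`, running the `pip install HiPart` command above in the same environment should resolve it.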
Simple Example Execution
The example below is the simplest way to run the package. It creates a synthetic clustering dataset containing 6 clusters, clusters it with the DePDDP algorithm, and returns only the cluster labels.
```python
from HiPart.clustering import DePDDP
from sklearn.datasets import make_blobs

# Synthetic dataset with 6 clusters
X, y = make_blobs(n_samples=1500, centers=6, random_state=0)

# Cluster the data and keep only the cluster labels
clustered_class = DePDDP(max_clusters_number=6).fit_predict(X)
```
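To give a feel for what a single divisive step can look like, the sketch below projects the data onto its first principal direction, estimates the density of the projection, and splits at the deepest density minimum. This is a simplified illustration of the general idea behind density-based principal direction splitting, written with NumPy only. It is not HiPart's implementation, and the function name `density_min_split`, the bandwidth rule, and the grid size are all assumptions made for this example.

```python
import numpy as np


def density_min_split(X, bandwidth=None, grid_size=512):
    """Propose a binary split of X (hypothetical helper, not HiPart code).

    Returns a boolean mask, or None when the projected density has no
    interior local minimum.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # The first right-singular vector is the first principal direction.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[0]

    n = proj.size
    if bandwidth is None:
        # Normal-reference rule of thumb for a Gaussian kernel (assumption).
        bandwidth = 1.06 * proj.std() * n ** (-1 / 5)

    # Gaussian kernel density estimate evaluated on a regular grid.
    grid = np.linspace(proj.min(), proj.max(), grid_size)
    z = (grid[:, None] - proj[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))

    # Interior grid points that are lower than both neighbours.
    interior = np.arange(1, grid_size - 1)
    is_min = (density[interior] < density[interior - 1]) & (
        density[interior] < density[interior + 1]
    )
    minima = interior[is_min]
    if minima.size == 0:
        return None
    # Split at the local minimum with the lowest estimated density.
    split_point = grid[minima[np.argmin(density[minima])]]
    return proj > split_point


# Two well-separated synthetic groups (illustrative data only).
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.5, size=(100, 2))
group_b = rng.normal(0.0, 0.5, size=(100, 2)) + np.array([10.0, 0.0])
mask = density_min_split(np.vstack([group_a, group_b]))
```

A full divisive algorithm would apply such a step repeatedly, choosing which cluster to split next according to its own criterion. For real use, the package's `DePDDP` class shown above is the intended entry point.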
The HiPart package offers a comprehensive suite of examples to guide users in utilizing its various algorithms. These examples are conveniently located in the repository's examples directory.
For a general understanding of the package's capabilities, users can refer to the clustering_example file. This file serves as a foundational guide, providing complete examples of the package's algorithms in action.
Additionally, for those interested in incorporating KernelPCA methods, the clustering_with_kpca_example file offers a detailed example of how to apply KernelPCA within the HiPart package.
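For readers unfamiliar with kernel PCA itself, the following NumPy sketch shows the underlying computation with an RBF kernel: build the kernel matrix, center it in feature space, and project onto the leading eigenvectors. It is meant only to illustrate the technique and makes no claim about how HiPart configures or calls KernelPCA internally; the function name `rbf_kernel_pca` and the `gamma` value are assumptions for this example.

```python
import numpy as np


def rbf_kernel_pca(X, n_components=2, gamma=1.0):
    """Project X onto its leading RBF kernel principal components.

    Hypothetical helper for illustration; not part of HiPart.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances via broadcasting.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-gamma * sq_dists)

    # Center the kernel matrix in feature space.
    n = K.shape[0]
    ones = np.full((n, n), 1.0 / n)
    K_centered = K - ones @ K - K @ ones + ones @ K @ ones

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Scale unit eigenvectors by sqrt(eigenvalue) to get projected coordinates.
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))  # illustrative data only
Z_demo = rbf_kernel_pca(X_demo, n_components=2, gamma=0.5)
```

The resulting low-dimensional coordinates are the kind of representation a projection-based splitting step can operate on when the cluster structure is not linearly separable in the original space.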
Because clustering is often performed on similarity or dissimilarity matrices, such as distance matrices, the HiPart package includes the clustering_with_distance_matrix_example file. This example demonstrates the use of the DePDDP algorithm with a distance matrix, offering a practical application scenario.
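As a small illustration of the input such a workflow starts from, the sketch below builds a pairwise Euclidean distance matrix with NumPy and checks the properties a distance matrix is expected to have. The helper name `pairwise_euclidean` is hypothetical; how the matrix is then passed to DePDDP is shown in the example file mentioned above and is not reproduced here.

```python
import numpy as np


def pairwise_euclidean(X):
    """Return the n x n matrix of Euclidean distances between rows of X.

    Hypothetical helper for illustration; uses O(n^2 * d) memory, so it is
    only suitable for small to medium datasets.
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])  # illustrative data
D = pairwise_euclidean(points)

# A valid distance matrix is square, symmetric, and zero on the diagonal.
is_square = D.shape[0] == D.shape[1]
is_symmetric = np.allclose(D, D.T)
has_zero_diagonal = np.allclose(np.diag(D), 0.0)
```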
Lastly, the package features an interactive visualization component, which is demonstrated in the interactive_visualization_example file. This example shows how to launch the interactive visualization and provides instructions for navigating the visualization GUI.
These resources collectively ensure that users of the HiPart package have a well-rounded and practical understanding of its functionalities and applications.
Documentation
The full documentation of the package can be found at https://hipart.readthedocs.io/.
Citation
```bibtex
@article{Anagnostou2023HiPart,
  title = {HiPart: Hierarchical Divisive Clustering Toolbox},
  author = {Panagiotis Anagnostou and Sotiris Tasoulis and Vassilis P. Plagianakos and Dimitris Tasoulis},
  year = {2023},
  journal = {Journal of Open Source Software},
  publisher = {The Open Journal},
  volume = {8},
  number = {84},
  pages = {5024},
  doi = {10.21105/joss.05024},
  url = {https://doi.org/10.21105/joss.05024}
}
```
Acknowledgments
This project has received funding from the Hellenic Foundation for Research and Innovation (HFRI), under grant agreement No 1901.
Collaborators
- Dimitris Tasoulis
- Panagiotis Anagnostou
- Sotiris Tasoulis
- Vassilis Plagianakos
Owner
- Name: Panagiotis Anagnostou
- Login: panagiotisanagnostou
- Kind: user
- Repositories: 1
- Profile: https://github.com/panagiotisanagnostou
JOSS Publication
HiPart: Hierarchical Divisive Clustering Toolbox
Authors
Department of Computer Science and Biomedical Informatics, University of Thessaly, Greece
Department of Computer Science and Biomedical Informatics, University of Thessaly, Greece
Department of Computer Science and Biomedical Informatics, University of Thessaly, Greece
Signal Ocean SMPC, Greece
Tags
Clustering, High dimensionality, Machine Learning
GitHub Events
Total
- Release event: 2
- Watch event: 10
- Push event: 9
- Pull request event: 3
- Fork event: 1
- Create event: 2
Last Year
- Release event: 2
- Watch event: 10
- Push event: 9
- Pull request event: 3
- Fork event: 1
- Create event: 2
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Panagiotis Anagnostou | p****o@u****r | 142 |
| panagiotis40 | p****0 | 4 |
| Steve Stavropoulos | s****e@m****r | 3 |
| Julien Jerphanion | g****t@j****z | 3 |
| nicospavlidis | n****s@g****m | 1 |
| Steve Stavropoulos | s****e@n****r | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 3
- Total pull requests: 32
- Average time to close issues: about 14 hours
- Average time to close pull requests: about 19 hours
- Total issue authors: 3
- Total pull request authors: 5
- Average comments per issue: 0.33
- Average comments per pull request: 0.59
- Merged pull requests: 30
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: 32 minutes
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.5
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- etsakanika (1)
- JohnNellas (1)
- panagiotisanagnostou (1)
- Petros-Barmpas (1)
Pull Request Authors
- panagiotisanagnostou (29)
- stevestavropoulos (4)
- jjerphan (2)
- jbytecode (1)
- nicospavlidis (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 149 last month
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 18
- Total maintainers: 1
pypi.org: hipart
A hierarchical divisive clustering toolbox
- Homepage: https://github.com/panagiotisanagnostou/HiPart
- Documentation: https://hipart.readthedocs.io/en/latest/
- License: MIT License
- Latest release: 1.0.6 (published 8 months ago)
Maintainers (1)
Dependencies
- actions/checkout v3 composite
- actions/setup-python v3 composite
- codecov/codecov-action v2 composite
- actions/checkout v3 composite
- actions/setup-python v3 composite
- pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
- actions/checkout v3 composite
- actions/setup-python v3 composite
- HiPart *
- sphinx_rtd_theme *
- dash >=2.0
- kdepy *
- matplotlib *
- numpy *
- plotly *
- scikit-learn *
- scipy <=1.15.3
- statsmodels >=0.13
- treelib >=1.6