mst_clustering
mst_clustering: Clustering via Euclidean Minimum Spanning Trees - Published in JOSS (2016)
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ✓ DOI references: found 7 DOI reference(s) in README
- ✓ Academic publication links: joss.theoj.org, zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (16.8%) to scientific vocabulary
Repository
Scikit-learn style estimator for Minimum Spanning Tree Clustering in Python
Statistics
- Stars: 85
- Watchers: 6
- Forks: 19
- Open Issues: 1
- Releases: 2
Metadata Files
README.md
Minimum Spanning Tree Clustering
This package implements a simple scikit-learn style estimator for clustering with a minimum spanning tree.
Motivation
Automated clustering can be an important means of identifying structure in data, but many of the more popular clustering algorithms do not perform well in the presence of background noise. The clustering algorithm implemented here, based on a trimmed Euclidean Minimum Spanning Tree, can be useful in this case.
Example
The API of the mst_clustering code is designed for compatibility with
the scikit-learn project.
```python
from mst_clustering import MSTClustering
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# create some data with four clusters
X, y = make_blobs(n_samples=200, centers=4, random_state=42)

# predict the labels with the MST algorithm
model = MSTClustering(cutoff_scale=2)
labels = model.fit_predict(X)

# plot the results, coloring each point by its predicted label
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='rainbow')
```

For a detailed explanation of the algorithm and a more interesting example of it in action, see the MST Clustering Notebook.
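The idea of a trimmed minimum spanning tree can be illustrated without the package. The following is a minimal pure-Python sketch, not the package's implementation: it builds a Euclidean MST with Kruskal's algorithm, removes edges longer than a cutoff, and labels the connected components that remain. The function names `mst_edges` and `trimmed_mst_labels` and the parameters `cutoff_length` and `min_size` are hypothetical and chosen for illustration only. Treating small components as noise with label -1 is an assumption of this sketch.

```python
import math
from itertools import combinations


def mst_edges(points):
    """Return the edges (length, i, j) of a Euclidean MST via Kruskal's algorithm."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first (O(n^2) edges: fine for a sketch only).
    candidates = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    tree = []
    for length, i, j in candidates:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two separate components, so keep it
            parent[ri] = rj
            tree.append((length, i, j))
    return tree


def trimmed_mst_labels(points, cutoff_length, min_size=2):
    """Cut MST edges longer than cutoff_length and label what remains.

    Components with fewer than min_size points get the label -1 (noise).
    Both parameter names are hypothetical, chosen for this sketch.
    """
    n = len(points)
    neighbors = {i: [] for i in range(n)}
    for length, i, j in mst_edges(points):
        if length <= cutoff_length:  # keep only the short edges
            neighbors[i].append(j)
            neighbors[j].append(i)

    labels = [None] * n
    next_label = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        # Depth-first walk over the trimmed tree to collect one component.
        component, stack = [], [start]
        labels[start] = -1  # marks "visited"; stays -1 if the component is noise
        while stack:
            node = stack.pop()
            component.append(node)
            for other in neighbors[node]:
                if labels[other] is None:
                    labels[other] = -1
                    stack.append(other)
        if len(component) >= min_size:
            for node in component:
                labels[node] = next_label
            next_label += 1
    return labels


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),        # tight group A
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),  # tight group B
          (5.0, 20.0)]                               # isolated point
labels = trimmed_mst_labels(points, cutoff_length=2.0)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The isolated point ends up in a component of its own once the long edges are cut, so it is labelled as noise instead of being forced into a cluster. This is the behavior the Motivation section refers to when it mentions background noise.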
Installation & Requirements
The mst_clustering package itself is fairly lightweight. It is tested on
Python 2.7 and 3.4-3.5, and depends on the following packages:
- numpy
- scipy
- scikit-learn
Using the cross-platform conda package manager, these requirements can be installed as follows:
$ conda install numpy scipy scikit-learn
Finally, the current release of mst_clustering can be installed using pip:
$ conda install pip # if using conda
$ pip install mst_clustering
To install mst_clustering from source, first download the source repository and then run
$ python setup.py install
Contributing & Reporting Issues
Bug reports, questions, suggestions, and contributions are welcome. For these, please make use of the Issues or Pull Requests associated with this repository.
Citing
If you use this code in an academic publication, please consider citing this JOSS Paper.
Owner
- Name: Jake Vanderplas
- Login: jakevdp
- Kind: user
- Location: Oakland CA
- Company: Google
- Website: http://www.vanderplas.com
- Twitter: jakevdp
- Repositories: 220
- Profile: https://github.com/jakevdp
Python ML & Data Science
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Jake VanderPlas | j****p@g****m | 36 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- ktsakos (1)
Packages
- Total packages: 1
- Total downloads: 901 last month (pypi)
- Total dependent packages: 0
- Total dependent repositories: 5
- Total versions: 3
- Total maintainers: 1
pypi.org: mst_clustering
Clustering with Minimum Spanning Trees
- Homepage: http://github.com/jakevdp/mst_clustering
- Documentation: https://mst_clustering.readthedocs.io/
- License: new BSD
- Latest release: 0.1.dev0 (published over 2 years ago)