classix

Fast and explainable clustering in Python

https://github.com/nla-group/classix

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.2%) to scientific vocabulary

Keywords

algorithm classification clustering cython data-analysis data-mining data-science machine-learning unsupervised-learning unsupervised-machine-learning visualization
Last synced: 6 months ago

Repository

Fast and explainable clustering in Python

Basic Info
Statistics
  • Stars: 116
  • Watchers: 3
  • Forks: 12
  • Open Issues: 1
  • Releases: 25
Created over 4 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Code of conduct Citation

README-ch.md


CLASSIX:


CLASSIX


  • Cython

The name CLASSIX stands for "CLustering by Aggregation with Sorting-based Indexing", with the letter "X" for "explainability".
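The phrase "aggregation with sorting-based indexing" can be illustrated with a small pure-Python sketch. This is a simplified illustration, not the library's implementation: the function name `aggregate`, the choice of the first coordinate as the sorting key, and the toy data are all assumptions made here, and any later merging of groups into clusters is left out. Points are sorted once, each unassigned point starts a new group, and the search for nearby points stops as soon as the sorted key differs by more than the radius.

```python
import math

def aggregate(points, radius):
    """Greedy aggregation over points sorted by their first coordinate.

    Hypothetical helper for illustration only. Returns (labels, starts):
    labels[i] is the group index of points[i], and starts lists the
    indices of the points that started each group.
    """
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    labels = [-1] * len(points)
    starts = []
    for pos, i in enumerate(order):
        if labels[i] != -1:
            continue  # already absorbed into an earlier group
        group = len(starts)
        starts.append(i)
        labels[i] = group
        for j in order[pos + 1:]:
            # Early exit: a gap larger than the radius in one coordinate
            # already implies a Euclidean distance larger than the radius.
            if points[j][0] - points[i][0] > radius:
                break
            if labels[j] == -1 and math.dist(points[i], points[j]) <= radius:
                labels[j] = group
    return labels, starts

points = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0)]
labels, starts = aggregate(points, radius=0.5)
print(labels)
print(starts)
```

The early exit is what the sorting buys: without it, every group start would have to be compared against all remaining points.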


CLASSIX can be installed with pip or conda:

| PyPI | conda-forge |
| :---: | :---: |
| NumPy<=1.26.4: `pip install classixclustering`; NumPy>2: `pip install classixclustering --no-cache-dir` | `conda install -c conda-forge classixclustering` |

pip install classixclustering --no-cache-dir


Requirements: Python with NumPy, SciPy, Pandas, and Matplotlib.


CLASSIX

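```python
import classix

# Load the built-in 'Covid3MC' demo dataset
data, labels = classix.loadData('Covid3MC')

# Alternatively, generate synthetic data with scikit-learn
from sklearn.datasets import make_blobs

data, labels = make_blobs(n_samples=1000, centers=3, n_features=2, random_state=0)

# Run CLASSIX
clx = classix.CLASSIX(radius=0.2, minPts=500, verbose=0)
clx.fit(data)
print(clx.labels_)  # cluster label of each data point
```

After fitting, the `predict()` method can be used to assign data points to the existing clusters, for example `clx.predict(data[:1000])`.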


CLASSIX


  1. - radius minPts



    • Cython



1. CLASSIX on 2D data

```python
import classix
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons

# Generate a two-moons dataset
data, _ = make_moons(n_samples=1000, noise=0.05)

clx = classix.CLASSIX(radius=0.15, minPts=10, verbose=0)
clx.fit(data)

# Plot the points, coloured by cluster label
plt.scatter(data[:, 0], data[:, 1], c=clx.labels_, cmap='viridis', s=5)
plt.show()
```

2. CLASSIX on higher-dimensional data

```python
import classix
from sklearn.datasets import make_blobs

# Generate 5000 points in 20 dimensions, grouped around 5 centers
data, labels = make_blobs(n_samples=5000, centers=5, n_features=20, random_state=0)

clx = classix.CLASSIX(radius=10, minPts=100, verbose=1)
clx.fit(data)
print(clx.labels_)
```
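With thousands of points, printing the raw label array is hard to read, and a count per cluster is usually more useful. The following is a minimal sketch using only the standard library. The helper name `summarize_labels` is hypothetical, and the short hand-written list stands in for `clx.labels_`.

```python
from collections import Counter

def summarize_labels(labels):
    """Return {label: count}, ordered from the largest cluster to the smallest."""
    counts = Counter(int(label) for label in labels)
    return dict(counts.most_common())

# Stand-in for clx.labels_ (an assumption for illustration)
labels = [0, 0, 1, 0, 2, 1, 0]
summary = summarize_labels(labels)
print(summary)
```

A very small cluster in such a summary may suggest revisiting `radius` or `minPts`.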

3. CLASSIX with outliers

```python
import classix
import numpy as np
from sklearn.datasets import make_blobs

data, _ = make_blobs(n_samples=500, centers=3, n_features=2, random_state=0)

# Add 50 uniformly distributed outliers
np.random.seed(0)
outliers = np.random.uniform(low=-10, high=10, size=(50, 2))
data_with_outliers = np.vstack([data, outliers])

clx = classix.CLASSIX(radius=0.5, minPts=10, verbose=1)
clx.fit(data_with_outliers)

# Print the cluster labels; a label of -1, where present, marks an outlier
print(clx.labels_)
```

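To install from source, clone the repository:

    git clone https://github.com/nla-group/classix.git
    cd classix

Create and activate a virtual environment, then install the package:

    python -m venv env
    source env/bin/activate  # Windows: env\Scripts\activate
    pip install .

Run the tests:

    pytest unittests.py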

For questions or bug reports, open a GitHub Issue or contact stefan.guettel@manchester.ac.uk.

If you use CLASSIX, please cite:

    @article{CG24,
      title   = {Fast and explainable clustering based on sorting},
      author  = {Xinye Chen and Stefan Güttel},
      journal = {Pattern Recognition},
      volume  = {150},
      pages   = {110298},
      year    = {2024},
      doi     = {https://doi.org/10.1016/j.patcog.2024.110298}
    }

Owner

  • Name: nla-group
  • Login: nla-group
  • Kind: organization

GitHub Events

Total
  • Release event: 1
  • Watch event: 14
  • Push event: 137
  • Fork event: 3
  • Create event: 3
Last Year
  • Release event: 1
  • Watch event: 14
  • Push event: 137
  • Fork event: 3
  • Create event: 3

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 617
  • Total Committers: 3
  • Avg Commits per committer: 205.667
  • Development Distribution Score (DDS): 0.152
Past Year
  • Commits: 167
  • Committers: 3
  • Avg Commits per committer: 55.667
  • Development Distribution Score (DDS): 0.138
Top Committers
Name Email Commits
Null 4****e@u****m 523
chenxinye c****t@y****m 77
Stefan Güttel g****l@u****m 17
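The all-time figures above can be reproduced from the per-committer counts. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is not stated on this page, but it matches the reported value.

```python
# Commit counts taken from the Top Committers table
commits = {"Null": 523, "chenxinye": 77, "Stefan Güttel": 17}

total = sum(commits.values())
avg_per_committer = total / len(commits)

# Assumed definition: DDS = 1 - (commits by top committer / total commits)
dds = 1 - max(commits.values()) / total

print(total, round(avg_per_committer, 3), round(dds, 3))
```

Under this definition, a score near 0 means one person wrote almost all commits, while a score near 1 means the work is spread widely.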

Issues and Pull Requests

Last synced: about 2 years ago

All Time
  • Total issues: 13
  • Total pull requests: 7
  • Average time to close issues: about 2 months
  • Average time to close pull requests: about 8 hours
  • Total issue authors: 7
  • Total pull request authors: 2
  • Average comments per issue: 2.23
  • Average comments per pull request: 0.43
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 5
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 12 hours
  • Issue authors: 1
  • Pull request authors: 2
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.6
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • guettel (3)
  • mikecroucher (2)
  • Emmanuel-Mekonnen (1)
  • chenxinye (1)
  • MotorZ (1)
  • joshdunnlime (1)
  • Schwaggot (1)
Pull Request Authors
  • chenxinye (6)
  • kianmeng (1)
Top Labels
Issue Labels
enhancement (2)
Pull Request Labels

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 113 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 132
  • Total maintainers: 1
pypi.org: classixclustering

Fast and explainable clustering based on sorting

  • Versions: 121
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 113 Last month
Rankings
Downloads: 6.1%
Stargazers count: 7.8%
Dependent packages count: 10.1%
Average: 11.8%
Forks count: 13.3%
Dependent repos count: 21.6%
Maintainers (1)
Last synced: 6 months ago
conda-forge.org: classixclustering
  • Versions: 11
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Stargazers count: 34.0%
Dependent repos count: 34.0%
Average: 42.7%
Dependent packages count: 51.2%
Forks count: 51.6%
Last synced: 6 months ago

Dependencies

docker/requirements.txt pypi
  • classixclustering >=0.5.8
  • matplotlib *
  • numpy *
  • pandas *
  • requests *
  • scipy ==1.7.3
egg-info/requires.txt pypi
  • matplotlib *
  • numpy >=1.3.0
  • pandas *
  • requests *
  • scipy >=0.7.0
requirements.txt pypi
  • cython *
  • matplotlib *
  • numpy >=1.3.0
  • pandas *
  • requests *
  • scipy >=0.7.0
.github/workflows/codecov.yml actions
  • actions/checkout master composite
  • actions/setup-python master composite
  • codecov/codecov-action v2 composite
.github/workflows/package_release.yml actions
  • actions/checkout v1 composite
  • actions/setup-python v1 composite
docker/Dockerfile docker
  • xnla/ubuntu py build