s_dbw

S_Dbw validity index. Adapted for DBSCAN (and similar)

https://github.com/alashkov83/s_dbw

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.1%) to scientific vocabulary

Keywords

cluster-analysis clustering-evaluation python python3
Last synced: 6 months ago

Repository


Basic Info
Statistics
  • Stars: 10
  • Watchers: 0
  • Forks: 5
  • Open Issues: 1
  • Releases: 0
Topics
cluster-analysis clustering-evaluation python python3
Created over 7 years ago · Last pushed almost 7 years ago
Metadata Files
Readme License Citation

README.md

S_Dbw

Compute the S_Dbw or SD validity index

The S_Dbw validity index is defined by the equation:

S_Dbw = Scatt + Dens_bw

where Scatt is the average scattering of the clusters and Dens_bw is the inter-cluster density.
A lower value indicates better clustering.
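
To make the Scatt term concrete, here is a minimal NumPy sketch. It assumes the usual formulation from [1], in which Scatt is the mean ratio between the norm of each cluster's per-feature variance vector and the norm of the whole data set's variance vector. The function name `scatt` and the use of population variance are assumptions made for illustration; this is not the package's own code, and it treats every label (including -1) as an ordinary cluster.

```python
import numpy as np

def scatt(X, labels):
    """Average scattering: mean of ||var(cluster)|| / ||var(data set)||."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Norm of the per-feature variance vector of the whole data set.
    total_norm = np.linalg.norm(X.var(axis=0))
    # One ratio per cluster label found in `labels`.
    ratios = [
        np.linalg.norm(X[labels == c].var(axis=0)) / total_norm
        for c in np.unique(labels)
    ]
    return float(np.mean(ratios))

# Two tight, well-separated groups should give a value close to 0.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels = np.array([0, 0, 1, 1])
print(scatt(X, labels))
```

Putting all points into a single cluster gives a ratio of exactly 1, which is why compact clusters (small ratios) lower the index.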

The SD validity index is defined by the equation:

SD = k*Scatt + distance

where distance measures the distances between cluster centers and k is a weighting coefficient equal to distance(Cmax).
A lower value indicates better clustering.
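
The distance term can be sketched as follows. This assumes the formulation given in [4], where the term is (D_max / D_min) multiplied by the sum, over all centers, of the inverse of each center's summed distance to the other centers. The names `center_distance_term` and `sd_sketch` are hypothetical, only Euclidean distance is handled, and at least two distinct centers are required; it illustrates the formula rather than the package's implementation.

```python
import numpy as np

def center_distance_term(centers):
    """Separation term: (D_max / D_min) * sum_k 1 / sum_z ||v_k - v_z||."""
    centers = np.asarray(centers, dtype=float)
    # Pairwise Euclidean distances between all cluster centers.
    diff = centers[:, None, :] - centers[None, :, :]
    d = np.linalg.norm(diff, axis=-1)
    # Largest and smallest distance between two different centers.
    off_diag = d[~np.eye(len(centers), dtype=bool)]
    d_max, d_min = off_diag.max(), off_diag.min()
    return float(d_max / d_min * np.sum(1.0 / d.sum(axis=1)))

def sd_sketch(scatt_value, centers, k=1.0):
    """Combine a precomputed Scatt value with the separation term."""
    return k * scatt_value + center_distance_term(centers)

centers = np.array([[0.0, 0.0], [3.0, 4.0]])
print(center_distance_term(centers))
print(sd_sketch(0.1, centers, k=2.0))
```

Because the separation term changes with the number of centers, k rescales Scatt so that solutions with different cluster counts remain comparable.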

Installation

    pip install --upgrade s-dbw

Usage

```python
from s_dbw import S_Dbw

# X: data matrix, labels: cluster labels (-1 marks noise points)
score = S_Dbw(X, labels, centers_id=None, method='Tong', alg_noise='bind',
              centr='mean', nearest_centr=True, metric='euclidean')
```

OR

```python
from s_dbw import SD

# X: data matrix, labels: cluster labels (-1 marks noise points)
score = SD(X, labels, k=1.0, centers_id=None, alg_noise='bind',
           centr='mean', nearest_centr=True, metric='euclidean')
```

Parameters:

  • X : array-like, shape (n_samples, n_features)
    List of n_features-dimensional data points. Each row corresponds to a single data point.
  • labels : array-like, shape (n_samples,)
    Predicted labels for each sample (-1 for noise).
  • centers_id : array-like, shape (n_samples,)
    The center_id of each cluster's center. If None, the cluster centers are calculated automatically.
  • alg_noise : str
    Algorithm for handling noise points:
    'comb' - combine all noise points into one cluster (default)
    'sep' - treat each noise point as a separate cluster
    'bind' - bind each noise point to its nearest cluster
    'filter' - filter out noise points
  • centr : str
    Cluster center calculation method: 'mean' (default) or 'median'.
  • nearest_centr : bool
    If True, the centroid is the cluster point closest to the geometric center (default: True).
  • metric : str
    The distance metric. Can be ‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’,
    ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’,
    ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘wminkowski’, ‘yule’.
    Default is ‘euclidean’.

For S_Dbw:
  • method : str
    S_Dbw calculation method:
    'Halkidi' - original paper [1]
    'Kim' - see [2]
    'Tong' - see [3]

For SD:
  • k : float
    The weighting coefficient, equal to distance(Cmax). It is needed when comparing solutions with different numbers of clusters, because distance(C) depends on the number of clusters [4].
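
As a rough illustration of what the 'filter' and 'bind' noise strategies and the nearest_centr option describe, the sketch below uses plain NumPy. The helper names (`filter_noise`, `bind_noise`, `nearest_point_center`) are hypothetical, only Euclidean distance is used, and "nearest cluster" is assumed to mean the cluster with the closest mean; the package may implement these steps differently.

```python
import numpy as np

def filter_noise(X, labels):
    """'filter': drop every point labelled -1."""
    X, labels = np.asarray(X, dtype=float), np.asarray(labels)
    keep = labels != -1
    return X[keep], labels[keep]

def bind_noise(X, labels):
    """'bind': give each noise point the label of the nearest cluster mean."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels).copy()
    cluster_ids = np.unique(labels[labels != -1])
    means = np.array([X[labels == c].mean(axis=0) for c in cluster_ids])
    noise = labels == -1
    if noise.any():
        # Distance from every noise point to every cluster mean.
        dists = np.linalg.norm(X[noise][:, None, :] - means[None, :, :], axis=-1)
        labels[noise] = cluster_ids[dists.argmin(axis=1)]
    return labels

def nearest_point_center(points, centr="mean"):
    """nearest_centr=True: pick the actual point closest to the mean/median."""
    points = np.asarray(points, dtype=float)
    center = np.median(points, axis=0) if centr == "median" else points.mean(axis=0)
    return points[np.linalg.norm(points - center, axis=1).argmin()]

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [1.0, 1.0]])
labels = np.array([0, 0, 1, 1, -1])
print(bind_noise(X, labels))        # the noise point joins cluster 0
print(filter_noise(X, labels)[1])   # the noise point is removed
```

Choosing a real data point as the center can be useful for non-convex clusters, such as those DBSCAN produces, where the geometric mean may fall outside the cluster.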

Returns

score : float
The resulting S_Dbw or SD score.

References:

  1. M. Halkidi and M. Vazirgiannis, “Clustering validity assessment: Finding the optimal partitioning of a data set,” in ICDM, Washington, DC, USA, 2001, pp. 187–194.
  2. Y. Kim and S. Lee, “A clustering validity assessment index,” in PAKDD 2003, Seoul, Korea, April 30–May 2, 2003, LNAI 2637, pp. 602–608.
  3. J. Tong and H. Tan, J. Electron. (China), vol. 26, p. 258, 2009. https://doi.org/10.1007/s11767-007-0151-8
  4. M. Halkidi, M. Vazirgiannis, and Y. Batistakis, “Quality scheme assessment in the clustering process,” LNCS (LNAI) 1910, 2000, pp. 265–276. https://doi.org/10.1007/3-540-45372-5_26

Owner

  • Name: Alexander Lashkov
  • Login: alashkov83
  • Kind: user
  • Location: Moscow, Russia
  • Company: FSRC "Crystallography and Photonics" RAS

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 20
  • Total Committers: 2
  • Avg Commits per committer: 10.0
  • Development Distribution Score (DDS): 0.15
Top Committers
Name Email Commits
alashkov83 a****3@g****m 17
Sukrit Gupta i****t@u****m 3

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 5
  • Total pull requests: 4
  • Average time to close issues: 9 days
  • Average time to close pull requests: about 23 hours
  • Total issue authors: 4
  • Total pull request authors: 1
  • Average comments per issue: 0.6
  • Average comments per pull request: 0.5
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • isukrit (2)
  • OdedMous (1)
  • duichwer (1)
  • TDharm (1)
Pull Request Authors
  • isukrit (4)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 669 last-month
  • Total dependent packages: 1
  • Total dependent repositories: 3
  • Total versions: 4
  • Total maintainers: 1
pypi.org: s-dbw

Compute the S_Dbw validity index

  • Versions: 4
  • Dependent Packages: 1
  • Dependent Repositories: 3
  • Downloads: 669 Last month
Rankings
Dependent repos count: 9.0%
Dependent packages count: 10.1%
Average: 12.9%
Downloads: 13.4%
Forks count: 14.3%
Stargazers count: 17.8%
Maintainers (1)
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • numpy >=1.14.2
  • scipy *