Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (6.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: leaeb
- License: gpl-3.0
- Language: Python
- Default Branch: main
- Size: 133 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Noise-Aware Cluster Validity Indices (NACVI)
This repository contains a Python implementation of internal cluster validity indices designed for noise-aware clusterings (e.g. those produced by DBSCAN). The indices explicitly account for unassigned data points (noise), which makes them particularly suitable for realistic, unsupervised settings.
The implementation is based on the following publication:
Lea Eileen Brauner, Frank Höppner, Frank Klawonn
Cluster Validity for Noise-Aware Clusterings, Intelligent Data Analysis Journal, IOS Press (2025)
Content
The package implements the following NACVIs:
- sil+: noise-aware Silhouette Coefficient
- dbi+: noise-aware Davies-Bouldin Index
- gD33+: noise-aware Dunn index variant
- sf+: noise-aware Score Function
- grid+: grid-based noise-validity index
- nr+: neighbourhood-based noise-validity index
Motivation
Conventional validity measures treat every data point as belonging to a cluster, even when the clustering algorithm (DBSCAN, for example) explicitly labels some points as noise. This distorts the evaluation.
This package:
- takes noise into account correctly,
- enables a separate evaluation of the cluster structure and the noise delimitation,
- combines both into a single integrated metric, the B+ score.
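The distortion described above can be reproduced with scikit-learn alone. The following is a minimal sketch on synthetic data, not the method implemented in this package: it compares the standard silhouette score computed on raw DBSCAN labels (where the noise label -1 is silently treated as one more cluster) with the common workaround of dropping noise points first. The data set, `eps`, and `min_samples` are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data (assumption): two compact blobs plus scattered background points.
rng = np.random.default_rng(0)
blobs, _ = make_blobs(
    n_samples=200, centers=[[0, 0], [6, 6]], cluster_std=0.4, random_state=0
)
background = rng.uniform(-4, 10, size=(40, 2))
X = np.vstack([blobs, background])

# DBSCAN marks points that belong to no cluster with the label -1.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
noise = labels == -1

# Naive evaluation: the label -1 is scored as if it were a regular cluster.
naive = silhouette_score(X, labels)

# Workaround: score only the clustered points. This ignores whether the
# noise points were separated from the clusters sensibly.
masked = silhouette_score(X[~noise], labels[~noise])

print(f"noise points: {noise.sum()}, naive: {naive:.3f}, masked: {masked:.3f}")
```

Neither number is satisfactory on its own: the naive score penalises the clustering for the scattered "noise cluster", while the masked score says nothing about the quality of the noise assignment. That gap is what the noise-aware indices are meant to address.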
Installation
```bash
pip install nacvi
```
Usage
examples/usage_miniexample.py contains a minimal example that uses NumPy arrays as inputs.
examples/usage_example.ipynb contains a comprehensive example covering:
- data generation,
- execution of the DBSCAN clustering algorithm,
- visualisation,
- calculation of the NACVIs.
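As a rough sketch of the first two steps of that workflow, the snippet below generates data, runs DBSCAN, and summarises the result as plain NumPy arrays. It deliberately stops before the index calculation, because the function names of nacvi are not stated in this README; refer to the example files for the actual calls. The helper `cluster_with_noise` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler


def cluster_with_noise(X, eps=0.3, min_samples=5):
    """Run DBSCAN and return the labels plus a small summary (hypothetical helper)."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    # Count cluster sizes, excluding the noise label -1.
    cluster_ids, sizes = np.unique(labels[labels != -1], return_counts=True)
    summary = {
        "n_clusters": int(cluster_ids.size),
        "cluster_sizes": dict(zip(cluster_ids.tolist(), sizes.tolist())),
        "noise_fraction": float(np.mean(labels == -1)),
    }
    return labels, summary


# Step 1: data generation (two half-moons plus uniform background points).
rng = np.random.default_rng(42)
moons, _ = make_moons(n_samples=300, noise=0.05, random_state=42)
background = rng.uniform(low=-1.5, high=2.5, size=(30, 2))
X = StandardScaler().fit_transform(np.vstack([moons, background]))

# Step 2: clustering. `labels` is an integer array with -1 marking noise.
labels, summary = cluster_with_noise(X)
print(summary)

# Step 3 (not shown): pass `X` and `labels` to the validity indices.
```

Inspecting the noise fraction before scoring is a cheap sanity check: a value close to 0 or 1 usually means `eps` or `min_samples` needs adjusting.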
Owner
- Name: Lea Eileen Brauner
- Login: leaeb
- Kind: user
- Location: Braunschweig
- Repositories: 2
- Profile: https://github.com/leaeb
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite the following paper."
authors:
  - family-names: Brauner
    given-names: Lea Eileen
    affiliation: Ostfalia University of Applied Sciences
title: "Noise-Aware Cluster Validity Measures"
version: "1.0.0"
doi: 10.3233/IDA-XXXXX # <- DOI of your publication
date-released: 2025-07-14
url: https://github.com/username/noise-aware-cluster-validity
GitHub Events
Total
- Push event: 1
- Create event: 2
Last Year
- Push event: 1
- Create event: 2
Packages
- Total packages: 1
- Total downloads: unknown
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 1
- Total maintainers: 1
pypi.org: nacvi
Cluster Validity Indices for Noise-Aware Clusterings (e.g. DBSCAN)
- Homepage: https://github.com/leaeb/noise-aware-cvi
- Documentation: https://nacvi.readthedocs.io/
- License: GPL-3.0-or-later
- Latest release: 0.1.0 (published 7 months ago)
Dependencies
- ipykernel * develop
- matplotlib *
- numpy *
- pandas *
- scikit-learn *
- scipy *
- appnope ==0.1.4 develop
- asttokens ==3.0.0 develop
- comm ==0.2.2 develop
- debugpy ==1.8.14 develop
- decorator ==5.2.1 develop
- executing ==2.2.0 develop
- ipykernel ==6.29.5 develop
- ipython ==9.4.0 develop
- ipython-pygments-lexers ==1.1.1 develop
- jedi ==0.19.2 develop
- jupyter-client ==8.6.3 develop
- jupyter-core ==5.8.1 develop
- matplotlib-inline ==0.1.7 develop
- nest-asyncio ==1.6.0 develop
- packaging ==25.0 develop
- parso ==0.8.4 develop
- pexpect ==4.9.0 develop
- platformdirs ==4.3.8 develop
- prompt-toolkit ==3.0.51 develop
- psutil ==7.0.0 develop
- ptyprocess ==0.7.0 develop
- pure-eval ==0.2.3 develop
- pygments ==2.19.2 develop
- python-dateutil ==2.9.0.post0 develop
- pyzmq ==27.0.0 develop
- six ==1.17.0 develop
- stack-data ==0.6.3 develop
- tornado ==6.5.1 develop
- traitlets ==5.14.3 develop
- wcwidth ==0.2.13 develop
- contourpy ==1.3.2
- cycler ==0.12.1
- fonttools ==4.58.5
- joblib ==1.5.1
- kiwisolver ==1.4.8
- matplotlib ==3.10.3
- numpy ==2.3.1
- packaging ==25.0
- pandas ==2.3.1
- pillow ==11.3.0
- pyparsing ==3.2.3
- python-dateutil ==2.9.0.post0
- pytz ==2025.2
- scikit-learn ==1.7.0
- scipy ==1.16.0
- six ==1.17.0
- threadpoolctl ==3.6.0
- tzdata ==2025.2