nacvi

https://github.com/leaeb/noise-aware-cvi

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: found
✓ codemeta.json file: found
✓ .zenodo.json file: found
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (6.5%) to scientific vocabulary

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: leaeb
License: gpl-3.0
Language: Python
Default Branch: main
Size: 133 KB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Created 8 months ago · Last pushed 8 months ago

Metadata Files

Readme License Citation

Noise-Aware Cluster Validity Indices (NACVI)

This repository contains a Python implementation of internal cluster validity indices designed specifically for noise-aware clusterings (e.g. DBSCAN). These indices explicitly account for unassigned data points (noise), which makes them particularly suitable for realistic, unsupervised settings.

The implementation is based on the following scientific publication:
Lea Eileen Brauner, Frank Höppner, Frank Klawonn
Cluster Validity for Noise-Aware Clusterings, Intelligent Data Analysis Journal, IOS Press (2025)

Content

You can find the implementations of the following NACVIs:

sil+: noise-aware Silhouette Coefficient
dbi+: noise-aware Davies-Bouldin Index
gD33+: noise-aware Dunn index variant
sf+: noise-aware Score Function
grid+: grid-based noise-validity index
nr+: neighbourhood-based noise-validity index

Motivation

Conventional validity measures treat every data point as belonging to a cluster, even when an algorithm such as DBSCAN explicitly labels some points as noise. This distorts the evaluation.

This package:
- takes noise into account correctly,
- enables a separate evaluation of the cluster structure and the noise delimitation,
- offers an integrated metric for both with the B+ score.
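The distortion described above can be illustrated with a minimal sketch. It uses only scikit-learn, not this package, and the synthetic data, the DBSCAN parameters, and all variable names are assumptions chosen for illustration. scikit-learn's silhouette_score has no special handling for DBSCAN's noise label -1, so the noise points are either scored as if they formed one more cluster or have to be dropped before scoring. Neither variant evaluates how well the noise itself was delimited.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Illustrative data (an assumption): two compact blobs plus uniform background noise.
rng = np.random.default_rng(0)
blobs, _ = make_blobs(
    n_samples=300, centers=[[-5, -5], [5, 5]], cluster_std=0.5, random_state=0
)
background = rng.uniform(-10, 10, size=(60, 2))
X = np.vstack([blobs, background])

# DBSCAN marks points it cannot assign to any cluster with the label -1.
labels = DBSCAN(eps=0.7, min_samples=5).fit_predict(X)
is_noise = labels == -1

# Variant 1: pass the labels unchanged, so -1 is scored like a regular cluster.
score_noise_as_cluster = silhouette_score(X, labels)

# Variant 2: drop the noise points first, so the noise is ignored entirely.
score_noise_dropped = silhouette_score(X[~is_noise], labels[~is_noise])

print("noise points:", int(is_noise.sum()))
print("silhouette, noise treated as a cluster:", round(float(score_noise_as_cluster), 3))
print("silhouette, noise dropped:", round(float(score_noise_dropped), 3))
```

The two scores differ although the clustering is the same, which is the ambiguity that a noise-aware index is meant to resolve.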

Installation

pip install nacvi

Usage

examples/usage_miniexample.py contains a minimal example that uses NumPy arrays as inputs.

examples/usage_example.ipynb contains a comprehensive example with:
- data generation,
- execution of the DBSCAN clustering algorithm,
- visualisation,
- calculation of the NACVIs.
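The first two steps of that workflow can be sketched as follows, assuming scikit-learn for both data generation and DBSCAN. The helper cluster_with_noise and its parameters are hypothetical and not part of nacvi. The call to the NACVI functions is deliberately left out, because their names and signatures are not given in this README; the example files above show the actual usage.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons


def cluster_with_noise(n_samples=400, n_outliers=40, eps=0.2, min_samples=5, seed=0):
    """Hypothetical helper: generate 2-D data with outliers and cluster it with DBSCAN."""
    rng = np.random.default_rng(seed)
    # Two interleaved half-moons form the cluster structure.
    moons, _ = make_moons(n_samples=n_samples, noise=0.05, random_state=seed)
    # Uniformly scattered points play the role of background noise.
    outliers = rng.uniform(low=[-1.5, -1.0], high=[2.5, 1.5], size=(n_outliers, 2))
    X = np.vstack([moons, outliers])
    # fit_predict returns one integer label per point; -1 marks noise.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    return X, labels


X, labels = cluster_with_noise()

# Summarise the result: number of clusters (excluding noise) and share of noise points.
n_clusters = len(set(labels.tolist()) - {-1})
noise_fraction = float(np.mean(labels == -1))
print("clusters found:", n_clusters)
print("fraction of points labelled as noise:", round(noise_fraction, 3))
```

X and labels are plain NumPy arrays, which is the input form the minimal example is described as using.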

Owner

Name: Lea Eileen Brauner
Login: leaeb
Kind: user
Location: Braunschweig

Repositories: 2
Profile: https://github.com/leaeb

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite the following paper."
authors:
  - family-names: Brauner
    given-names: Lea Eileen
    affiliation: Ostfalia University of Applied Sciences
title: "Noise-Aware Cluster Validity Measures"
version: "1.0.0"
doi: 10.3233/IDA-XXXXX  # <- DOI of your publication
date-released: 2025-07-14
url: https://github.com/username/noise-aware-cluster-validity

GitHub Events

Total

Push event: 1
Create event: 2

Last Year

Push event: 1
Create event: 2

Packages

Total packages: 1
Total downloads: unknown

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 1
Total maintainers: 1

pypi.org: nacvi

Cluster Validity Indices for Noise-Aware Clusterings (e.g. DBSCAN)

Homepage: https://github.com/leaeb/noise-aware-cvi
Documentation: https://nacvi.readthedocs.io/
License: GPL-3.0-or-later
Latest release: 0.1.0
published 7 months ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 8.7%

Average: 28.9%

Dependent repos count: 49.1%

Maintainers (1)

leaeb

Last synced: 7 months ago

Dependencies

Pipfile pypi

ipykernel * develop
matplotlib *
numpy *
pandas *
scikit-learn *
scipy *

Pipfile.lock pypi

appnope ==0.1.4 develop
asttokens ==3.0.0 develop
comm ==0.2.2 develop
debugpy ==1.8.14 develop
decorator ==5.2.1 develop
executing ==2.2.0 develop
ipykernel ==6.29.5 develop
ipython ==9.4.0 develop
ipython-pygments-lexers ==1.1.1 develop
jedi ==0.19.2 develop
jupyter-client ==8.6.3 develop
jupyter-core ==5.8.1 develop
matplotlib-inline ==0.1.7 develop
nest-asyncio ==1.6.0 develop
packaging ==25.0 develop
parso ==0.8.4 develop
pexpect ==4.9.0 develop
platformdirs ==4.3.8 develop
prompt-toolkit ==3.0.51 develop
psutil ==7.0.0 develop
ptyprocess ==0.7.0 develop
pure-eval ==0.2.3 develop
pygments ==2.19.2 develop
python-dateutil ==2.9.0.post0 develop
pyzmq ==27.0.0 develop
six ==1.17.0 develop
stack-data ==0.6.3 develop
tornado ==6.5.1 develop
traitlets ==5.14.3 develop
wcwidth ==0.2.13 develop
contourpy ==1.3.2
cycler ==0.12.1
fonttools ==4.58.5
joblib ==1.5.1
kiwisolver ==1.4.8
matplotlib ==3.10.3
numpy ==2.3.1
packaging ==25.0
pandas ==2.3.1
pillow ==11.3.0
pyparsing ==3.2.3
python-dateutil ==2.9.0.post0
pytz ==2025.2
scikit-learn ==1.7.0
scipy ==1.16.0
six ==1.17.0
threadpoolctl ==3.6.0
tzdata ==2025.2

nacvi.egg-info/requires.txt pypi

contourpy ==1.3.2
cycler ==0.12.1
fonttools ==4.58.5
joblib ==1.5.1
kiwisolver ==1.4.8
numpy ==2.3.1
packaging ==25.0
pandas ==2.3.1
pillow ==11.3.0
pyparsing ==3.2.3
python-dateutil ==2.9.0.post0
pytz ==2025.2
scikit-learn ==1.7.0
scipy ==1.16.0
six ==1.17.0
threadpoolctl ==3.6.0
tzdata ==2025.2

pyproject.toml pypi

requirements.txt pypi

contourpy ==1.3.2
cycler ==0.12.1
fonttools ==4.58.5
joblib ==1.5.1
kiwisolver ==1.4.8
numpy ==2.3.1
packaging ==25.0
pandas ==2.3.1
pillow ==11.3.0
pyparsing ==3.2.3
python-dateutil ==2.9.0.post0
pytz ==2025.2
scikit-learn ==1.7.0
scipy ==1.16.0
six ==1.17.0
threadpoolctl ==3.6.0
tzdata ==2025.2
