radius_clustering

Source code repository of the Radius Clustering Python package.

https://github.com/scikit-learn-contrib/radius_clustering

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README
  • Academic publication links
    Links to: sciencedirect.com, springer.com, acm.org, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.8%) to scientific vocabulary

Keywords

clustering minimum-dominating-set radius-constraint
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 7
  • Watchers: 2
  • Forks: 0
  • Open Issues: 3
  • Releases: 2
Created over 1 year ago · Last pushed 8 months ago
Metadata Files
Readme Changelog License Citation

README.md

Radius Clustering

Radius Clustering is a Python package that implements clustering under a radius constraint, based on the Minimum Dominating Set (MDS) problem. The MDS problem is NP-hard, but it is well studied in the literature and has been shown to be closely linked to the radius-constrained clustering problem (see the references for more details).
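To make the link between the two problems concrete, here is a minimal sketch, assuming a simple greedy heuristic rather than either of the solvers shipped with the package. Points are treated as graph vertices, two vertices are adjacent when they lie within `radius` of each other, and a dominating set of that graph gives a set of cluster centers. The function name `greedy_radius_clustering` is hypothetical and is not part of the package's API.

```python
from math import dist


def greedy_radius_clustering(points, radius):
    """Greedy dominating-set heuristic (illustrative only, not the package's solver)."""
    n = len(points)
    # Vertex i is adjacent to j (itself included) when they lie within `radius`.
    neighbors = [
        {j for j in range(n) if dist(points[i], points[j]) <= radius}
        for i in range(n)
    ]
    uncovered = set(range(n))
    centers = []
    while uncovered:
        # Pick the vertex that dominates the most still-uncovered points.
        best = max(range(n), key=lambda i: len(neighbors[i] & uncovered))
        centers.append(best)
        uncovered -= neighbors[best]
    # Assign every point to its nearest selected center.
    labels = [
        min(range(len(centers)), key=lambda c: dist(p, points[centers[c]]))
        for p in points
    ]
    return centers, labels


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0)]
centers, labels = greedy_radius_clustering(points, radius=0.2)
print(centers, labels)
```

Because every point is dominated by at least one selected center, its nearest center is never farther away than `radius`. The greedy choice does not guarantee a minimum number of centers, which is what an exact MDS solver would provide.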

Features

  • Implements both exact and approximate MDS-based algorithms for radius-constrained clustering
  • Compatible with scikit-learn's API for clustering algorithms
  • Easy to use and integrate with existing Python data science workflows
  • Includes comprehensive documentation and examples
  • Full test coverage to ensure reliability and correctness
  • Supports custom MDS solvers for flexibility in clustering approaches
  • Provides a user-friendly interface for clustering tasks

[!CAUTION] Deprecation Notice: The `threshold` parameter of the `RadiusClustering` class is deprecated and is planned to be removed completely in version 2.0.0. Use the `radius` parameter instead to specify the clustering radius. This change is part of our effort to make parameter names more intuitive and user-friendly.

[!NOTE] NEW VERSIONS: The package is currently under active development for new features and improvements, including some refactoring and enhancements to the existing codebase. Backwards compatibility is not guaranteed, so please check the CHANGELOG for details on changes and updates.

Roadmap

  • [x] Version 1.4.0:
    • [x] Add support for custom MDS solvers
    • [x] Improve documentation and examples
    • [x] Add more examples and tutorials

Installation

You can install Radius Clustering using pip:

```bash
pip install radius-clustering
```

Usage

Here's a basic example of how to use Radius Clustering:

```python
import numpy as np
from radius_clustering import RadiusClustering

# Generate random 2D data
X = np.random.rand(100, 2)

# Create a RadiusClustering instance
rad_clustering = RadiusClustering(manner="approx", radius=0.5)

# Fit the model to the data
rad_clustering.fit(X)

# Get the cluster labels
labels = rad_clustering.labels_

print(labels)
```
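After fitting, it can be useful to check that a clustering actually respects the radius constraint. The sketch below is a standalone helper and not part of the package: the name `max_distance_to_center` is hypothetical, and because this README does not show how the fitted estimator exposes its centers, the helper takes the center coordinates explicitly as a mapping from label to point.

```python
from math import dist


def max_distance_to_center(points, labels, center_points):
    """Largest distance between any point and the center of its cluster."""
    return max(dist(p, center_points[label]) for p, label in zip(points, labels))


# Toy data: two clusters, with the center coordinates supplied by hand.
points = [(0.0, 0.0), (0.3, 0.0), (2.0, 2.0)]
labels = [0, 0, 1]
center_points = {0: (0.0, 0.0), 1: (2.0, 2.0)}

worst = max_distance_to_center(points, labels, center_points)
# The clustering satisfies the constraint when `worst` does not exceed the radius.
print(worst <= 0.5)
```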

Documentation

You can find the full documentation for Radius Clustering here.

Building the documentation

To build the documentation, you can run the following command, assuming you have all dependencies needed installed:

```bash
cd docs
make html
```

Then you can open the index.html file in the build directory to view the full documentation.

More information

For more information please refer to the official documentation.

If you want insights on how the algorithm works, please refer to the presentation.

If you want to know more about the experiments conducted with the package, please refer to the experiments.

Contributing

Contributions to Radius Clustering are welcome!

Please read the CONTRIBUTING.md file for details on how to contribute to the project. Please note that the project is released with a Code of Conduct, and we expect all contributors to adhere to it.

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.

How to cite this work

If you use Radius Clustering in your research, please cite the following paper and the software itself:

```bibtex
@inproceedings{haenn_clustering2024,
  TITLE = {{Clustering Under Radius Constraints Using Minimum Dominating Sets}},
  AUTHOR = {Haenn, Quentin and Chardin, Brice and Baron, Micka{\"e}l},
  URL = {https://hal.science/hal-04533921},
  BOOKTITLE = {{Lecture Notes in Artificial Intelligence}},
  ADDRESS = {Poitiers, France},
  PUBLISHER = {{Springer}},
  YEAR = {2024},
  MONTH = Jun,
  KEYWORDS = {Constrained Clustering ; Radius Based Clustering ; Minimum Dominating Set ; Constrained Clustering Radius Based Clustering Minimum Dominating Set},
  PDF = {https://hal.science/hal-04533921v1/file/clustering_under_radius_using_mds.pdf},
  HAL_ID = {hal-04533921},
  HAL_VERSION = {v1},
}
```

Acknowledgments

MDS Algorithms

The two MDS algorithms implemented in this package are forked and modified (or rewritten) from code by the following authors:

  • Alejandra Casado, for the minimum dominating set heuristic code [1]. We rewrote the code in C++ to meet the needs of the Python interface.
  • Hua Jiang, for the minimum dominating set exact algorithm code [2]. The code has been adapted to meet the needs of the Python interface.

Funders

The Radius Clustering work has been funded by:

Contributors

References

Owner

  • Name: scikit-learn-contrib
  • Login: scikit-learn-contrib
  • Kind: organization

scikit-learn compatible projects

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Radius Clustering
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Quentin
    family-names: Haenn
    email: quentin.haenn@ensma.fr
    affiliation: LIAS Lab
    orcid: 'https://orcid.org/0009-0009-1663-0107'
  - given-names: Brice
    family-names: Chardin
    email: brice.chardin@ensma.fr
    affiliation: LIAS Lab
    orcid: 'https://orcid.org/0000-0002-9298-9447'
  - given-names: Mickael
    family-names: Baron
    email: mickael.baron@ensma.fr
    affiliation: LIAS Lab
    orcid: 'https://orcid.org/0000-0002-3356-0835'
  - name: LIAS Laboratory
    address: 1 Avenue Clément Ader
    city: Chasseneuil du Poitou
    post-code: '86360'
    website: 'https://www.lias-lab.fr'
identifiers:
  - type: swh
    value: 'swh:1:rev:66f8d295cc5fbc80f356d11be46571bfbb190609'
license: GPL-3.0

GitHub Events

Total
  • Issues event: 13
  • Delete event: 5
  • Issue comment event: 4
  • Push event: 11
  • Pull request event: 9
  • Create event: 4
Last Year
  • Issues event: 13
  • Delete event: 5
  • Issue comment event: 4
  • Push event: 11
  • Pull request event: 9
  • Create event: 4

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 92
  • Total Committers: 2
  • Avg Commits per committer: 46.0
  • Development Distribution Score (DDS): 0.011
Past Year
  • Commits: 92
  • Committers: 2
  • Avg Commits per committer: 46.0
  • Development Distribution Score (DDS): 0.011
Top Committers
Name Email Commits
Quentin q****n@e****r 91
Mickael BARON b****n@e****r 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 15
  • Total pull requests: 37
  • Average time to close issues: 2 days
  • Average time to close pull requests: about 1 hour
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 0.6
  • Average comments per pull request: 0.05
  • Merged pull requests: 36
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 15
  • Pull requests: 37
  • Average time to close issues: 2 days
  • Average time to close pull requests: about 1 hour
  • Issue authors: 2
  • Pull request authors: 2
  • Average comments per issue: 0.6
  • Average comments per pull request: 0.05
  • Merged pull requests: 36
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • quentinhaenn (13)
  • bchardin (1)
Pull Request Authors
  • quentinhaenn (35)
  • mickaelbaron (1)
Top Labels
Issue Labels
  • documentation (5)
  • enhancement (4)
  • Priority : Mid (3)
  • Priority: HIGH (3)
  • bug (2)
  • Priority : low (2)
  • triage (1)
Pull Request Labels
  • documentation (1)