Local clustering

Local clustering - Published in JOSS (2018)

https://github.com/volfpeter/localclustering

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

cluster cluster-analysis clustering clustering-algorithm graph-algorithms graph-theory hierarchical-clustering local-clustering python python3 ranking social-network-analysis
Last synced: 6 months ago

Repository

Python 3 implementation and documentation of the Hermina-Janos local graph clustering algorithm.

Basic Info
  • Host: GitHub
  • Owner: volfpeter
  • License: agpl-3.0
  • Language: Python
  • Default Branch: master
  • Size: 2.48 MB
Statistics
  • Stars: 23
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 6
Created over 8 years ago · Last pushed about 3 years ago
Metadata Files
Readme License

README.md

LocalClustering

The project implements multiple variations of a local graph clustering algorithm named the Hermina-Janos algorithm in memory of my beloved grandparents.

Graph cluster analysis is used in a wide variety of fields. This project does not target one specific field. Instead, it aims to be a general tool for graph cluster analysis in cases where global cluster analysis is not applicable or practical, for example because of the size of the data set or because a different (local) perspective is required.

The algorithms are independent of the cluster definition. The interface that cluster definitions must implement can be found in the definitions package, along with a simple connectivity-based cluster definition implementation. Besides the algorithms and the cluster definition, other utilities are also provided, most notably a module for node ranking.
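To illustrate what it means for a clustering algorithm to be independent of the cluster definition, here is a minimal, self-contained sketch in plain Python. It is not the Hermina-Janos algorithm and it does not use this library's interface: `ToyClusterDefinition`, `NeighborFractionDefinition`, `expand_cluster`, and the scoring rule are all hypothetical names and choices made up for this example. The expansion loop only ever inspects the neighborhood of the current cluster, which is the property that makes a local approach practical on large graphs.

```python
from abc import ABC, abstractmethod
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # adjacency sets of an undirected graph


class ToyClusterDefinition(ABC):
    """Hypothetical interface: scores how strongly a node belongs to a cluster."""

    @abstractmethod
    def quality(self, graph: Graph, cluster: Set[str], node: str) -> float:
        ...


class NeighborFractionDefinition(ToyClusterDefinition):
    """Toy rule: the fraction of a node's neighbors that are already in the cluster."""

    def quality(self, graph: Graph, cluster: Set[str], node: str) -> float:
        neighbors = graph[node]
        if not neighbors:
            return 0.0
        return len(neighbors & cluster) / len(neighbors)


def expand_cluster(
    graph: Graph,
    source: str,
    definition: ToyClusterDefinition,
    threshold: float = 0.5,
    max_size: int = 10,
) -> Set[str]:
    """Greedy local expansion that delegates every membership decision to `definition`."""
    cluster = {source}
    changed = True
    while changed and len(cluster) < max_size:
        changed = False
        # Candidates are the neighbors of the cluster that are not yet members.
        frontier = set().union(*(graph[n] for n in cluster)) - cluster
        for node in sorted(frontier):
            if len(cluster) >= max_size:
                break
            if definition.quality(graph, cluster, node) >= threshold:
                cluster.add(node)
                changed = True
    return cluster


# Two triangles (a-b-c and d-e-f) joined by the single edge c-d.
demo_graph: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

result = expand_cluster(demo_graph, "a", NeighborFractionDefinition())
print(sorted(result))  # → ['a', 'b', 'c']
```

Swapping in a different `ToyClusterDefinition` subclass changes which nodes are accepted without touching `expand_cluster`, which is the separation of concerns described above.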

Installation

Install the latest version of the project from the Python Package Index using `pip install localclustering`.

Getting started

This section will guide you through the basics using SQLAlchemy and the IGraphWrapper graph implementation from graphscraper. IGraphWrapper requires the igraph project to be installed. You can do this by following the instructions on this page.

Once everything is in place, the analyzed graph can be created:

```Python
import igraph
from graphscraper.igraphwrapper import IGraphWrapper

graph = IGraphWrapper(igraph.Graph.Famous("Zachary"))
```

The next step is the creation of the cluster definition and the preparation of the clustering algorithm:

```Python
from localclustering.definitions.connectivity import ConnectivityClusterDefinition
from localclustering.localengine import LocalClusterEngine

cluster_definition = ConnectivityClusterDefinition(1.5, 0.85)
local_cluster_engine = LocalClusterEngine(
    cluster_definition,  # The cluster definition the algorithm should use.
    source_nodes_in_result=True,  # Ensure that source nodes are not removed from the cluster.
    max_cluster_size=34  # Specify an upper limit for the calculated cluster's size.
)
```

Now the source node of the clustering must be retrieved:

```Python
source_node = graph.nodes.get_node_by_name("2", can_validate_and_load=True)
```

And finally the cluster analysis can be executed:

```Python
cluster = local_cluster_engine.cluster([source_node])
```

Additionally, you can list the nodes inside the cluster together with their ranks to get an overview of the result:

```Python
rank_provider = local_cluster_engine.get_rank_provider()
for node in cluster.nodes:
    print(node.igraph_index, rank_provider.get_node_rank(node))
```

Example visualization of the result: the source node is diamond shaped, red nodes are part of the cluster, light blue nodes mark the neighborhood of the cluster, and the size of each node corresponds to its rank.

Additional resources

In addition to the software, a detailed description and an in-depth evaluation of the algorithms are also provided.

Furthermore, a demo module showing the basic usage of the project is also available.

Related projects

You can find related projects here:

Community guidelines

Any form of constructive contribution is welcome:

  • Questions, feedback, bug reports: please open an issue in the issue tracker of the project or contact the repository owner by email, whichever you feel appropriate.
  • Contribution to the software: please open an issue in the issue tracker of the project that describes the changes you would like to make to the software, then open a pull request with the changes. The description of the pull request must reference the corresponding issue.

The following types of contribution are especially appreciated:

  • Implementation of new cluster definitions.
  • Result comparison with global clustering algorithms on well-known and well-analyzed graphs.
  • Analysis of how cluster definitions should be configured for graphs with different characteristics.
  • Analysis of how the weighting coefficients of the connectivity-based cluster definition, which correspond to the different hierarchy levels, relate to each other in different real-world graphs.

License - GNU AGPLv3

The library is open-sourced under the conditions of the GNU Affero General Public License v3.0, which is the strongest copyleft license. The reason for using this license is that this library is the "publication" of the Hermina-Janos algorithm and it should be referenced accordingly.

Owner

  • Name: Peter Volf
  • Login: volfpeter
  • Kind: user

Python & TypeScript

JOSS Publication

Local clustering
Published
October 03, 2018
Volume 3, Issue 30, Page 960
Authors
Peter Volf ORCID
Independent developer
Editor
Pjotr Prins ORCID
Tags
graph theory clustering algorithm ranking

GitHub Events

Total
  • Watch event: 4
Last Year
  • Watch event: 4

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 47
  • Total Committers: 4
  • Avg Commits per committer: 11.75
  • Development Distribution Score (DDS): 0.255
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Péter Volf p****r@m****m 35
DannyArends D****s@G****m 8
volfpeter d****p@g****m 3
Arfon Smith a****n 1
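
The summary figures above can be reproduced from the Top Committers table. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; this page does not state the formula, so treat the definition and the helper name `development_distribution_score` as assumptions. Under that assumption, the listed values of 11.75 and 0.255 both follow from the table.

```python
def development_distribution_score(commit_counts):
    """Assumed definition: 1 - (top committer's share of commits); 0.0 if there are no commits."""
    total = sum(commit_counts)
    if total == 0:
        return 0.0
    return 1 - max(commit_counts) / total


# Commit counts copied from the Top Committers table.
all_time = {"Péter Volf": 35, "DannyArends": 8, "volfpeter": 3, "Arfon Smith": 1}

total_commits = sum(all_time.values())
avg_commits = total_commits / len(all_time)
dds = development_distribution_score(list(all_time.values()))

print(total_commits, avg_commits, round(dds, 3))  # → 47 11.75 0.255
```

With no commits in the past year, the same helper returns 0.0, which matches the Past Year figure.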
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: 17 minutes
  • Total issue authors: 0
  • Total pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.5
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • DannyArends (1)
  • arfon (1)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 43 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 5
  • Total maintainers: 1
pypi.org: localclustering

Python 3 implementation and documentation of the Hermina-Janos local graph clustering algorithm.

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 43 Last month
Rankings
Dependent packages count: 10.0%
Stargazers count: 13.9%
Average: 20.5%
Dependent repos count: 21.8%
Forks count: 22.6%
Downloads: 34.0%
Maintainers (1)
Last synced: 6 months ago

Dependencies

setup.py pypi
  • graphscraper >=0.5