leiden-communities-openmp-heuristic-dynamic
Design of OpenMP-based Dynamic Leiden algorithm for community detection, with speedy heuristics.
https://github.com/puzzlef/leiden-communities-openmp-heuristic-dynamic
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
○.zenodo.json file
-
✓DOI references
Found 2 DOI reference(s) in README -
✓Academic publication links
Links to: arxiv.org, nature.com, ieee.org -
○Committers with academic emails
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (6.8%) to scientific vocabulary
Repository
Design of OpenMP-based Dynamic Leiden algorithm for community detection, with speedy heuristics.
Basic Info
- Host: GitHub
- Owner: puzzlef
- License: mit
- Language: C++
- Default Branch: main
- Homepage: https://arxiv.org/abs/2410.15451
- Size: 97.7 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Design of OpenMP-based Dynamic Leiden algorithm for community detection, with speedy heuristics.
Community detection, or clustering, identifies groups of nodes in a graph that are more densely connected to each other than to the rest of the network. Given the size and dynamic nature of real-world graphs, efficient community detection is crucial for tracking evolving communities, enhancing our understanding and management of complex systems. The Leiden algorithm, which improves upon the Louvain algorithm, efficiently detects communities in large networks, producing high-quality structures. However, existing multicore dynamic community detection algorithms based on Leiden are inefficient and lack support for tracking evolving communities. This technical report introduces the first implementations of parallel Naive-dynamic (ND), Delta-screening (DS), and Dynamic Frontier (DF) Leiden algorithms that efficiently track communities over time. Experiments on a 64-core AMD EPYC-7742 processor demonstrate that ND, DS, and DF Leiden achieve average speedups of 3.9x, 4.4x, and 6.1x, respectively, on large graphs with random batch updates compared to the Static Leiden algorithm, and these approaches scale at 1.4 - 1.5x for every thread doubling.
Below we plot the time taken by our improved Naive-dynamic (ND), Delta-screening (DS), and Dynamic Frontier (DF) Leiden, against Static Leiden on 12 large graphs with random batch updates of size 10^-7|E| to 0.1|E|, consisting of 80% edge insertions and 20% deletions. We observe that ND, DS, and DF Leiden achieve mean speedups of 3.9×,4.4×, and 6.1×, respectively, when compared to Static Leiden. This speedup is higher on smaller batch updates, with ND, DS, and DF Leiden being on average 4.7×, 7.1×, and 11.6×, respectively, when compared to Static Leiden, on batch updates of size 10^−7|E|.
Next, we plot the modularity of communities identified by our improved ND, DS, and DF Leiden, compared to Static Leiden. On average, the modularity of communities from Static Leiden is slightly lower. However, we notice that modularity of communities identified by our algorithms do not differ by more than 0.002 from Static Leiden, on average.
Refer to our technical report for more details: \ Heuristic-based Dynamic Leiden Algorithm for Efficient Tracking of Communities on Evolving Graphs.
[!NOTE] You can just copy
main.shto your system and run it. \ For the code, refer tomain.cxx.
Code structure
The code structure is as follows:
bash
- inc/_algorithm.hxx: Algorithm utility functions
- inc/_bitset.hxx: Bitset manipulation functions
- inc/_cmath.hxx: Math functions
- inc/_ctypes.hxx: Data type utility functions
- inc/_cuda.hxx: CUDA utility functions
- inc/_debug.hxx: Debugging macros (LOG, ASSERT, ...)
- inc/_iostream.hxx: Input/output stream functions
- inc/_iterator.hxx: Iterator utility functions
- inc/_main.hxx: Main program header
- inc/_mpi.hxx: MPI (Message Passing Interface) utility functions
- inc/_openmp.hxx: OpenMP utility functions
- inc/_queue.hxx: Queue utility functions
- inc/_random.hxx: Random number generation functions
- inc/_string.hxx: String utility functions
- inc/_utility.hxx: Runtime measurement functions
- inc/_vector.hxx: Vector utility functions
- inc/batch.hxx: Batch update generation functions
- inc/bfs.hxx: Breadth-first search algorithms
- inc/csr.hxx: Compressed Sparse Row (CSR) data structure functions
- inc/dfs.hxx: Depth-first search algorithms
- inc/duplicate.hxx: Graph duplicating functions
- inc/Graph.hxx: Graph data structure functions
- inc/leiden.hxx: Leiden algorithm functions
- inc/louvian.hxx: Louvian algorithm functions
- inc/main.hxx: Main header
- inc/mtx.hxx: Graph file reading functions
- inc/properties.hxx: Graph Property functions
- inc/selfLoop.hxx: Graph Self-looping functions
- inc/snap.hxx: SNAP dataset reading functions
- inc/symmetricize.hxx: Graph Symmetricization functions
- inc/transpose.hxx: Graph transpose functions
- inc/update.hxx: Update functions
- main.cxx: Experimentation code
- process.js: Node.js script for processing output logs
Note that each branch in this repository contains code for a specific experiment. The main branch contains code for the final experiment. If the intention of a branch in unclear, or if you have comments on our technical report, feel free to open an issue.
References
- Fast unfolding of communities in large networks; Vincent D. Blondel et al. (2008)
- Community Detection on the GPU; Md. Naim et al. (2017)
- Scalable Static and Dynamic Community Detection Using Grappolo; Mahantesh Halappanavar et al. (2017)
- From Louvain to Leiden: guaranteeing well-connected communities; V.A. Traag et al. (2019)
- CS224W: Machine Learning with Graphs | Louvain Algorithm; Jure Leskovec (2021)
- The University of Florida Sparse Matrix Collection; Timothy A. Davis et al. (2011)
- Fetch-and-add using OpenMP atomic operations
Owner
- Name: puzzlef
- Login: puzzlef
- Kind: organization
- Website: https://puzzlef.github.io/
- Repositories: 10
- Profile: https://github.com/puzzlef
A summary of experiments.
Citation (CITATION.bib)
@article{sahu2024starting,
title={Heuristic-based Dynamic Leiden Algorithm for Efficient Tracking of Communities on Evolving Graphs},
author={Sahu, Subhajit},
journal={arXiv preprint arXiv:2410.15451},
year={2024}
}
GitHub Events
Total
- Issues event: 1
- Watch event: 1
- Push event: 1
- Public event: 1
Last Year
- Issues event: 1
- Watch event: 1
- Push event: 1
- Public event: 1
Committers
Last synced: 10 months ago
Top Committers
| Name | Commits | |
|---|---|---|
| Subhajit Sahu | w****7@g****m | 19 |
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: 11 days
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 2.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: 11 days
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 2.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- wolfram77 (1)


