louvain-split-communities-openmp
Design of OpenMP-based Parallel Louvain algorithm for community detection, with no internally-disconnected communities.
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
○.zenodo.json file
-
✓DOI references
Found 3 DOI reference(s) in README -
✓Academic publication links
Links to: arxiv.org, nature.com, ieee.org, zenodo.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (6.7%) to scientific vocabulary
Repository
Design of OpenMP-based Parallel Louvain algorithm for community detection, with no internally-disconnected communities.
Basic Info
- Host: GitHub
- Owner: puzzlef
- License: mit
- Language: C++
- Default Branch: main
- Homepage: https://arxiv.org/abs/2402.11454
- Size: 197 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Design of OpenMP-based Parallel Louvain algorithm for community detection, \ with no internally-disconnected communities.
Community detection entails the identification of clusters of vertices that exhibit stronger connections within themselves compared to the wider network. The Louvain method, a commonly utilized heuristic for this task, employs a two-step process comprising a local-moving phase and an aggregation phase. This process iteratively optimizes the modularity metric, a measure of community quality. Although widely used, the Louvain method has been noted for generating internally-disconnected and poorly connected communities. In response, Traag et al. introduce the Leiden algorithm, which incorporates an extra refinement phase between the local-moving and aggregation phases. This refinement step enables vertices to explore and potentially create sub-communities within the identified communities from the local-moving phase. In this report, we propose another BFS-based approach, GSP-Louvain, for addressing the same issue. This is different from a number of earlier works, which tackled internally-disconnected communities as a post-processing step.
Below we plot the time taken by the original Leiden, igraph Leiden, NetworKit Leiden, GSP-Louvain on 13 different graphs. GSP-Louvain surpasses the original Leiden, igraph Leiden, and NetworKit Leiden by 341×, 83×, and 6.1× respectively, achieving a processing rate of 328M edges/s on a 3.8𝐵 edge graph.
Below we plot the speedup of GSP-Louvain wrt original Leiden, igraph Leiden, and NetworKit Leiden.
Next, we compare the modularity of communities identified by the original Leiden algorithm, igraph Leiden, NetworKit Leiden, and GSP-Louvain. On average, GSP-Louvain achieves 0.3% lower modularity than the original Leiden and igraph Leiden, respectively, and 25% higher modularity than NetworKit Leiden, particularly evident on road networks and protein k-mer graphs.
Finally, we plot the fraction of disconnected communities identified by each implementation. Absence of bars indicates the absence of disconnected communities. The original Leiden, igraph Leiden, and GSP-Louvain do not identify any communities that are internally-disconnected. However, on average, NetworKit Leiden exhibit fraction of disconnected communities amounting to 1.5×10^−2, particularly on web graphs and social networks.
Refer to our technical reports for more details: \ An Approach for Addressing Internally-Disconnected Communities in Louvain Algorithm.
[!NOTE] You can just copy
main.shto your system and run it. \ For the code, refer tomain.cxx.
Code structure
The code structure of GSP-Louvain is as follows:
bash
- inc/_algorithm.hxx: Algorithm utility functions
- inc/_bitset.hxx: Bitset manipulation functions
- inc/_cmath.hxx: Math functions
- inc/_ctypes.hxx: Data type utility functions
- inc/_cuda.hxx: CUDA utility functions
- inc/_debug.hxx: Debugging macros (LOG, ASSERT, ...)
- inc/_iostream.hxx: Input/output stream functions
- inc/_iterator.hxx: Iterator utility functions
- inc/_main.hxx: Main program header
- inc/_mpi.hxx: MPI (Message Passing Interface) utility functions
- inc/_openmp.hxx: OpenMP utility functions
- inc/_queue.hxx: Queue utility functions
- inc/_random.hxx: Random number generation functions
- inc/_string.hxx: String utility functions
- inc/_utility.hxx: Runtime measurement functions
- inc/_vector.hxx: Vector utility functions
- inc/batch.hxx: Batch update generation functions
- inc/bfs.hxx: Breadth-first search algorithms
- inc/csr.hxx: Compressed Sparse Row (CSR) data structure functions
- inc/dfs.hxx: Depth-first search algorithms
- inc/duplicate.hxx: Graph duplicating functions
- inc/Graph.hxx: Graph data structure functions
- inc/louvian.hxx: Louvian algorithm functions
- inc/louvainSplit.hxx: Louvain with no disconnected communities
- inc/main.hxx: Main header
- inc/mtx.hxx: Graph file reading functions
- inc/properties.hxx: Graph Property functions
- inc/selfLoop.hxx: Graph Self-looping functions
- inc/symmetricize.hxx: Graph Symmetricization functions
- inc/transpose.hxx: Graph transpose functions
- inc/update.hxx: Update functions
- main.cxx: Experimentation code
- process.js: Node.js script for processing output logs
Note that each branch in this repository contains code for a specific experiment. The main branch contains code for the final experiment. If the intention of a branch in unclear, or if you have comments on our technical report, feel free to open an issue.
References
- Fast unfolding of communities in large networks; Vincent D. Blondel et al. (2008)
- Community Detection on the GPU; Md. Naim et al. (2017)
- Scalable Static and Dynamic Community Detection Using Grappolo; Mahantesh Halappanavar et al. (2017)
- From Louvain to Leiden: guaranteeing well-connected communities; V.A. Traag et al. (2019)
- CS224W: Machine Learning with Graphs | Louvain Algorithm; Jure Leskovec (2021)
- The University of Florida Sparse Matrix Collection; Timothy A. Davis et al. (2011)
- Fetch-and-add using OpenMP atomic operations
Owner
- Name: puzzlef
- Login: puzzlef
- Kind: organization
- Website: https://puzzlef.github.io/
- Repositories: 10
- Profile: https://github.com/puzzlef
A summary of experiments.
Citation (CITATION.bib)
@article{sahu2024approach,
title={An Approach for Addressing Internally-Disconnected Communities in Louvain Algorithm},
author={Sahu, Subhajit},
journal={arXiv preprint arXiv:2402.11454},
year={2024}
}
GitHub Events
Total
- Push event: 2
Last Year
- Push event: 2
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0




