rak-communities-cuda
Design of CUDA-based Label Propagation Algorithm (LPA), aka RAK, for community detection.
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
○.zenodo.json file
-
✓DOI references
Found 2 DOI reference(s) in README -
✓Academic publication links
Links to: arxiv.org -
○Committers with academic emails
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (7.9%) to scientific vocabulary
Repository
Design of CUDA-based Label Propagation Algorithm (LPA), aka RAK, for community detection.
Basic Info
- Host: GitHub
- Owner: puzzlef
- License: mit
- Language: C++
- Default Branch: main
- Homepage: http://arxiv.org/abs/2411.11468
- Size: 247 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Design of CUDA-based Label Propagation Algorithm (LPA), aka RAK, for community detection.
Community detection is the problem of identifying natural divisions in networks. Efficient parallel algorithms for identifying such divisions are critical in a number of applications. This report presents an optimized implementation of the Label Propagation Algorithm (LPA) for community detection, featuring an asynchronous LPA with a Pick-Less (PL) method every 4 iterations to handle community swaps, ideal for SIMT hardware like GPUs. It also introduces a novel per-vertex hashtable with hybrid quadratic-double probing for collision resolution. On an NVIDIA A100 GPU, our implementation, ν-LPA, outperforms FLPA, NetworKit LPA, and GVE-LPA by 364x, 62x, and 2.6x, respectively, on a server with dual 16-core Intel Xeon Gold 6226R processors - processing 3.0B edges/s on a 2.2B edge graph - and achieves 4.7% higher modularity than FLPA, but 6.1% and 2.2% lower than NetworKit LPA and GVE-LPA.
Below we plot the time taken by FLPA, NetworKit LPA, GVE-LPA, and ν-LPA on 13 different graphs. ν-LPA surpasses FLPA, NetworKit LPA, and GVE-LPA by 364, 62×, and 2.6× respectively, achieving a processing rate of 3.0B edges/s on a 2.2𝐵 edge graph.
Below we plot the speedup of ν-LPA wrt FLPA, NetworKit LPA, and GVE-LPA.
Next, we plot the modularity of communities identified by FLPA, NetworKit LPA, GVE-LPA, and ν-LPA. ν-LPA on average obtains 4.7%higher modularity than FLPA, but 6.1% / 2.2% lower modularity than NetworKit LPA / GVE-LPA. We recommend employing 𝜈-LPA on web graphs and social networks. For road networks, however, GVE-LPA appears to be the most effective, while NetworKit LPA is recommended for protein k-mer graphs.
Refer to our technical report for more details: \ ν-LPA: Fast GPU-based Label Propagation Algorithm (LPA) for Community Detection.
[!NOTE] You can just copy
main.shto your system and run it. \ For the code, refer tomain.cxx.
Code structure
The code structure of ν-LPA is as follows:
bash
- inc/_algorithm.hxx: Algorithm utility functions
- inc/_bitset.hxx: Bitset manipulation functions
- inc/_cmath.hxx: Math functions
- inc/_ctypes.hxx: Data type utility functions
- inc/_cuda.hxx: CUDA utility functions
- inc/_debug.hxx: Debugging macros (LOG, ASSERT, ...)
- inc/_iostream.hxx: Input/output stream functions
- inc/_iterator.hxx: Iterator utility functions
- inc/_main.hxx: Main program header
- inc/_mpi.hxx: MPI (Message Passing Interface) utility functions
- inc/_openmp.hxx: OpenMP utility functions
- inc/_queue.hxx: Queue utility functions
- inc/_random.hxx: Random number generation functions
- inc/_string.hxx: String utility functions
- inc/_utility.hxx: Runtime measurement functions
- inc/_vector.hxx: Vector utility functions
- inc/batch.hxx: Batch update generation functions
- inc/bfs.hxx: Breadth-first search algorithms
- inc/csr.hxx: Compressed Sparse Row (CSR) data structure functions
- inc/dfs.hxx: Depth-first search algorithms
- inc/duplicate.hxx: Graph duplicating functions
- inc/Graph.hxx: Graph data structure functions
- inc/rak.hxx: LPA/RAK community detection algorithm functions
- inc/rakCuda.hxx: CUDA implementation of LPA (ν-LPA)
- inc/hashtableCuda.hxx: Open addressing hashtable functions, with quadratic-double probing
- inc/split.hxx: Algorithms to split internally-disconnected communities
- inc/main.hxx: Main header
- inc/mtx.hxx: Graph file reading functions
- inc/properties.hxx: Graph Property functions
- inc/selfLoop.hxx: Graph Self-looping functions
- inc/symmetricize.hxx: Graph Symmetricization functions
- inc/transpose.hxx: Graph transpose functions
- inc/update.hxx: Update functions
- main.cxx: Experimentation code
- process.js: Node.js script for processing output logs
Note that each branch in this repository contains code for a specific experiment. The main branch contains code for the final experiment. If the intention of a branch in unclear, or if you have comments on our technical report, feel free to open an issue.
References
- Near linear time algorithm to detect community structures in large-scale networks; Raghavan et al. (2007)
- The University of Florida Sparse Matrix Collection; Davis et al. (2011)
- How to import VSCode keybindings into Visual Studio?
- Configure X11 Forwarding with PuTTY and Xming
- Installing snap on CentOS
Owner
- Name: puzzlef
- Login: puzzlef
- Kind: organization
- Website: https://puzzlef.github.io/
- Repositories: 10
- Profile: https://github.com/puzzlef
A summary of experiments.
Citation (CITATION.bib)
@article{sahu2024nulpa,
title={$\nu$-LPA: Fast GPU-based Label Propagation Algorithm (LPA) for Community Detection},
author={Sahu, Subhajit},
journal={arXiv preprint arXiv:2411.11468},
year={2024}
}
GitHub Events
Total
- Push event: 1
- Public event: 1
Last Year
- Push event: 1
- Public event: 1
Committers
Last synced: 10 months ago
Top Committers
| Name | Commits | |
|---|---|---|
| Subhajit Sahu | w****7@g****m | 10 |
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: 6 days
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: 6 days
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- wolfram77 (1)



