allocator-openmp
Design of multi-threaded memory allocator for supporting the implementation of dynamic graphs.
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
○.zenodo.json file
-
✓DOI references
Found 5 DOI reference(s) in README -
✓Academic publication links
Links to: arxiv.org, ieee.org, acm.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (9.5%) to scientific vocabulary
Repository
Design of multi-threaded memory allocator for supporting the implementation of dynamic graphs.
Basic Info
- Host: GitHub
- Owner: puzzlef
- License: mit
- Language: C++
- Default Branch: main
- Homepage: https://arxiv.org/abs/2502.13862
- Size: 11.7 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Design of multi-threaded memory allocator for supporting the implementation of dynamic graphs.
Research in graph-structured data has grown rapidly due to graphs' ability to represent complex real-world information and capture intricate relationships, particularly as many real-world graphs evolve dynamically through edge/vertex insertions and deletions. This has spurred interest in programming frameworks for managing, maintaining, and processing such dynamic graphs. In our report, we evaluate the performance of PetGraph (Rust), Stanford Network Analysis Platform (SNAP), SuiteSparse:GraphBLAS, cuGraph, Aspen, and our custom implementation in tasks including loading graphs from disk to memory, cloning loaded graphs, applying in-place edge deletions/insertions, and performing a simple iterative graph traversal algorithm. Our implementation demonstrates significant performance improvements: it outperforms PetGraph, SNAP, SuiteSparse:GraphBLAS, cuGraph, and Aspen by factors of 177x, 106x, 76x, 17x, and 3.3x in graph loading; 20x, 235x, 0.24x, 1.3x, and 0x in graph cloning; 141x/45x, 44x/25x, 13x/11x, 28x/34x, and 3.5x/2.2x in edge deletions/insertions; and 67x/63x, 86x/86x, 2.5x/2.6x, 0.25x/0.24x, and 1.3x/1.3x in traversal on updated graphs with deletions/insertions.
Below, we plot the relative runtime of various memory allocators across three different workloads: allocation-only, deallocation-only, and mixed. In the allocation-only workload, 2^28 allocations of 64 bytes each are performed, while in the deallocation-only 2^28 deallocations of previously allocated objects are performed. Finally, the mixed workload consists of 2^22 allocations followed by 2^22 deallocations, repeated 64 times. The allocators compared include the C library allocator (malloc()/free()), C++ runtime allocator (new[]/delete[]), Fixed Arena Allocator (FAA), variable-capacity Arena Allocator (AA), and Concurrent Power-of-2 Arena Allocator (CP2AA).
Our results show that specialized allocators like FAA, AA, and CP2AA significantly outperform general-purpose allocators (malloc()/new[]), especially in allocation-intensive workloads, achieving around a 4× speedup in mixed workloads. While AA delivers the best performance, its lack of thread safety makes it unsuitable for parallel applications. In contrast, CP2AA balances performance with thread safety, making it a suitable choice.
Refer to our technical report for more details: \ Performance Comparison of Graph Representations Which Support Dynamic Graph Updates.
[!NOTE] You can just copy
main.shto your system and run it. \ For the code, refer tomain.cxx.
Code structure
The code structure is as follows:
bash
- inc/_cmath.hxx: Math functions
- inc/_memory.hxx: The memory allocators (FAA, AA, CP2AA)
- inc/_main.hxx: Main utility functions header
- inc/main.hxx: Main header (yes, two main headers)
- main.cxx: Experimentation code
- process.js: Node.js script for processing output logs
Note that each branch in this repository contains code for a specific experiment. The main branch contains code for the final experiment. If the intention of a branch in unclear, or if you have comments on our technical report, feel free to open an issue.
References
- Algorithm 1037: SuiteSparse:GraphBLAS: Parallel Graph Algorithms in the Language of Sparse Linear Algebra; Timothy A. Davis et al. (2023)
- Low-latency graph streaming using compressed purely-functional trees; Laxman Dhulipala et al. (2019)
- cuGraph C++ primitives: vertex/edge-centric building blocks for parallel graph computing; Seunghwa Kang et al. (2023)
- SNAP: A General-Purpose Network Analysis and Graph-Mining Library; Jure Leskovec et al. (2016)
- The University of Florida Sparse Matrix Collection; Timothy A. Davis et al. (2011)
Owner
- Name: puzzlef
- Login: puzzlef
- Kind: organization
- Website: https://puzzlef.github.io/
- Repositories: 10
- Profile: https://github.com/puzzlef
A summary of experiments.
Citation (CITATION.bib)
@article{sahu2025performance,
title={Performance Comparison of Graph Representations Which Support Dynamic Graph Updates},
author={Sahu, Subhajit},
journal={arXiv preprint arXiv:2502.13862},
year={2025}
}
GitHub Events
Total
- Public event: 1
- Push event: 2
Last Year
- Public event: 1
- Push event: 2
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: about 1 month
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 4.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: about 1 month
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 4.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- wolfram77 (1)
