parallel-clustering-hybrid
Hybrid Parallel MPI and OpenMP implementations of clustering algorithms. This code was developed for my university thesis.
Science Score: 67.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 2 DOI reference(s) in README
- ✓ Academic publication links: links to researchgate.net
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.5%) to scientific vocabulary
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Hybrid Parallel Clustering Algorithms
Title: Development and Evaluation of Parallel Clustering Algorithms in Hybrid Environment using OpenMP and MPI
Abstract: This thesis will cover the design, development, and evaluation of efficient algorithms for the data clustering problem in three parallel environments: shared memory, distributed memory, and hybrid (massively parallel programming in a combined distributed/shared-memory environment). The selected algorithms will be developed in C/C++ and evaluated in a suitable real environment. Individual implementations in OpenMP and/or MPI will be developed, as well as combined implementations using, for example, MPI+OpenMP and/or MPI+MPI Shared Memory, and comparative measurements and conclusions will be drawn.
- Thesis (in Greek): https://polynoe.lib.uniwa.gr/xmlui/handle/11400/8820
For this thesis, parallel implementations were written for the K-means and CURE clustering algorithms. Further parallel clustering algorithms may be added to this repository.
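To make the K-means parallelization concrete, the following is a minimal sketch of one way the shared-memory variant could be structured. It is not the repository's code: the `Point` struct, the `kmeans` function, its `maxIter` parameter, and the 2-D input are all assumptions for illustration, and the stopping rule assumes the distance threshold means "stop once no centroid moves farther than this value". Only the assignment step is parallelized here; in an MPI variant, the per-cluster sums and counts would typically be combined across processes with a reduction such as `MPI_Allreduce`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical 2-D point type (the repository's data layout may differ).
struct Point { double x, y; };

// Squared Euclidean distance between two points.
static double sqDist(const Point& a, const Point& b) {
    const double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Lloyd-style k-means. Iterates until no centroid moves farther than
// `threshold` (assumed meaning of the distance threshold) or `maxIter`
// iterations have run. Updates `centroids` in place and returns the
// cluster index assigned to each point.
std::vector<int> kmeans(const std::vector<Point>& pts,
                        std::vector<Point>& centroids,
                        double threshold, int maxIter = 100) {
    assert(!centroids.empty());
    const long n = static_cast<long>(pts.size());
    const int k = static_cast<int>(centroids.size());
    std::vector<int> label(pts.size(), 0);

    for (int iter = 0; iter < maxIter; ++iter) {
        // Assignment step: every point is independent, so the loop can be
        // split across OpenMP threads without synchronization.
        #pragma omp parallel for
        for (long i = 0; i < n; ++i) {
            double best = std::numeric_limits<double>::max();
            for (int c = 0; c < k; ++c) {
                const double d = sqDist(pts[i], centroids[c]);
                if (d < best) { best = d; label[i] = c; }
            }
        }

        // Update step (kept serial in this sketch): mean of each cluster.
        std::vector<Point> sum(k, Point{0.0, 0.0});
        std::vector<long> count(k, 0);
        for (long i = 0; i < n; ++i) {
            sum[label[i]].x += pts[i].x;
            sum[label[i]].y += pts[i].y;
            ++count[label[i]];
        }

        double maxShift = 0.0;
        for (int c = 0; c < k; ++c) {
            if (count[c] == 0) continue;  // leave an empty cluster's centroid as is
            const Point next{sum[c].x / count[c], sum[c].y / count[c]};
            maxShift = std::fmax(maxShift, std::sqrt(sqDist(next, centroids[c])));
            centroids[c] = next;
        }
        if (maxShift <= threshold) break;  // converged
    }
    return label;
}
```

Compiled without OpenMP support, the pragma is ignored and the loop runs serially, which makes the sketch easy to check against a threaded build (for example with `-fopenmp` on GCC).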
Instructions
Compile the code in the src folder using the Makefile:
```
make
```
How to execute:
```
./KmeansSerial <filename> <clusters> <distance threshold>
./KmeansOpenMP <filename> <clusters> <distance threshold> <OpenMP threads>
mpirun -n <processes> ./KmeansMPI <filename> <clusters> <distance threshold>
mpirun -n <processes> ./KmeansHybrid <filename> <clusters> <distance threshold> <OpenMP threads>
./Cure_Serial <filename> <clusters> <representatives> <shrink fraction>
./Cure_OpenMP <filename> <clusters> <representatives> <shrink fraction> <OpenMP threads>
mpirun -n <processes> ./CureMPI <filename> <clusters> <representatives> <shrink fraction>
mpirun -n <processes> ./CureHybrid <filename> <clusters> <representatives> <shrink fraction> <OpenMP threads>
```
Example Runs:
```
./KmeansSerial.out ../inputs/test1K.txt 3 1
./KmeansOpenMP.out ../inputs/test1K.txt 3 1 4
mpirun -n 4 ./KmeansMPI.out ../inputs/test1K.txt 3 1
mpirun -n 4 ./KmeansHybrid.out ../inputs/test1K.txt 3 1 4

./Cure_Serial.out ../inputs/test1K.txt 3 5 0.4
./Cure_OpenMP.out ../inputs/test1K.txt 3 5 0.4 4
mpirun -n 4 ./CureMPI.out ../inputs/test1K.txt 3 5 0.4
mpirun -n 4 ./CureHybrid.out ../inputs/test1K.txt 3 5 0.4 4
```
If gnuplot is available, a scatter plot is saved in the src/output folder.
A text file with the terminal output is also saved in the src/output folder.
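The representatives and shrink fraction arguments of the CURE programs correspond to two steps of the standard CURE algorithm: choosing a fixed number of well-scattered points per cluster, then moving each of them toward the cluster centroid by the shrink fraction. The sketch below illustrates those two steps under stated assumptions. The `Vec2` type and the functions `centroidOf`, `pickScattered`, and `shrinkRepresentatives` are hypothetical names, the input is assumed to be 2-D, and this is not taken from the repository's source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D point type (assumption for illustration).
struct Vec2 { double x, y; };

// Squared Euclidean distance between two points.
static double sqDist(const Vec2& a, const Vec2& b) {
    const double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Mean of a non-empty set of points.
Vec2 centroidOf(const std::vector<Vec2>& pts) {
    assert(!pts.empty());
    Vec2 c{0.0, 0.0};
    for (const Vec2& p : pts) { c.x += p.x; c.y += p.y; }
    c.x /= static_cast<double>(pts.size());
    c.y /= static_cast<double>(pts.size());
    return c;
}

// Picks up to `r` well-scattered points: first the point farthest from the
// centroid, then repeatedly the point farthest from all points chosen so far.
std::vector<Vec2> pickScattered(const std::vector<Vec2>& pts, std::size_t r) {
    const Vec2 c = centroidOf(pts);
    std::vector<Vec2> reps;
    std::vector<double> minD(pts.size());
    for (std::size_t i = 0; i < pts.size(); ++i) minD[i] = sqDist(pts[i], c);

    while (reps.size() < r && reps.size() < pts.size()) {
        std::size_t far = 0;
        for (std::size_t i = 1; i < pts.size(); ++i)
            if (minD[i] > minD[far]) far = i;
        reps.push_back(pts[far]);
        // Track each point's distance to its nearest chosen representative.
        // The first pick overwrites the centroid distances used for seeding.
        for (std::size_t i = 0; i < pts.size(); ++i) {
            const double d = sqDist(pts[i], pts[far]);
            if (reps.size() == 1 || d < minD[i]) minD[i] = d;
        }
    }
    return reps;
}

// Moves each representative toward centroid `c` by `shrink`:
// 0 leaves it unchanged, 1 collapses it onto the centroid.
std::vector<Vec2> shrinkRepresentatives(const std::vector<Vec2>& reps,
                                        const Vec2& c, double shrink) {
    assert(shrink >= 0.0 && shrink <= 1.0);
    std::vector<Vec2> out;
    out.reserve(reps.size());
    for (const Vec2& p : reps)
        out.push_back(Vec2{p.x + shrink * (c.x - p.x), p.y + shrink * (c.y - p.y)});
    return out;
}
```

With the example arguments `5 0.4`, each cluster would keep five representatives, each moved 40% of the way toward its centroid. Shrinking dampens the effect of outliers when the distance between clusters is measured through their representatives.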
Resources
- Hadjidoukas, Panagiotis & Amsaleg, Laurent. (2006). Parallelization of a Hierarchical Data Clustering Algorithm Using OpenMP. Vol. 4315, pp. 289-299. doi:10.1007/978-3-540-68555-5_24
- Zhang, Jing & Wu, Gongqing & Xuegang, Hu & Li, Shiying & Hao, Shuilong. (2011). A Parallel K-Means Clustering Algorithm with MPI. doi:10.1109/PAAP.2011.17
Owner
- Name: Lefti
- Login: Lefti97
- Kind: user
- Location: Athens, Greece
- Company: University of West Attica
- Repositories: 2
- Profile: https://github.com/Lefti97
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Vangelis"
    given-names: "Lefteris"
title: "parallel-clustering-hybrid"
version: 1.0.0
date-released: 2025-03-08
url: "https://github.com/Lefti97/parallel-clustering-hybrid"
GitHub Events
Total
- Watch event: 1
- Push event: 2
- Public event: 1
Last Year
- Watch event: 1
- Push event: 2
- Public event: 1