cudamatrixtranspose

Optimizing matrix transposition on GPU with CUDA (University of Trento, Italy)

https://github.com/lucazzola/cudamatrixtranspose

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.4%) to scientific vocabulary

Keywords

cuda-programming matrix-transposition parallel-programming
Last synced: 4 months ago

Repository


Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

Matrix transposition: from sequential to parallel with CUDA

This repository contains all the material for both homeworks on matrix transposition assigned in the GPU Computing course at the University of Trento (Italy), a.y. 2023/2024.

To see the report and better understand what this work is about, click Here

Matrix Transposition


How to use

Download the repository: `git clone https://github.com/LuCazzola/cudaMatrixTranspose.git`

The hierarchy of the project's relevant files is as follows:

```bash
.
├── bin                      # final executables
│   └── ...
├── obj                      # intermediate object files
│   └── ...
└── src                      # source code
│   ├── headers              # header files
│   │   └── ...
│   ├── benchmark.c          # produce an output file according to options in "run_benchmark.sh"
│   ├── benchmarkgpu.cu      # produce an output file according to options in "launch_benchmark.sh"
│   .
│   ├── main.c               # test the functions according to options in "run_main.sh"
│   ├── maingpu.cu           # test the functions according to options in "launch_main.sh"
│   .
│   ├── transpose.c          # functions to compute the transpose of a given matrix
│   ├── transposegpu.c
│   .
│   ├── matrix.c             # definition of methods to handle matrices
│   ├── optparser.c          # command line parameter parsing
│   .
│   └── commoncuda.cu        # defines some common functions for cuda methods
│
├── run_benchmark.sh         # set parameters related to "benchmark.c" and run the script
├── run_main.sh              # set parameters related to "main.c" and run the script
├── run_cache_benchmark.sh   # run cachegrind to benchmark cache miss % on specified function
│   .
├── launch_benchmark.sh      # set parameters related to "benchmarkgpu.cu" and run the script on SLURM system
├── launch_main.sh           # set parameters related to "maingpu.cu" and run the script on SLURM system
│   .
├── data                     # data gathered via "run_benchmark.sh" & "launch_benchmark.sh"
│   └── ...
├── plot_data.py             # generates graphs using the data stored in "data" folder
│
├── Makefile
└── ...
```

Main commands

The Makefile defines 4 rules:

  • `make`: builds the object files and the homework-1 and homework-2 executables
  • `make debug`: builds the object files and ALL executables with debugging flags
  • `make benchmark`: builds the object files and the benchmark and benchmark_gpu executables
  • `make clean`: removes all object files

There are several pre-set scripts to choose from:

  • CPU scripts section (Homework-1)
  • GPU scripts section (Homework-2)



CPU test commands (Homework-1)

NOTE

Move into the repository before running the scripts: `cd cudaMatrixTranspose`



COMMANDS

The "run_main.sh" script sets the parameters for the homework-1 executable and runs it.
To change the run parameters and better understand what the script does, see run_main.sh. Build and run with `make`, then `./run_main.sh`.


The "run_benchmark.sh" script sets the parameters for the benchmark executable and runs it.
The extracted data is saved in the data folder.
To change the run parameters and better understand what the script does, see run_benchmark.sh. Build and run with `make benchmark`, then `./run_benchmark.sh`.
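
As a rough illustration of how a mean execution time could be gathered, the sketch below times a routine over several runs and averages the result. It is not taken from benchmark.c: the names `bench_fn`, `mean_exec_time_sketch` and `count_call` are hypothetical, and the standard `clock()` function (CPU time) is assumed to be an acceptable timer for single-threaded code.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: the routine being timed. */
typedef void (*bench_fn)(void *ctx);

/* Returns the mean CPU time, in seconds, of `runs` calls to fn(ctx). */
double mean_exec_time_sketch(bench_fn fn, void *ctx, int runs)
{
    assert(fn != NULL && runs > 0);
    double total = 0.0;
    for (int r = 0; r < runs; r++) {
        clock_t start = clock();
        fn(ctx);
        clock_t end = clock();
        total += (double)(end - start) / CLOCKS_PER_SEC;
    }
    return total / runs;
}

/* Hypothetical example workload: counts how many times it was invoked. */
void count_call(void *ctx)
{
    ++*(int *)ctx;
}
```

In a real benchmark the callback would wrap the transpose call, and the context pointer would carry the matrices and their size.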


The "run_cache_benchmark.sh" script sets the parameters for the homework-1 executable and runs Cachegrind on it, extracting function-level information about cache misses inside the transposenaive() or transposeblocks() function (according to the chosen "method" parameter).
To change the run parameters and better understand what the script does, see run_cache_benchmark.sh. Build and run with `make clean`, `make debug`, then `./run_cache_benchmark.sh`.
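
To make the difference between the two methods concrete, here is a minimal sketch of a naive and a blocked transpose. It is not the code from transpose.c: the function names and signatures are hypothetical, and it assumes a square, row-major matrix of `float` elements transposed out-of-place.

```c
#include <assert.h>
#include <stddef.h>

/* Naive transpose of an n x n row-major matrix: reads are contiguous,
   but consecutive writes are n elements apart. */
void transpose_naive_sketch(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            out[j * n + i] = in[i * n + j];
}

/* Blocked transpose: the matrix is walked in block x block tiles so that
   the rows touched by one tile are more likely to stay in cache.
   Edge tiles are clipped when n is not a multiple of block. */
void transpose_blocks_sketch(const float *in, float *out, size_t n, size_t block)
{
    assert(block > 0);
    for (size_t bi = 0; bi < n; bi += block) {
        for (size_t bj = 0; bj < n; bj += block) {
            size_t i_end = (bi + block < n) ? bi + block : n;
            size_t j_end = (bj + block < n) ? bj + block : n;
            for (size_t i = bi; i < i_end; i++)
                for (size_t j = bj; j < j_end; j++)
                    out[j * n + i] = in[i * n + j];
        }
    }
}
```

Both functions produce the same output; only the memory access order differs, which is the kind of effect a Cachegrind run would be expected to expose.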



GPU test commands (Homework-2)

NOTE

Note that the following commands are meant to be run on the Marzola DISI cluster. If needed, modify the launch_main.sh and launch_benchmark.sh scripts to change the partition or to target a different SLURM system.

From outside the cloned project folder, upload the project's directory to the login node: `scp -r cudaMatrixTranspose <YOUR USERNAME>@marzola.disi.unitn.it:/home/<YOUR USERNAME>`. Then log in, move into the project's folder, and load the CUDA module: `cd cudaMatrixTranspose`, then `module load cuda`.



COMMANDS

The "launch_main.sh" script sets the parameters for the homework-2 executable and runs it.
To change the run parameters and better understand what the script does, see launch_main.sh. Build and submit with `make`, then `sbatch launch_main.sh`. To view the results once the job has finished, run `cat output.out`.

The "launch_benchmark.sh" script sets the parameters for the benchmark_gpu executable and runs it.
The extracted data is saved in the data folder.
To change the run parameters and better understand what the script does, see launch_benchmark.sh. Build and submit with `make benchmark`, then `sbatch launch_benchmark.sh`.



Graph Plotting

The project's directory also contains a Python script that takes the content of the data folder and generates 2 types of graphs:

  • x : Matrix size - y : Mean execution time
  • x : Matrix size - y : Mean effective bandwidth


Test it by running (on your own device) `python3 plot_data.py`. You can customize which information to plot inside the script.
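
For reference, the effective bandwidth of a transpose is commonly derived from the measured execution time. The sketch below assumes the usual definition in which every element is read once and written once; the function name `effective_bandwidth_gbs` is hypothetical and the formula is an assumption about how the plotted values are obtained, not a statement about the repository's code.

```c
#include <assert.h>
#include <stddef.h>

/* Effective bandwidth in GB/s of a transpose, assuming each element is
   read once and written once (2 * rows * cols * elem_size bytes moved). */
double effective_bandwidth_gbs(size_t rows, size_t cols,
                               size_t elem_size, double seconds)
{
    assert(seconds > 0.0);
    double bytes_moved = 2.0 * (double)rows * (double)cols * (double)elem_size;
    return bytes_moved / seconds / 1e9;
}
```

For example, a 1024 x 1024 matrix of 4-byte elements transposed in 1 ms moves 8,388,608 bytes, which corresponds to about 8.39 GB/s.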



Extra Customization

It's also possible to change some other parameters at compile time (optimization level and matrix element data type) by changing some variables in the Makefile.
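
One common way to make the element type selectable at compile time is a preprocessor macro that the build system overrides with a `-D` flag. The sketch below illustrates that pattern only; the macro name `MATRIX_ELEM_DTYPE`, the typedef `matrix_element` and the helper `matrix_bytes` are hypothetical and may not match the names used in this project's Makefile or headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro name: a makefile could override it by passing, e.g.,
   -DMATRIX_ELEM_DTYPE=double to the compiler. Defaults to float. */
#ifndef MATRIX_ELEM_DTYPE
#define MATRIX_ELEM_DTYPE float
#endif

typedef MATRIX_ELEM_DTYPE matrix_element;

/* Size in bytes of a rows x cols matrix of the selected element type. */
size_t matrix_bytes(size_t rows, size_t cols)
{
    return rows * cols * sizeof(matrix_element);
}
```

The optimization level is usually controlled the same way, through a makefile variable that holds a flag such as `-O2` or `-O3`.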

Owner

  • Login: LuCazzola
  • Kind: user

Citation (CITATION.cff)

@software{matTrans_LucaC,
  author = {Luca Cazzola},
  month = {5},
  title = {{cuda inplace matrix transpose}},
  url = {https://github.com/LuCazzola/cudaMatrixTranspose},
  version = {1.0},
  year = {2024}
}
