spmv-acc

HIP acceleration of SpMV solver

https://github.com/hpcde/spmv-acc

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.3%) to scientific vocabulary

Repository


Basic Info
  • Host: GitHub
  • Owner: hpcde
  • License: apache-2.0
  • Language: C++
  • Default Branch: main
  • Size: 751 KB
Statistics
  • Stars: 10
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 2
Created almost 5 years ago · Last pushed about 1 year ago
Metadata Files
Readme Changelog License Citation

README.md

spmv-acc

HIP acceleration for SpMV solver.

Citing SpMV-acc

Please cite SpMV-acc in your publications if it helps your research. GitHub users can also click the "Cite this repository" link in the right sidebar of the repository landing page.

```bib
@inproceedings{chuspmvgpu:_2023,
  title = {Efficient Algorithm Design of Optimizing SpMV on GPU},
  isbn = {979-8-4007-0155-9/23/06},
  url = {http://doi.org/10.1145/3588195.3593002},
  doi = {10.1145/3588195.3593002},
  language = {en},
  urldate = {2023-6-20},
  booktitle = {Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing (HPDC '23), June 16--23, 2023, Orlando, FL, USA},
  publisher = {ACM Press},
  author = {Chu, Genshen and He, Yuanjie and Dong, Lingyu and Ding, Zhezhao and Chen, Dandan and Bai, He and Wang, Xuesong and Hu, Changjun},
  year = {2023},
  numpages = {14},
  series = {HPDC '23},
  address = {Orlando, Florida},
  location = {Orlando, FL, USA},
  pages = {1--14},
}
```

Build

Prerequisites

  • ROCm: version 3.x or higher. For example: module load compiler/rocm/3.9.1
  • HIP
  • CMake: version 3.6 or higher.

Download dependency

Before building, download clipp, which is used for command-line argument processing:

```bash
# pkg: https://github.com/genshen/pkg
pkg fetch
pkg install
```

Build steps

  • Build and verify on the GPU side. (Note: make sure the rocsparse library is loaded and that its version is greater than or equal to "1.19.4 for ROCm 4.1.0"):
    ```bash
    CC=clang CXX=hipcc cmake -DDEVICE_SIDE_VERIFY_FLAG=ON -DCMAKE_BUILD_TYPE=Release -B./build-hip -S./
    cmake --build ./build-hip
    ./build-hip/bin/spmv-cli examples/data/rajat03.csr -f csr
    ```

  • Build and verify on the CPU side:
    ```bash
    CC=clang CXX=hipcc cmake -DCMAKE_BUILD_TYPE=Release -B./build-hip -S./
    cmake --build ./build-hip
    ./build-hip/bin/spmv-cli examples/data/rajat03.csr -f csr
    ```

  • Build with a specific kernel strategy (e.g., the Adaptive strategy):
    ```bash
    CC=clang CXX=hipcc cmake -DKERNEL_STRATEGY=ADAPTIVE -DCMAKE_BUILD_TYPE=Release -B./build-hip-adaptive -S./
    cmake --build ./build-hip-adaptive
    ./build-hip-adaptive/bin/spmv-cli examples/data/rajat03.csr -f csr
    ```
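The build steps above verify results either on the GPU or on the CPU. As a rough illustration of what such a check involves, the sketch below compares a device result against a host reference using a relative tolerance. The helper name count_mismatches, its rel_tol parameter, and the scaling rule are assumptions made for this example; the actual verification logic in spmv-cli may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of spmv-acc): counts the entries of `actual`
// that deviate from `expected` by more than rel_tol * max(1, |expected[i]|).
// A size mismatch is reported as the larger of the two sizes.
std::size_t count_mismatches(const std::vector<double> &expected,
                             const std::vector<double> &actual,
                             double rel_tol) {
  assert(rel_tol >= 0.0);
  if (expected.size() != actual.size()) {
    return std::max(expected.size(), actual.size());
  }
  std::size_t mismatches = 0;
  for (std::size_t i = 0; i < expected.size(); ++i) {
    const double scale = std::max(1.0, std::fabs(expected[i]));
    const double diff = std::fabs(expected[i] - actual[i]);
    // Written as !(diff <= bound) so that a NaN result also counts as a mismatch.
    if (!(diff <= rel_tol * scale)) {
      ++mismatches;
    }
  }
  return mismatches;
}
```

A caller might treat a return value of 0 as a passing verification and report the count otherwise.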

Build with benchmark

```bash
# please remember to change WARP_SIZE in hola-hip on rocm platform
cmake --list-presets
cmake --preset=rocm-hipcc-benchmark # or use --preset=cuda-hipcc-benchmark
cmake --build --preset=rocm-hipcc-benchmark -j 4 # or use --preset=cuda-hipcc-benchmark
```

For Developers

Add a new kernel strategy

A kernel strategy is an algorithm for calculating SpMV on device side.
You can add another kernel strategy (algorithm) by following these steps:

  1. Edit src/configure.cmake to add a check for the new kernel strategy (e.g., a strategy named awesome_spmv).
     ```diff
     +elseif (KERNEL_STRATEGY_LOWER MATCHES "awesome_spmv")
     +    set(KERNEL_STRATEGY_AWESOME_SPMV ON)
      else ()
          MESSAGE(FATAL_ERROR "unsupported kernel strategy ${KERNEL_STRATEGY}")
      endif ()
     ```

  2. Edit src/building_config.h.in to generate the C/C++ macro definition for the corresponding strategy.
     ```diff
      #cmakedefine KERNEL_STRATEGY_DEFAULT
     +#cmakedefine KERNEL_STRATEGY_AWESOME_SPMV
     ```

  3. Edit src/acc/CMakeLists.txt and add the kernel strategy name, so that CMake can find the source files of the kernel strategy, e.g.:
     ```diff
      # all enabled strategies
      set(ENABLED_STRATEGIES default
     +    awesome_spmv
     ```

  4. Create a new directory named hip-awesome-spmv (replace '_' in the strategy name with '-') under the src/acc directory and place the code for the new strategy in this directory.

  5. Add a source_list.cmake file to the directory src/acc/hip-awesome-spmv to include the source files of the new strategy. Please refer to src/acc/hip/source_list.cmake for more details.

  6. Edit src/acc/strategy_picker.cpp to call the entry function of the corresponding strategy, e.g.:
     ```diff
      void sparse_spmv(int trans, const double alpha, const double beta, int m, int n, const int *rowptr,
                       const int *colindex, const double *value, const double *x, double *y) {
      #ifdef KERNEL_STRATEGY_DEFAULT
        default_sparse_spmv(trans, alpha, beta, m, n, rowptr, colindex, value, x, y);
      #endif
     +#ifdef KERNEL_STRATEGY_AWESOME_SPMV
     +  awesome_sparse_spmv(trans, alpha, beta, m, n, rowptr, colindex, value, x, y);
     +#endif
     ```
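When developing a new strategy, it can help to have a simple host-side reference to compare against. The sketch below mirrors the parameter list of sparse_spmv shown above and assumes the conventional semantics y = alpha * A * x + beta * y, with an m-by-n matrix A stored in zero-based CSR form and no transpose applied. The function name reference_sparse_spmv and these semantics are assumptions for illustration, not taken from the spmv-acc sources.

```cpp
#include <cassert>

// Hypothetical host-side reference (not part of spmv-acc).
// Computes y = alpha * A * x + beta * y for an m x n CSR matrix A.
// Assumptions: `trans` is ignored (no transpose), indices are zero-based,
// rowptr has m + 1 entries, and x / y have n / m entries respectively.
void reference_sparse_spmv(int trans, const double alpha, const double beta, int m, int n,
                           const int *rowptr, const int *colindex, const double *value,
                           const double *x, double *y) {
  (void)trans;
  (void)n;  // only used by the bounds check below, which vanishes under NDEBUG
  for (int row = 0; row < m; ++row) {
    double sum = 0.0;
    // Non-zeros of this row occupy [rowptr[row], rowptr[row + 1]).
    for (int k = rowptr[row]; k < rowptr[row + 1]; ++k) {
      assert(colindex[k] >= 0 && colindex[k] < n);
      sum += value[k] * x[colindex[k]];
    }
    y[row] = alpha * sum + beta * y[row];
  }
}
```

Running the same inputs through a new strategy's entry function and through this loop gives a quick sanity check on small matrices before moving on to larger benchmarks.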

Owner

  • Name: hpcde
  • Login: hpcde
  • Kind: organization

High Performance Computing and Data Engineering Lab of USTB

Citation (CITATION.cff)

cff-version: 1.2.0
title: Efficient Algorithm Design of Optimizing SpMV on GPU
message: "If you use this software, please cite it as below."
authors:
  - family-names: Chu
    given-names: Genshen
    orcid: 'https://orcid.org/0000-0003-0374-1894'
  - family-names: He
    given-names: Yuanjie
    orcid: 'https://orcid.org/0009-0003-7115-6846'
  - family-names: Dong
    given-names: Lingyu
    orcid: 'https://orcid.org/0000-0003-0919-553X'
  - family-names: Ding
    given-names: Zhezhao
    orcid: 'https://orcid.org/0000-0003-3437-8151'
  - family-names: Chen
    given-names: Dandan
    orcid: 'https://orcid.org/0000-0002-9847-5092'
  - family-names: Bai
    given-names: He
    orcid: 'https://orcid.org/0000-0001-5418-0375'
  - family-names: Wang
    given-names: Xuesong
    orcid: 'https://orcid.org/0009-0000-2811-557X'
  - family-names: Hu
    given-names: Changjun
    orcid: 'https://orcid.org/0000-0003-3857-7262'
identifiers:
  - type: doi
    value: 10.1145/3588195.3593002
repository-code: 'https://github.com/hpcde/spmv-acc'
abstract: >-
  Sparse matrix-vector multiplication (SpMV) is a
  fundamental building block for various numerical
  computing applications. However, most existing GPU-SpMV
  approaches may suffer from either long preprocessing
  overhead, load imbalance, format conversion, bad memory
  access patterns. In this paper, we proposed two new SpMV
  algorithms: flat and line-enhance, as well as their
  implementations, for GPU systems to overcome the above
  shortcomings. Our algorithms work directly on the CSR
  sparse matrix format. To achieve high performance: 1) for
  load balance, the flat algorithm uses non-zero splitting
  and line-enhance uses a mix of row and non-zero splitting;
  2) memory access patterns are designed for both algorithms
  for data loading, storing and reduction steps; and 3) an
  adaptive approach is proposed to select appropriate
  algorithm and parameters based on matrix characteristics.

  We evaluate our methods using the SuiteSparse Matrix
  Collection on AMD and NVIDIA GPU platforms. Average
  performance improvements of 424%, 741%, 49%, 46%, 72% are
  achieved when comparing our adaptive approach with
  CSR-Vector, CSR-Adaptive, HOLA, cuSparse and merge-based
  SpMV, respectively. In bandwidth tests, our approach can
  also achieve a high memory bandwidth, which is very close
  to the peak memory bandwidth.
keywords:
  - SpMV
  - GPU
  - linear algebra
  - sparse matrix
  - CSR
license: Apache-2.0
version: 0.6.0
doi: 10.1145/3588195.3593002
date-released: 2022-04-18
url: "https://github.com/hpcde/spmv-acc"
preferred-citation:
  type: conference-paper
  authors:
    - family-names: Chu
      given-names: Genshen
      orcid: 'https://orcid.org/0000-0003-0374-1894'
    - family-names: He
      given-names: Yuanjie
      orcid: 'https://orcid.org/0009-0003-7115-6846'
    - family-names: Dong
      given-names: Lingyu
      orcid: 'https://orcid.org/0000-0003-0919-553X'
    - family-names: Ding
      given-names: Zhezhao
      orcid: 'https://orcid.org/0000-0003-3437-8151'
    - family-names: Chen
      given-names: Dandan
      orcid: 'https://orcid.org/0000-0002-9847-5092'
    - family-names: Bai
      given-names: He
      orcid: 'https://orcid.org/0000-0001-5418-0375'
    - family-names: Wang
      given-names: Xuesong
      orcid: 'https://orcid.org/0009-0000-2811-557X'
    - family-names: Hu
      given-names: Changjun
      orcid: 'https://orcid.org/0000-0003-3857-7262'
  doi: 10.1145/3588195.3593002
  title: Efficient Algorithm Design of Optimizing SpMV on GPU
  isbn: 979-8-4007-0155-9/23/06
  url: http://doi.org/10.1145/3588195.3593002
  language: en
  urldate: 2023-6-20
  booktitle: Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing (HPDC '23), June 16--23, 2023, Orlando, FL, USA
  publisher: ACM Press
#   author: Chu, Genshen and He, Yuanjie and Dong, Lingyu and Ding, Zhezhao and Chen, Dandan and Bai, He and Wang, Xuesong and Hu, Changjun
  numpages: 14
  series: HPDC '23
  address: Orlando, Florida
  location: Orlando, FL, USA
  pages: 1--14
  year: 2023

GitHub Events

Total
  • Watch event: 4
  • Delete event: 17
  • Push event: 29
  • Pull request review event: 1
  • Pull request event: 29
  • Fork event: 1
  • Create event: 20
Last Year
  • Watch event: 4
  • Delete event: 17
  • Push event: 29
  • Pull request review event: 1
  • Pull request event: 29
  • Fork event: 1
  • Create event: 20

Dependencies

tools/suitesparse-dl/go.mod go
  • github.com/andybalholm/cascadia v1.3.1
  • github.com/genshen/cmds v0.0.0-20200505065256-d4c52690e15b
  • github.com/pterm/pterm v0.12.32
  • golang.org/x/net v0.0.0-20211020060615-d418f374d309
tools/suitesparse-dl/go.sum go
  • github.com/MarvinJWendt/testza v0.1.0
  • github.com/MarvinJWendt/testza v0.2.1
  • github.com/MarvinJWendt/testza v0.2.9
  • github.com/andybalholm/cascadia v1.3.1
  • github.com/atomicgo/cursor v0.0.1
  • github.com/davecgh/go-spew v1.1.0
  • github.com/davecgh/go-spew v1.1.1
  • github.com/genshen/cmds v0.0.0-20200505065256-d4c52690e15b
  • github.com/gookit/color v1.4.2
  • github.com/mattn/go-runewidth v0.0.13
  • github.com/pmezard/go-difflib v1.0.0
  • github.com/pterm/pterm v0.12.27
  • github.com/pterm/pterm v0.12.29
  • github.com/pterm/pterm v0.12.30
  • github.com/pterm/pterm v0.12.32
  • github.com/rivo/uniseg v0.2.0
  • github.com/stretchr/objx v0.1.0
  • github.com/stretchr/testify v1.6.1
  • github.com/stretchr/testify v1.7.0
  • github.com/xo/terminfo v0.0.0-20210125001918-ca9a967f8778
  • golang.org/x/net v0.0.0-20210916014120-12bc252f5db8
  • golang.org/x/net v0.0.0-20211020060615-d418f374d309
  • golang.org/x/sys v0.0.0-20201119102817-f84b799fce68
  • golang.org/x/sys v0.0.0-20210330210617-4fbd30eecc44
  • golang.org/x/sys v0.0.0-20210423082822-04245dca01da
  • golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1
  • golang.org/x/sys v0.0.0-20211013075003-97ac67df715c
  • golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1
  • golang.org/x/term v0.0.0-20210220032956-6a3ed077a48d
  • golang.org/x/term v0.0.0-20210615171337-6886f2dfbf5b
  • golang.org/x/term v0.0.0-20210927222741-03fcf44c2211
  • golang.org/x/text v0.3.6
  • golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e
  • gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405
  • gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c
  • gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b
.devcontainer/Dockerfile docker
  • nvidia/cuda 12.4.1-devel-ubuntu22.04 build