posit_gpu

https://github.com/dadeba/posit_gpu


Repository

Basic Info

Host: GitHub
Owner: dadeba
License: bsd-3-clause
Language: C++
Default Branch: main
Size: 58.6 KB

Statistics

Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created almost 2 years ago · Last pushed almost 2 years ago


GEMM Routines in Posit for GPUs

We have ported the addition and multiplication routines from SoftPosit to OpenCL kernels, and we have also written GEMM routines in 32-bit Posit arithmetic. These programs were used for the benchmarks in our paper presented at HPC Asia 2024. The paper is also available on arXiv.
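To make the number format concrete, here is a minimal sketch of how a Posit(32,2) bit pattern maps to a real value (sign, regime, two exponent bits, fraction). It is illustrative only: the function name `posit32_to_double` is hypothetical, and the code is neither the SoftPosit API nor the OpenCL kernels in this repository.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Illustrative decoder (hypothetical name; not part of SoftPosit or this
// repository): convert a Posit(32,2) bit pattern to a double.
double posit32_to_double(std::uint32_t bits) {
    if (bits == 0u) return 0.0;                        // the single zero pattern
    if (bits == 0x80000000u)                           // NaR ("not a real")
        return std::numeric_limits<double>::quiet_NaN();

    const bool negative = (bits & 0x80000000u) != 0u;
    if (negative) bits = ~bits + 1u;                   // two's complement yields the magnitude

    std::uint32_t body = bits << 1;                    // drop the sign bit
    const std::uint32_t first = body >> 31;            // leading regime bit
    int run = 0;                                       // length of the regime run
    while (run < 31 && (body >> 31) == first) {
        body <<= 1;
        ++run;
    }
    // A run of m ones encodes k = m - 1; a run of m zeros encodes k = -m.
    const int k = (first != 0u) ? run - 1 : -run;
    if (run < 31) body <<= 1;                          // skip the bit that ends the run

    // es = 2: the next two bits are the exponent (missing bits read as 0).
    const int e = static_cast<int>(body >> 30);
    body <<= 2;

    // Remaining bits are the fraction with an implicit leading 1.
    const double frac = static_cast<double>(body) / 4294967296.0;  // divide by 2^32
    const double magnitude = std::ldexp(1.0 + frac, 4 * k + e);    // useed = 2^(2^es) = 16
    return negative ? -magnitude : magnitude;
}
```

For example, under this decoding the pattern `0x40000000` is 1.0 and `0x38000000` is 0.5. The fraction field is longest near magnitude 1, where the regime is shortest.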

Part of the code is derived from MPLAPACK.

Build Instructions

1. Clone the SoftPosit repository:

    $ git clone https://gitlab.com/cerlane/SoftPosit.git

2. Apply the patch and build SoftPosit:

    $ cd SoftPosit
    $ patch -p1 < ../SoftPosit.patch
    $ cd build/Linux-x86_64-GCC
    $ make
    $ cd ../../..

3. Build the project:

    $ make

Test Programs

All programs use GEMM routines in 32-bit Posit arithmetic. You can specify the blocking size for the GEMM routines through the OPENCL_GEMM_BLOCKSIZE environment variable.

Example of setting the block size to 16:

    $ export OPENCL_GEMM_BLOCKSIZE=16

The performance of all programs can vary slightly depending on the blocking size.
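As a rough illustration of what a blocking size means for GEMM, the sketch below reads `OPENCL_GEMM_BLOCKSIZE` on the host and uses the value to tile a plain CPU matrix multiply. This is an assumption-laden sketch, not the repository's OpenCL kernels: the helper names `block_size_from_env` and `blocked_gemm`, the fallback value, and the use of an ordinary arithmetic type `T` in place of Posit are all hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper: read the block size from OPENCL_GEMM_BLOCKSIZE,
// returning `fallback` when the variable is unset or not a positive integer.
inline std::size_t block_size_from_env(std::size_t fallback) {
    const char* s = std::getenv("OPENCL_GEMM_BLOCKSIZE");
    if (s == nullptr) return fallback;
    char* end = nullptr;
    const long v = std::strtol(s, &end, 10);
    if (end == s || *end != '\0' || v <= 0) return fallback;
    return static_cast<std::size_t>(v);
}

// Hypothetical CPU stand-in for a blocked GEMM: C += A * B for row-major
// n x n matrices, processed in bs x bs tiles. T is any arithmetic type here;
// the repository's kernels use 32-bit Posit values instead.
template <typename T>
void blocked_gemm(std::size_t n, const std::vector<T>& A,
                  const std::vector<T>& B, std::vector<T>& C,
                  std::size_t bs) {
    if (bs == 0) bs = 1;  // guard against an infinite loop on a zero block size
    for (std::size_t ii = 0; ii < n; ii += bs) {
        const std::size_t i_end = std::min(ii + bs, n);
        for (std::size_t kk = 0; kk < n; kk += bs) {
            const std::size_t k_end = std::min(kk + bs, n);
            for (std::size_t jj = 0; jj < n; jj += bs) {
                const std::size_t j_end = std::min(jj + bs, n);
                // Multiply one tile of A by one tile of B into a tile of C.
                for (std::size_t i = ii; i < i_end; ++i) {
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const T a = A[i * n + k];
                        for (std::size_t j = jj; j < j_end; ++j) {
                            C[i * n + j] += a * B[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

A caller might use `blocked_gemm(n, A, B, C, block_size_from_env(16))`. In this particular loop order each element of C accumulates its products in increasing k for any block size, so the block size changes only the memory access pattern, not the result.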

GEMM

run_gemm
run_gemm_trailing

LU decomposition

run_lu
run_lu_bench
run_lu_check
run_lu_power_bench

Cholesky decomposition

run_cho
run_cho_bench
run_cho_check

Reference

@inproceedings{10.1145/3635035.3635046,
  author = {Nakasato, Naohito and Murakami, Yuki and Kono, Fumiya and Nakata, Maho},
  title = {Evaluation of POSIT Arithmetic with Accelerators},
  year = {2024},
  isbn = {9798400708893},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3635035.3635046},
  doi = {10.1145/3635035.3635046},
  booktitle = {Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region},
  pages = {62–72},
  numpages = {11},
  location = {Nagoya, Japan},
  series = {HPCAsia '24}
}

Owner

Name: N.Nakasato
Login: dadeba
Kind: user
Location: Japan
Company: University of Aizu

Repositories: 14
Profile: https://github.com/dadeba

Citation (CITATION.bib)

@inproceedings{10.1145/3635035.3635046,
author = {Nakasato, Naohito and Murakami, Yuki and Kono, Fumiya and Nakata, Maho},
title = {Evaluation of POSIT Arithmetic with Accelerators},
year = {2024},
isbn = {9798400708893},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3635035.3635046},
doi = {10.1145/3635035.3635046},
abstract = {We present an evaluation of 32-bit POSIT arithmetic through its implementation as accelerators on FPGAs and GPUs. POSIT, a floating-point number format, adaptively changes the size of its fractional part. We developed hardware designs for FPGAs and software for GPUs to accelerate linear algebra operations using Posit(32,2) arithmetic. Our FPGA- and GPU-based accelerators in Posit(32,2) arithmetic significantly accelerated the Cholesky and LU decomposition algorithms for dense matrices. In terms of numerical accuracy, Posit(32,2) arithmetic is approximately 0.5 - 1.0 digits more accurate than the standard 32-bit format, especially when the norm of the elements of the input matrix is close to 1. Evaluating power consumption, we observed that the power efficiency of the accelerators ranged between 0.043 - 0.076 Gflops/watts for the LU decomposition in Posit(32,2) arithmetic. The power efficiency of the latest GPUs as accelerators of Posit(32,2) arithmetic is better than that of the evaluated FPGA chip.},
booktitle = {Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region},
pages = {62–72},
numpages = {11},
location = {Nagoya, Japan},
series = {HPCAsia '24}
}
