alpaka

Abstraction Library for Parallel Kernel Acceleration :llama:

https://github.com/alpaka-group/alpaka

Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 7 DOI reference(s) in README
✓ Academic publication links: Links to arxiv.org
✓ Committers with academic emails: 13 of 54 committers (24.1%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.8%) to scientific vocabulary

Keywords

cpp cpp17 cuda gpu header-only heterogeneous-parallel-programming hip hpc openacc openmp rocm tbb

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: alpaka-group
License: mpl-2.0
Language: C++
Default Branch: develop
Homepage: https://alpaka.readthedocs.io
Size: 18.6 MB

Statistics

Stars: 390
Watchers: 21
Forks: 81
Open Issues: 224
Releases: 21

Topics

cpp cpp17 cuda gpu header-only heterogeneous-parallel-programming hip hpc openacc openmp rocm tbb

Created over 11 years ago · Last pushed 6 months ago

Metadata Files

Readme Changelog Contributing License Zenodo

alpaka - Abstraction Library for Parallel Kernel Acceleration

The alpaka library is a header-only C++20 abstraction library for accelerator development.

Its aim is to provide performance portability across accelerators through the abstraction (not hiding!) of the underlying levels of parallelism.

It is platform independent and supports the concurrent and cooperative use of multiple devices such as the host's CPU (x86, ARM, RISC-V and Power 8+) and GPU accelerators from different vendors (NVIDIA, AMD and Intel). A multitude of accelerator back-end variants using CUDA, HIP, SYCL, OpenMP 2.0+, std::thread and also serial execution is provided and can be selected depending on the device. Only one implementation of a user kernel is required, because kernels are represented as function objects with a special interface. There is no need to write special CUDA, HIP, SYCL, OpenMP or custom threading code. Accelerator back-ends can be mixed and synchronized via compute device queues. The decision which accelerator back-end executes which kernel can be made at runtime.
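The single-source idea can be illustrated without alpaka itself. The following is a minimal sketch in plain C++, not alpaka's API: `ToySerialAcc`, `ToyReverseAcc`, `launchSerial`, `launchReverse`, `Backend` and `scale` are hypothetical names invented for this illustration. It shows one kernel, written once as a function object templated on the accelerator type, being executed by two toy back-ends that are chosen at runtime. The second back-end merely walks the work items in reverse order, standing in for a back-end that schedules work differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy accelerators (NOT alpaka types). Each one only carries
// the index of the work item the kernel is currently processing.
struct ToySerialAcc
{
    std::size_t index;
};

struct ToyReverseAcc
{
    std::size_t index;
};

// The kernel is written once, as a function object templated on the
// accelerator type it receives as its first argument.
struct ScaleKernel
{
    template<typename TAcc>
    void operator()(TAcc const& acc, float factor, float* data) const
    {
        data[acc.index] *= factor;
    }
};

// Toy back-end 1: visits the work items in ascending order.
template<typename Kernel, typename... Args>
void launchSerial(std::size_t workItems, Kernel const& kernel, Args... args)
{
    for(std::size_t i = 0; i < workItems; ++i)
        kernel(ToySerialAcc{i}, args...);
}

// Toy back-end 2: visits the work items in descending order, as a stand-in
// for a back-end with a different execution order.
template<typename Kernel, typename... Args>
void launchReverse(std::size_t workItems, Kernel const& kernel, Args... args)
{
    for(std::size_t i = workItems; i > 0; --i)
        kernel(ToyReverseAcc{i - 1}, args...);
}

enum class Backend
{
    Serial,
    Reverse
};

// The back-end is selected at runtime; the kernel source stays untouched.
std::vector<float> scale(Backend backend, std::vector<float> values, float factor)
{
    ScaleKernel const kernel{};
    if(backend == Backend::Serial)
    {
        launchSerial(values.size(), kernel, factor, values.data());
    }
    else
    {
        assert(backend == Backend::Reverse);
        launchReverse(values.size(), kernel, factor, values.data());
    }
    return values;
}
```

Because the kernel never relies on the order in which work items are visited, both toy back-ends produce the same result for the same input.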

The abstraction used is very similar to the CUDA grid-blocks-threads domain decomposition strategy. Algorithms that should be parallelized have to be divided into a multi-dimensional grid consisting of small uniform work items. The functions that process these work items are called kernels and are executed in parallel threads. The threads in the grid are organized in blocks. All threads in a block are executed in parallel and can interact via fast shared memory and low-level synchronization methods. Blocks are executed independently and cannot interact in any way. The block execution order is unspecified and depends on the accelerator in use. By using this abstraction the execution can be optimally adapted to the available hardware.
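The decomposition described above can be sketched in plain C++ for the one-dimensional case. This is a serial simulation under stated assumptions, not alpaka code: `globalIndex` and `blockSums` are hypothetical helpers, the `shared` vector only stands in for block-shared memory, and the loops replace what a real back-end would run in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an alpaka function): maps a (block, thread) pair
// to the global index of the work item that thread is responsible for.
std::size_t globalIndex(std::size_t blockIdx, std::size_t threadsPerBlock, std::size_t threadIdx)
{
    return blockIdx * threadsPerBlock + threadIdx;
}

// Computes one partial sum per block. Blocks never exchange data, so they
// could be processed in any order.
std::vector<float> blockSums(std::vector<float> const& input, std::size_t threadsPerBlock)
{
    assert(threadsPerBlock > 0);

    // Round up so that the grid covers every element of the input.
    std::size_t const blocks = (input.size() + threadsPerBlock - 1) / threadsPerBlock;
    std::vector<float> sums(blocks, 0.0f);

    for(std::size_t block = 0; block < blocks; ++block)
    {
        // Stand-in for block-shared memory: used by this block's threads only.
        std::vector<float> shared(threadsPerBlock, 0.0f);

        // Phase 1: every thread of the block loads one element.
        for(std::size_t thread = 0; thread < threadsPerBlock; ++thread)
        {
            std::size_t const i = globalIndex(block, threadsPerBlock, thread);
            if(i < input.size()) // threads past the end of the data do nothing
                shared[thread] = input[i];
        }

        // A parallel back-end would need a block-level barrier at this point;
        // in this serial simulation the loop boundary plays that role.

        // Phase 2: a single thread combines the values of its block.
        for(float const value : shared)
            sums[block] += value;
    }
    return sums;
}
```

For example, five input elements with two threads per block yield three blocks, the last of which has one idle thread.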

Software License

alpaka is licensed under MPL-2.0.

Documentation

The alpaka documentation can be found in the online manual. The documentation files in .rst (reStructuredText) format are located in the docs subfolder of this repository. The source code documentation is generated with doxygen.

Accelerator Back-ends

| Accelerator Back-end | Lib/API | Devices | Execution strategy grid-blocks | Execution strategy block-threads |
|----------------------|---------|---------|--------------------------------|----------------------------------|
| Serial | n/a | Host CPU (single core) | sequential | sequential (only 1 thread per block) |
| OpenMP 2.0+ blocks | OpenMP 2.0+ | Host CPU (multi core) | parallel (preemptive multitasking) | sequential (only 1 thread per block) |
| OpenMP 2.0+ threads | OpenMP 2.0+ | Host CPU (multi core) | sequential | parallel (preemptive multitasking) |
| std::thread | std::thread | Host CPU (multi core) | sequential | parallel (preemptive multitasking) |
| TBB | TBB 2.2+ | Host CPU (multi core) | parallel (preemptive multitasking) | sequential (only 1 thread per block) |
| CUDA | CUDA 12.0+ | NVIDIA GPUs | parallel (undefined) | parallel (lock-step within warps) |
| HIP(clang) | HIP 6.0+ | AMD GPUs | parallel (undefined) | parallel (lock-step within warps) |
| SYCL(oneAPI) | oneAPI 2024.2+ | CPUs, Intel GPUs and FPGAs | parallel (undefined) | parallel (lock-step within warps) |

Supported Compilers

This library uses C++20 (or newer when available).

| Accelerator Back-end | gcc 11.1 (Linux) | gcc 12.3 (Linux) | gcc 13.1 (Linux) | clang 14 (Linux) | clang 15 (Linux) | clang 16 (Linux) | clang 17 (Linux) | clang 18 (Linux) | clang 19 (Linux) | icpx 2025.0 (Linux) | Xcode 15.4 / 16.1 (macOS) | Visual Studio 2022 (Windows) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Serial | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| OpenMP 2.0+ blocks | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: [^1] | :white_check_mark: | :white_check_mark: |
| OpenMP 2.0+ threads | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: [^1] | :white_check_mark: | :white_check_mark: |
| std::thread | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| TBB | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| CUDA (nvcc) | :white_check_mark: (CUDA 12.0) | :white_check_mark: (CUDA 12.0 - 12.5) | :white_check_mark: (CUDA 12.4 - 12.5) | :white_check_mark: (CUDA 12.0) | :white_check_mark: (CUDA 12.2) | :white_check_mark: (CUDA 12.3) | :white_check_mark: (CUDA 12.4 - 12.5) | :white_check_mark: (CUDA 12.4 - 12.5) | :x: | :x: | - | :x: |
| CUDA (clang) | - | - | - | :x: | :x: | :x: | :x: | :x: | :x: | :x: | - | - |
| HIP (clang) | - | - | - | :x: | :x: | :x: | :white_check_mark: (HIP 6.0 - 6.1) | :white_check_mark: (HIP 6.2) | :x: | :x: | - | - |
| SYCL | :x: | :x: | :x: | :x: | :x: | :x: | :x: | :x: | :x: | :white_check_mark: [^2] | - | :x: |

Other compilers or combinations marked with :x: in the table above may work but are not tested in CI and are therefore not explicitly supported.

[^1]: Due to an LLVM bug in debug mode, only release builds are supported.

[^2]: Currently, the unit tests are compiled but not executed.

Dependencies

The alpaka library itself just requires header-only libraries. However, some of the accelerator back-end implementations require different boost libraries to be built.

When an accelerator back-end using CUDA is enabled, version 12.0 (with nvcc as CUDA compiler) or version 12.0 (with clang as CUDA compiler) of the CUDA SDK is the minimum requirement.

NOTE: When using clang as a native CUDA compiler, the CUDA accelerator back-end cannot be enabled together with any OpenMP accelerator back-end because this combination is currently unsupported.

NOTE: Separable compilation is disabled by default and can be enabled via the CMake flag `CMAKE_CUDA_SEPARABLE_COMPILATION`.

When an accelerator back-end using OpenMP is enabled, the compiler and the platform have to support the corresponding minimum OpenMP version.

When an accelerator back-end using TBB is enabled, the compiler and the platform have to support the corresponding minimum TBB version.

Boost 1.78.0+ is an optional external dependency that is only needed if the C++ standard library in use does not support std::atomic_ref.
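Whether a given standard library provides std::atomic_ref can be checked with the standard feature-test macro `__cpp_lib_atomic_ref`. The sketch below is an illustration only: `hasStdAtomicRef`, `addAtomically` and `countTo` are hypothetical helpers, and the way alpaka itself detects the feature and falls back to Boost may differ. The fallback branch here is a plain addition that is not thread-safe and exists only to keep the sketch compilable everywhere.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: reports whether the standard library in use
// advertises std::atomic_ref through the standard feature-test macro.
constexpr bool hasStdAtomicRef()
{
#if defined(__cpp_lib_atomic_ref)
    return true;
#else
    return false;
#endif
}

// Adds `increment` to `counter` and returns the new value.
int addAtomically(int& counter, int increment)
{
#if defined(__cpp_lib_atomic_ref)
    // std::atomic_ref applies atomic operations to an ordinary object.
    std::atomic_ref<int> ref{counter};
    ref.fetch_add(increment);
#else
    // Fallback for this sketch only: NOT atomic. A real project would use
    // an external atomic_ref implementation (for example from Boost) here.
    counter += increment;
#endif
    return counter;
}

// Small driver: increments a counter n times and checks the result.
int countTo(int n)
{
    int counter = 0;
    for(int i = 0; i < n; ++i)
        addAtomically(counter, 1);
    assert(counter == n);
    return counter;
}
```

Compiling this with a C++20 standard library that supports std::atomic_ref selects the first branch; older toolchains silently take the fallback.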

Usage

The library is header-only, so nothing has to be built. CMake 3.22+ is required to provide the correct defines and include paths. Just call `alpaka_add_executable` instead of `add_executable` and the difficulties of the CUDA nvcc compiler in handling .cu and .cpp files are automatically taken care of. Source files do not need any special file ending. Examples of how to utilize alpaka within CMake can be found in the example folder.

The whole alpaka library can be included with `#include <alpaka/alpaka.hpp>`. Code that is not intended to be utilized by the user is hidden in the `detail` namespace.

Furthermore, for a CUDA-like experience when adopting alpaka we provide the library cupla. It enables a simple and straightforward way of porting existing CUDA applications to alpaka and thus to a variety of accelerators.

Single header

The CI creates a single-header version of alpaka on each commit, which you can find on the single-header branch.

This is especially useful if you would like to play with alpaka on Compiler Explorer. Just include alpaka like

```c++
#include <https://raw.githubusercontent.com/alpaka-group/alpaka/single-header/include/alpaka/alpaka.hpp>
```

and enable the desired backend on the compiler's command line using the corresponding macro, e.g. via `-DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED`.

Introduction

For a quick introduction, feel free to play back the recording of our presentation at GTC 2016:

E. Zenker, R. Widera, G. Juckeland et al., Porting the Plasma Simulation PIConGPU to Heterogeneous Architectures with Alpaka, video link (39 min), slides (PDF), DOI:10.5281/zenodo.6336086

Citing alpaka

Currently all authors of alpaka are scientists or connected with research. For us to justify the importance and impact of our work, please consider citing us accordingly in your derived work and publications:

```latex
% Peer-Reviewed Publication %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Peer reviewed and accepted publication in
%   "2nd International Workshop on Performance Portable
%    Programming Models for Accelerators (P^3MA)"
% colocated with the
%   "2017 ISC High Performance Conference"
%   in Frankfurt, Germany
@inproceedings{MathesP3MA2017,
  author        = {{Matthes}, A. and {Widera}, R. and {Zenker}, E. and {Worpitz}, B. and
                   {Huebl}, A. and {Bussmann}, M.},
  title         = {Tuning and optimization for a variety of many-core architectures without changing a single line of implementation code using the Alpaka library},
  archivePrefix = "arXiv",
  eprint        = {1706.10086},
  keywords      = {Computer Science - Distributed, Parallel, and Cluster Computing},
  day           = {30},
  month         = {Jun},
  year          = {2017},
  url           = {https://arxiv.org/abs/1706.10086},
}

% Peer-Reviewed Publication %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Peer reviewed and accepted publication in
%   "The Sixth International Workshop on
%    Accelerators and Hybrid Exascale Systems (AsHES)"
% at the
%   "30th IEEE International Parallel and Distributed
%    Processing Symposium" in Chicago, IL, USA
@inproceedings{ZenkerAsHES2016,
  author        = {Erik Zenker and Benjamin Worpitz and Ren{\'{e}} Widera
                   and Axel Huebl and Guido Juckeland and
                   Andreas Kn{\"{u}}pfer and Wolfgang E. Nagel and Michael Bussmann},
  title         = {Alpaka - An Abstraction Library for Parallel Kernel Acceleration},
  archivePrefix = "arXiv",
  eprint        = {1602.08477},
  keywords      = {Computer science;CUDA;Mathematical Software;nVidia;OpenMP;Package;
                   performance portability;Portability;Tesla K20;Tesla K80},
  day           = {23},
  month         = {May},
  year          = {2016},
  publisher     = {IEEE Computer Society},
  url           = {http://arxiv.org/abs/1602.08477},
}

% Original Work: Benjamin Worpitz' Master Thesis %%%%%%%%%%
%
@MasterThesis{Worpitz2015,
  author = {Benjamin Worpitz},
  title  = {Investigating performance portability of a highly scalable
            particle-in-cell simulation code on various multi-core
            architectures},
  school = {{Technische Universit{\"{a}}t Dresden}},
  month  = {Sep},
  year   = {2015},
  type   = {Master Thesis},
  doi    = {10.5281/zenodo.49768},
  url    = {http://dx.doi.org/10.5281/zenodo.49768}
}
```

Contributing

Rules for contributions can be found in CONTRIBUTING.md. Any pull request will be reviewed by a maintainer.

Thanks to all active and former contributors.

Owner

Name: alpaka
Login: alpaka-group
Kind: organization
Location: Dresden, Germany

Website: http://www.hzdr.de/crp
Repositories: 9
Profile: https://github.com/alpaka-group

Abstraction Library for Parallel Kernel Acceleration

GitHub Events

Total

Issues event: 41
Watch event: 33
Delete event: 2
Issue comment event: 294
Push event: 92
Pull request review event: 312
Pull request review comment event: 215
Pull request event: 155
Fork event: 7
Create event: 2

Last Year

Issues event: 41
Watch event: 33
Delete event: 2
Issue comment event: 294
Push event: 92
Pull request review event: 312
Pull request review comment event: 215
Pull request event: 155
Fork event: 7
Create event: 2

Committers

Last synced: 9 months ago

All Time

Total Commits: 2,559
Total Committers: 54
Avg Commits per committer: 47.389
Development Distribution Score (DDS): 0.679

Past Year

Commits: 201
Committers: 12
Avg Commits per committer: 16.75
Development Distribution Score (DDS): 0.731

Top Committers

Name	Email	Commits
Benjamin Worpitz	b**z@g**m	821
René Widera	r**a@h**e	298
Bernhard Manfred Gruber	b**r@g**m	280
Jeffrey Kelling	j**g@h**e	233
Andrea Bocci	a**i@c**h	171
Jan Stephan	j**n@h**e	156
Simeon Ehrig	s**g@h**e	109
Sergei Bastrakov	s**v@g**m	89
Axel Huebl	a**l@p**a	61
AuroraPerego	a**o@c**h	46
mehmet yusufoglu	m**1@g**m	39
Matthias Werner	M**1@t**e	32
Tools	a**a@h**e	27
tonydp03	t**3@g**m	27
Jakob	j**e@h**m	24
Erik Zenker	e**r@p**e	17
kloppstock	j**s@g**e	13
Julian Lenz	j**z@h**e	12
Alexander Matthes	z**z@m**g	10
Tapish	n**3@h**e	9
Benjamin Worpitz	b**z@l**m	8
Jiří Vyskočil	j**i@v**m	8
Felice Pantaleo	f**o@c**h	6
Mehmet Yusufoglu	m**t@r**i	6
Matthias Werner	m**1@t**e	6
David M. Rogers	p**h@g**m	5
Luca	f**a@g**m	4
Erik Zenker	e**r@h**m	4
Sven Erdem	s**m@h**e	3
Bert Wesarg	b**g@t**e	3
and 24 more...

Committer Domains (Top 20 + Academic)

hzdr.de: 8
cern.ch: 5
tu-dresden.de: 4
plasma.ninja: 1
posteo.de: 1
groeger-clan.de: 1
mailbox.org: 1
logmein.com: 1
vyskocil.com: 1
rtr.ai: 1
helmholtz-berlin.de: 1
github.de: 1
zom.bi: 1
maven.de: 1
philnash.me: 1
vysko.cz: 1
mailbox.tu-dresden.de: 1
math.sci.hiroshima-u.ac.jp: 1
manieth.com: 1
nvidia.com: 1
gatech.edu: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 181
Total pull requests: 559
Average time to close issues: 8 months
Average time to close pull requests: 20 days
Total issue authors: 21
Total pull request authors: 21
Average comments per issue: 3.41
Average comments per pull request: 2.5
Merged pull requests: 414
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 39
Pull requests: 180
Average time to close issues: 8 days
Average time to close pull requests: 6 days
Issue authors: 10
Pull request authors: 12
Average comments per issue: 1.15
Average comments per pull request: 1.97
Merged pull requests: 117
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

fwyzard (38)
SimeonEhrig (32)
bernhardmgruber (29)
psychocoderHPC (18)
j-stephan (18)
mehmetyusufoglu (7)
BenjaminW3 (6)
jkelling (5)
GNiendorf (5)
chillenzer (4)
AuroraPerego (3)
ichinii (3)
StewMH (3)
krzikalla (2)
kakwok (2)

Pull Request Authors

SimeonEhrig (119)
fwyzard (112)
mehmetyusufoglu (86)
psychocoderHPC (79)
j-stephan (47)
bernhardmgruber (35)
AuroraPerego (35)
ikbuibui (13)
chillenzer (12)
ichinii (6)
sliwowitz (6)
MichaelVarvarin (5)
sbaldu (4)
m-fila (2)
jkelling (2)

Top Labels

Issue Labels

Type:Enhancement (54)
Type:Bug (39)
Type:Testing (29)
Backend:CUDA (26)
Backend:SYCL (20)
Type:Question (19)
Type:Refactoring (19)
Backend:HIP (17)
Type:CMake (15)
Type:Documentation (11)
Backend:OpenMP (5)
Type:Install (5)
Backend:OpenACC (4)
Type:Example (4)
Priority: 1 (3)
OS:macOS (3)
Backend:std::thread (3)
Backend:TBB (3)
Backend:Serial (2)
OS:Linux (2)
Priority: 2 (1)
State:Duplicate (1)
Type:Relicense (1)
Release:Optional (1)
Backend:Boost.Fiber (1)
OS:Windows (1)
State:Wontfix (1)
Type:Machine/System (1)
State:Help Wanted (1)

Pull Request Labels

Type:Testing (148)
Type:Bug (131)
Type:Enhancement (117)
Type:Refactoring (89)
Backend:SYCL (67)
Type:Documentation (63)
Backend:CUDA (47)
Backend:HIP (35)
Type:Example (34)
Type:CMake (31)
Backend:OpenMP (21)
Backend:std::thread (12)
Backend:Serial (12)
Type:Warning (9)
OS:Windows (8)
Type:Install (8)
Backend:TBB (8)
OS:Linux (7)
OS:macOS (6)
Backend:OpenACC (3)
Backend:Boost.Fiber (2)
State:Help Wanted (2)
State:Work In Progress (2)

Packages

Total packages: 1
Total downloads: unknown

Total dependent packages: 1
Total dependent repositories: 0
Total versions: 6
Total maintainers: 1

spack.io: alpaka

Abstraction Library for Parallel Kernel Acceleration.

Homepage: https://github.com/alpaka-group/alpaka
License: []
Latest release: 0.8.0
published almost 4 years ago

Versions: 6
Dependent Packages: 1
Dependent Repositories: 0

Rankings

Dependent repos count: 0.0%

Average: 14.0%

Dependent packages count: 28.1%

Maintainers (1)

vvolkl

Last synced: 6 months ago

Dependencies

.github/workflows/ci.yml actions

DoozyX/clang-format-lint-action v0.14 composite
actions/checkout v3 composite
actions/upload-artifact v3 composite

.github/workflows/gh-pages.yml actions

actions/checkout v3 composite

docs/requirements.txt pypi

Jinja2 <3.0
breathe ==4.16.0
markupsafe <2.0.0
pygments *
rst2pdf *
sphinx ==3.0.3
sphinx_rtd_theme >=0.3.1
sphinxcontrib.programoutput *

script/job_generator/requirements.txt pypi

allpairspy ==2.5.0
alpaka-job-coverage >=1.2.1
pyaml *
typeguard *
types-PyYAML *

.github/workflows/single-header.yml actions

actions/checkout v3 composite
