matx

An efficient C++17 GPU numerical computing library with Python-like syntax

https://github.com/nvidia/matx

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (17.1%) to scientific vocabulary

Keywords

cuda gpgpu gpu gpu-computing hpc

Last synced: 6 months ago

Repository

An efficient C++17 GPU numerical computing library with Python-like syntax

Basic Info

Host: GitHub
Owner: NVIDIA
License: bsd-3-clause
Language: C++
Default Branch: main
Homepage: https://nvidia.github.io/MatX
Size: 20.7 MB

Statistics

Stars: 1,350
Watchers: 25
Forks: 104
Open Issues: 46
Releases: 18

Topics

cuda gpgpu gpu gpu-computing hpc

Created over 4 years ago · Last pushed 6 months ago

Metadata Files

Readme Contributing License Code of conduct Citation

MatX - GPU-Accelerated Numerical Computing in Modern C++

MatX is a modern C++ library for numerical computing on NVIDIA GPUs and CPUs. Near-native performance can be achieved while using a simple syntax common in higher-level languages such as Python or MATLAB.

FFT resampler

The above image shows the Python (NumPy) version of an FFT resampler next to the MatX version. The total runtimes of the NumPy version, CuPy version, and MatX version are shown below:

Python/NumPy: 5360ms (Xeon(R) CPU E5-2698 v4 @ 2.20GHz)
CuPy: 10.6ms (A100)
MatX: 2.54ms (A100)

While the code complexity and length are roughly the same, the MatX version is about 2100x faster than the NumPy version and over 4x faster than the CuPy version on the same GPU.
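The speedup figures follow directly from the three runtimes listed above. A minimal sketch of that arithmetic, where the `speedup` helper and the constant names are illustrative and not part of MatX:

```cpp
#include <cassert>

// Speedup of a candidate over a baseline, both measured in milliseconds.
inline double speedup(double baseline_ms, double candidate_ms) {
    assert(candidate_ms > 0.0);
    return baseline_ms / candidate_ms;
}

// Runtimes quoted in the comparison above.
constexpr double kNumpyMs = 5360.0;  // Xeon E5-2698 v4
constexpr double kCupyMs  = 10.6;    // A100
constexpr double kMatxMs  = 2.54;    // A100
```

Calling `speedup(kNumpyMs, kMatxMs)` gives a ratio a little above 2100, and `speedup(kCupyMs, kMatxMs)` gives a ratio a little above 4.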

Key features include:

:zap: MatX is fast. By using existing, optimized libraries as a backend, and efficient kernel generation when needed, no hand-optimizations are necessary
:open_hands: MatX is easy to learn. Users familiar with high-level languages will pick up the syntax quickly
:bookmark_tabs: MatX easily integrates with existing libraries and code
:sparkler: Visualize data from the GPU right on a web browser
:arrow_up_down: IO capabilities for reading/writing files

Requirements
Installation
- Building MatX
- Integrating MatX With Your Own Projects
Documentation
- Supported Data Types
Unit Tests
Quick Start Guide
Release History
Filing Issues
Contributing Guide

Requirements

MatX support is currently limited to Linux because of the time required to test on Windows. If you'd like to voice your support for native Windows support using Visual Studio, please comment on the issue here: https://github.com/NVIDIA/MatX/issues/153.

Note: CUDA 12.0.0 through 12.2.0 have an issue that causes building MatX unit tests to show a compiler error or cause a segfault in the compiler. Please use CUDA 11.8 or CUDA 12.2.1+ with MatX.

MatX uses features from C++17 and the latest CUDA compilers and libraries. For this reason, when running with GPU support, CUDA 11.8 or newer is required along with g++ 9, nvc++ 24.5, or clang 17 (or newer). You can download the CUDA Toolkit here.
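The CUDA version constraints above (at least 11.8, but not 12.0.0 through 12.2.0) can be expressed as a small check. This is only a sketch: `CudaVersion`, `in_affected_range`, and `is_usable_for_matx` are hypothetical names, not part of MatX or the CUDA Toolkit.

```cpp
#include <tuple>

// Hypothetical helper type for illustration; not a MatX or CUDA API.
struct CudaVersion {
    int major;
    int minor;
    int patch;
};

// Lexicographic ordering on (major, minor, patch).
inline bool operator<(const CudaVersion& a, const CudaVersion& b) {
    return std::tie(a.major, a.minor, a.patch) <
           std::tie(b.major, b.minor, b.patch);
}

// True for releases in the affected range 12.0.0 through 12.2.0, inclusive.
inline bool in_affected_range(const CudaVersion& v) {
    const CudaVersion first{12, 0, 0};
    const CudaVersion last{12, 2, 0};
    return !(v < first) && !(last < v);
}

// True when the toolkit is at least 11.8 and outside the affected range.
inline bool is_usable_for_matx(const CudaVersion& v) {
    const CudaVersion minimum{11, 8, 0};
    return !(v < minimum) && !in_affected_range(v);
}
```

Under these assumptions, 11.8.0 and 12.2.1 pass the check, while 12.1.0 and 12.2.0 do not.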

MatX has been tested on and supports Volta, Ampere, Ada, Hopper, and Blackwell GPU architectures. Jetson products are supported with Jetpack 5.0 or above.

The MatX build system when used with CMake will automatically fetch packages from the internet that are missing or out of date. If you are on a machine without internet access or want to manage the packages yourself, please follow the offline instructions and pay attention to the required versions of the dependencies.

Note for CPU/Host support: CPU/Host execution support is nearly on par with GPU support. Currently all elementwise operators, reductions, and FFT/BLAS/LAPACK transforms are supported. Most host functions, with the exception of reductions, support multithreading. If you find a bug in an operator on CPU, please report it as an issue. More detail can be found in the documentation.

Installation

MatX is a header-only library that does not need to be compiled before it is used in your applications. However, the unit tests, benchmarks, and examples must be compiled. CPM is used as a package manager for CMake to download and configure any dependencies. If MatX is to be used in an air-gapped environment, CPM can be configured to search locally for files. Depending on which options are enabled, compiling can take a long time without parallelism. Passing the -j flag to make with the highest job count your system will accommodate is suggested.

Building MatX

To build all components, issue the standard cmake build commands in a cloned repo:

    mkdir build && cd build
    cmake -DMATX_BUILD_TESTS=ON -DMATX_BUILD_BENCHMARKS=ON -DMATX_BUILD_EXAMPLES=ON -DMATX_BUILD_DOCS=OFF ..
    make -j

By default CMake will target the GPU architecture(s) of the system you're compiling on. If you wish to target other architectures, pass the CMAKE_CUDA_ARCHITECTURES flag with a list of architectures to build for:

    cmake .. -DCMAKE_CUDA_ARCHITECTURES="80;90"

By default nothing is compiled. If you wish to compile certain options, use the CMake flags below with ON or OFF values:

- MATX_BUILD_TESTS
- MATX_BUILD_BENCHMARKS
- MATX_BUILD_EXAMPLES
- MATX_BUILD_DOCS

For example, to enable unit test building:

    mkdir build && cd build
    cmake -DMATX_BUILD_TESTS=ON ..
    make -j

Integrating MatX With Your Own Projects

MatX uses CMake as a first-class build generator, and therefore provides the proper config files to include into your own project. There are typically two ways to do this:

1. Adding MatX as a subdirectory
2. Installing MatX to the system

1. MatX as a Subdirectory

Adding the subdirectory is useful if you include the MatX source into the directory structure of your project. Using this method, you can simply add the MatX directory:

    add_subdirectory(path/to/matx)

An example of using this method can be found in the examples/cmake_sample_project directory.

2. MatX Installed to the System

The other option is to install MatX and use the configuration file provided after building. This is typically done in a way similar to what is shown below:

    cd /path/to/matx
    mkdir build && cd build
    cmake ..
    make && make install

If you have the correct permissions, the headers and cmake packages will be installed on your system in the expected paths for your operating system. With the package installed you can use find_package as follows:

    find_package(matx CONFIG REQUIRED)

MatX CMake Targets

Once either of the two methods above is done, you can use the transitive target matx::matx in your library inside of target_link_libraries, e.g.:

    target_link_libraries(MyProject matx::matx)

MatX may add other optional targets in the future inside the matx:: namespace as well.

Documentation

Documentation for MatX can be built locally as shown above with the -DMATX_BUILD_DOCS=ON CMake flag. Building documentation requires the following to be installed: doxygen, breathe, sphinx, sphinx-rtd-theme, libjs-mathjax, texlive-font-utils, flex, bison

Current documentation can be found here
A quick start guide can be found here
Current library executor support is listed here
A conversion from MATLAB and Python syntax is found here
A self-guided Jupyter notebook training can be found here

MatX uses semantic versioning and reserves the right to introduce breaking API changes on major releases.
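As a rough illustration of what the versioning policy means for downstream code, the sketch below parses release tags of the form used in the release history (such as v0.9.1) and flags a change in the major component. `SemVer`, `parse_semver`, and `is_major_upgrade` are hypothetical helpers, and pre-release or build suffixes are deliberately rejected rather than handled.

```cpp
#include <cstdio>
#include <optional>
#include <string>

// Hypothetical helper type for illustration; not part of MatX.
struct SemVer {
    int major = 0;
    int minor = 0;
    int patch = 0;
};

// Parses tags such as "v0.9.1" or "0.9.1"; returns nullopt on anything else.
inline std::optional<SemVer> parse_semver(const std::string& tag) {
    const char* s = tag.c_str();
    if (*s == 'v') {
        ++s;  // Skip an optional leading 'v'.
    }
    SemVer v;
    char trailing = '\0';
    // A fourth conversion succeeding means there was unexpected trailing text.
    const int matched = std::sscanf(s, "%d.%d.%d%c",
                                    &v.major, &v.minor, &v.patch, &trailing);
    if (matched != 3) {
        return std::nullopt;
    }
    return v;
}

// True when moving from `from` to `to` crosses a major release boundary,
// which is where breaking API changes may be introduced.
inline bool is_major_upgrade(const SemVer& from, const SemVer& to) {
    return to.major > from.major;
}
```

A build script could use such a check to warn before pulling in a release that may require source changes.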

Supported Data Types

MatX supports all types that use standard C++ operators for math (+, -, etc). Unit tests are run against all common types shown below.

Integer: int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t
Floating Point: matxFp16 (fp16), matxBf16 (bfloat16), float, double
Complex: matxFp16Complex, matxBf16Complex, cuda::std::complex<float>, cuda::std::complex<double>

Since CUDA half precision types (__half and __nv_bfloat16) do not support all C++ operators on the host side, MatX provides the matxFp16 and matxBf16 types for scalars, and matxFp16Complex and matxBf16Complex for complex types. These wrappers are needed so that tensor views can be evaluated on both the host and device, regardless of CUDA or hardware support. When possible, the half types will use hardware-accelerated intrinsics automatically. Existing code using __half and __nv_bfloat16 may be converted to the matx equivalent types directly and leverage all operators.
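The idea behind such wrappers can be sketched in plain host C++: store the 16-bit pattern, and implement operators by widening to float. The `ToyBf16` class below is a hypothetical toy for illustration only. It is not MatX's matxBf16, it truncates instead of rounding, and it assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Toy bfloat16 wrapper (illustrative only; not MatX's matxBf16).
class ToyBf16 {
public:
    ToyBf16() = default;
    explicit ToyBf16(float value) : bits_(float_to_bits(value)) {}

    // Widen to float by placing the stored bits in the upper 16 bits.
    explicit operator float() const {
        const std::uint32_t wide = static_cast<std::uint32_t>(bits_) << 16;
        float out;
        std::memcpy(&out, &wide, sizeof(out));
        return out;
    }

    // Operators compute in float and narrow the result again, so the type
    // works on the host without any hardware half-precision support.
    friend ToyBf16 operator+(ToyBf16 a, ToyBf16 b) {
        return ToyBf16(static_cast<float>(a) + static_cast<float>(b));
    }
    friend ToyBf16 operator*(ToyBf16 a, ToyBf16 b) {
        return ToyBf16(static_cast<float>(a) * static_cast<float>(b));
    }
    friend bool operator==(ToyBf16 a, ToyBf16 b) {
        return static_cast<float>(a) == static_cast<float>(b);
    }

private:
    // Truncating conversion: keep only the upper 16 bits of the float.
    static std::uint16_t float_to_bits(float value) {
        std::uint32_t wide;
        std::memcpy(&wide, &value, sizeof(wide));
        return static_cast<std::uint16_t>(wide >> 16);
    }

    std::uint16_t bits_ = 0;
};
```

With this toy type, `ToyBf16(1.5f) + ToyBf16(2.25f)` converts back to exactly 3.75f because all three values fit in bfloat16's mantissa. A production wrapper would additionally round correctly and dispatch to hardware intrinsics where available.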

Unit Tests

MatX contains a suite of unit tests that cover the primitive functions, plus end-to-end tests of example code. MatX uses pybind11 to generate some of the unit test inputs and outputs. This avoids the need to store large test vector files in git; the vectors are instead generated as needed.

To run the unit tests, from the cmake build directory run:

    make -j test

This will execute all unit tests defined. It is also possible to build and execute a single test, for example:

    make test_00_operators_interp_test
    test/test_00_operators_interp_test

To run a subset of tests, it is possible to use ctest from inside the build/test directory. For example, to run only tests with the name FFT:

    cd build/test
    ctest -R "FFT"
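The -R option selects tests whose names match a regular expression anywhere in the name. The sketch below mimics that selection with std::regex, which is close enough for simple patterns like the one above even though ctest uses its own regex engine. The `select_tests` helper and the test names in the usage note are made up for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the names that contain a match for `pattern`, in input order.
// Illustrative stand-in for name-based test selection; not ctest itself.
inline std::vector<std::string> select_tests(
        const std::vector<std::string>& names, const std::string& pattern) {
    const std::regex re(pattern);
    std::vector<std::string> selected;
    for (const auto& name : names) {
        // regex_search matches anywhere in the name, not only the full string.
        if (std::regex_search(name, re)) {
            selected.push_back(name);
        }
    }
    return selected;
}
```

Given the hypothetical names `FFTTest.Forward1D`, `MatMulTest.Basic`, and `ConvTest.FFTBased`, the pattern `FFT` selects the first and third.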

Quick Start Guide

We provide a variety of training materials and examples to quickly learn the MatX API.

- A quick start guide can be found in the docs directory or from the main documentation site. The MatX quick start guide is modeled after NumPy's and demonstrates how to manipulate and create tensors.
- A set of MatX notebooks can be found in the docs directory. These four notebooks walk through the major MatX features and allow the developer to practice writing MatX code with guided examples and questions.
- Finally, for new MatX developers, browsing the example applications can provide familiarity with the API and best practices.

Release Major Features

v0.9.1:
- New operators: argminmax, dense2sparse, sparse2dense, interp1, normalize, argsort
- Removed requirement for --relaxed-constexpr
- Added MatX NVTX domain
- Significantly improved speed of svd and inv
- Python integration sample
- Experimental sparse tensor support (SpMM and solver routines supported)
- Significantly reduced FFT memory usage

v0.9.0:
- Features
  * Full CPU support for both ARM and x86 on all solver, BLAS, and FFT functions, including multi-threaded support
  * New vectornorm and matrixnorm operators
- Bug fixes
  * Many host and device compiler fixes and workarounds
  * Performance improvements in nested transforms

v0.8.0:
- Features
  * Updated cuTENSOR and cuTensorNet versions
  * Added configurable print formatting
  * ARM FFT support via NVPL
  * New operators: abs2(), outer(), isnan(), isinf()
  * Many more unit tests for CPU tests
- Bug fixes for matmul on Hopper, 2D FFTs, and more

v0.7.0:
- Features
  * Automatic documentation generation
  * Use CCCL instead of CUB/libcudac++
  * New operators: polyval, matvec
  * Improved caching and teardown of transforms
  * Optimized polyphase resampler
  * Negative slice indexing
- Many new bug fixes and error checking

Discussions

We have an open discussions board here. We encourage any questions about the library to be posted here for other users to learn from and read through.

Filing Issues

We welcome and encourage the creation of issues against MatX. When creating a new issue, please use the following syntax in the title of your submission to help us prioritize responses and planned work.

* Bug Report: Prepend [BUG] to the issue title, e.g. [BUG] MatX fails to build on P100 GPU
* Documentation Request: Prepend [DOC] to the issue title
* Feature Request: Prepend [FEA] to the issue title
* Submit a Question: Prepend [QST] to the issue title
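The title convention is easy to check mechanically, for example in a triage script. A minimal sketch, where `has_issue_tag` is a hypothetical helper and not something the MatX project provides:

```cpp
#include <string_view>

// True when the title starts with one of the four tags listed above.
// Hypothetical helper for illustration; not part of MatX tooling.
inline bool has_issue_tag(std::string_view title) {
    constexpr std::string_view tags[] = {"[BUG]", "[DOC]", "[FEA]", "[QST]"};
    for (std::string_view tag : tags) {
        // substr clamps the length, so short titles are handled safely.
        if (title.substr(0, tag.size()) == tag) {
            return true;
        }
    }
    return false;
}
```

The check is case-sensitive and only looks at the start of the title, so a tag placed at the end of the title is not accepted.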

As with all issues, please be as verbose as possible and, if relevant, include a test script that demonstrates the bug or expected behavior. It's also helpful if you provide environment details about your system (bare-metal, cloud GPU, etc).

Contributing Guide

Please review the CONTRIBUTING.md file for information on how to contribute code and issues to MatX. We require all pull requests to have a linear history and rebase to main before merge.

Owner

Name: NVIDIA Corporation
Login: NVIDIA
Kind: organization
Location: 2788 San Tomas Expressway, Santa Clara, CA, 95051

Website: https://nvidia.com
Repositories: 342
Profile: https://github.com/NVIDIA

Citation (CITATION.cff)

cff-version: 1.2.0
message: "Thank you for using MatX. Please cite it in your work as described below."
authors:
- family-names: "Burdick"
  given-names: "Cliff"
  orcid: "https://orcid.org/0000-0002-7860-0570"
- family-names: "Luitjens"
  given-names: "Justin"
  orcid: "https://orcid.org/0000-0002-8787-8785"
- family-names: "Thompson"
  given-names: "Adam"
  orcid: "https://orcid.org/0000-0001-9690-6357"
title: "MatX Primitives Library for GPU-Accelerated Numerical Computing in C++"
version: 0.6.0
date-released: 2023-10-02
url: "https://github.com/NVIDIA/matx"

Committers

Last synced: 12 months ago

All Time

Total Commits: 839
Total Committers: 31
Avg Commits per committer: 27.065
Development Distribution Score (DDS): 0.378

Past Year

Commits: 259
Committers: 16
Avg Commits per committer: 16.188
Development Distribution Score (DDS): 0.456

Top Committers

Name	Email	Commits
Cliff Burdick	3****k	522
Justin Luitjens	l****s	86
tbensonatl	1****l	45
Tim Martin	3****h	34
Aart Bik	a**k@n**m	33
Tyler Allen	1****a	33
Aayush Gupta	1****5	18
Jonathan Wong	1****g	12
AtomicVar	g**1@f**m	11
Adam Thompson	3****p	9
kshitij12345	k**r@g**m	7
Allison Piper	a**6@g**m	4
Hugo Phibbs	8****s	3
Mike Mullen	9****n	3
Pierre Kestener	p****e	2
eschmidt-nvidia	9****a	2
hugo-syn	6****n	1
bhaskarrakshit	1****t	1
Yaraslau	4****t	1
Sergei Nikolaev	d****v	1
Leo Fang	l**2@g**m	1
Julien Jomier	2****r	1
Julien Bernard	r**b@g**m	1
Gabriel Wu	q**4@1**m	1
David Gardner	9****v	1
Daniel Galvez	g****v	1
Christopher Harris	x**a@g**m	1
Bryce Adelstein Lelbach aka wash	b**h@g**m	1
Boris Bonev	b**s@g**m	1
Ben Barsdell	b**l@g**m	1
and 1 more...
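The all-time summary figures can be reproduced from the totals and the committer table. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is an assumption, although it is consistent with the all-time numbers shown here. The function names are illustrative.

```cpp
#include <cassert>

// Mean number of commits per committer.
inline double average_commits(int total_commits, int committers) {
    assert(committers > 0);
    return static_cast<double>(total_commits) / committers;
}

// Assumed definition: 1 - (top committer's commits / total commits).
inline double development_distribution_score(int top_committer_commits,
                                              int total_commits) {
    assert(total_commits > 0);
    return 1.0 - static_cast<double>(top_committer_commits) / total_commits;
}
```

With 839 total commits, 31 committers, and 522 commits from the top committer, these give roughly 27.06 commits per committer and a score of roughly 0.378.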

Committer Domains (Top 20 + Academic)

plasma.ninja: 1 126.com: 1 foxmail.com: 1 nvidia.com: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 112
Total pull requests: 597
Average time to close issues: 2 months
Average time to close pull requests: 6 days
Total issue authors: 27
Total pull request authors: 25
Average comments per issue: 2.53
Average comments per pull request: 2.37
Merged pull requests: 483
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 46
Pull requests: 397
Average time to close issues: 9 days
Average time to close pull requests: 2 days
Issue authors: 17
Pull request authors: 15
Average comments per issue: 1.2
Average comments per pull request: 2.31
Merged pull requests: 328
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

cliffburdick (30)
HugoPhibbs (18)
tmartin-gh (12)
deanljohnson (9)
luitjens (7)
tylera-nvidia (7)
lucifer1004 (5)
mfzmullen (3)
raplonu (2)
jwmelto (2)
cenwangumass (1)
turbotage (1)
DanaKimball-VTS (1)
simonbyrne (1)
a5xwin (1)

Pull Request Authors

cliffburdick (254)
aartbik (119)
tmartin-gh (53)
tbensonatl (33)
aayushg55 (31)
simonbyrne (16)
tylera-nvidia (16)
nvjonwong (15)
luitjens (12)
alliepiper (9)
mfzmullen (8)
HugoPhibbs (8)
dylan-eustice (3)
ahmedhus22 (3)
AtomicVar (3)

Top Labels

Issue Labels

bug (5) enhancement (1) good first issue (1)

Pull Request Labels

enhancement (4) bug (2)

Dependencies

.github/workflows/blossom-ci.yml actions

NVIDIA/blossom-action main composite
actions/checkout v2 composite

cmake/rapids-cmake/rapids-cmake/cpm/versions.json meteor

.github/workflows/build-docs.yml actions

actions/checkout v3 composite
actions/configure-pages v3 composite
actions/deploy-pages v2 composite
actions/upload-pages-artifact v2 composite
