https://github.com/anjiang-wei/ptx_dataset

Repository

Basic Info

Host: GitHub
Owner: Anjiang-Wei
Language: Python
Default Branch: main
Size: 1.91 MB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Created 10 months ago · Last pushed 10 months ago


PTX_dataset

Mirage

Example Equivalent CUDA code

CUDA folder

Example Equivalent PTX code (all equivalent)

PTX folder

All of these PTX files are functionally equivalent; they come from different schedules explored by superoptimization.

Generation method

First, generate CUDA code from the saved schedule for the GQA kernel, following the AE doc:

`python3 $MIRAGE_ROOT/benchmark/group_query_attention.py --file $MIRAGE_ROOT/benchmark/saved_mugraphs/gqa_bs1.json`

Then lower to PTX with this script
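The lowering step can be sketched as below. This is a minimal illustration, not the repository's actual script: the function names, the `ptx` output directory, and the example path are assumptions, and Mirage-generated code may need extra include paths that are not shown here. It relies only on the standard `nvcc -ptx` and `-arch` options.

```python
import subprocess
from pathlib import Path


def nvcc_ptx_command(cu_file, arch="80", out_dir="ptx"):
    """Build the nvcc command line that lowers one .cu file to PTX.

    The output layout (out_dir/<stem>.ptx) is an assumption for illustration.
    """
    cu_path = Path(cu_file)
    ptx_path = Path(out_dir) / (cu_path.stem + ".ptx")
    return [
        "nvcc",
        "-ptx",              # stop after PTX generation
        f"-arch=sm_{arch}",  # target architecture, e.g. sm_80
        str(cu_path),
        "-o",
        str(ptx_path),
    ]


def lower_to_ptx(cu_file, arch="80", out_dir="ptx"):
    """Run nvcc on one file; requires the CUDA toolkit on PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cmd = nvcc_ptx_command(cu_file, arch, out_dir)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd[-1]  # path of the generated .ptx file
```

For example, `lower_to_ptx("cuda/gqa.cu")` (a hypothetical path) would write `ptx/gqa.ptx` if compilation succeeds.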

Cutlass

Example Equivalent Pair of PTX

GEMM 1, GEMM 2

Example Generated CUDA code

CUDA Folder

Some of these may be matrix multiplications with a transposed operand; check the filenames to tell which.
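One way to check the filenames programmatically is sketched below. It assumes the files follow the CUTLASS kernel naming scheme, where a two-letter layout code (A operand, then B operand) appears just before the `_align<N>` suffix, with `n` meaning column-major and `t` meaning row-major. The function name and the example filename are hypothetical, and files that do not match the pattern simply return `None`.

```python
import re

# Assumption: names look like "..._tn_align4.cu", i.e. a two-letter layout
# code (A layout, then B layout) directly before the "_align<N>" suffix.
_LAYOUT_RE = re.compile(r"_([nt])([nt])_align\d+")


def operand_layouts(filename):
    """Return (A layout, B layout) as 'n'/'t' letters, or None if not found."""
    match = _LAYOUT_RE.search(filename)
    if match is None:
        return None
    return match.group(1), match.group(2)


def has_row_major_operand(filename):
    """True if either operand uses the 't' (row-major) layout code."""
    layouts = operand_layouts(filename)
    return layouts is not None and "t" in layouts
```

With a hypothetical name such as `cutlass_tensorop_s1688gemm_128x128_16x3_tn_align4.cu`, `operand_layouts` returns `("t", "n")`.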

Example PTX code

PTX Folder

Generation method

When the Cutlass profiler is built, many kernel templates are instantiated with different parameters. At runtime, the profiler can then search across these equivalent versions to find the best configuration. See https://github.com/Anjiang-Wei/cutlassptx/blob/main/media/docs/cpp/profiler.md

```
mkdir build
cd build
cmake .. -DCUTLASS_NVCC_ARCHS="80" -DCUTLASS_LIBRARY_KERNELS=gemm -DCUTLASS_UNITY_BUILD_ENABLED=ON
make cutlass_profiler -j
```

During compilation, the `.cu` files are saved in `build/tools/library/generated/gemm`. I then wrote a script to compile those `.cu` files into PTX.

Usage: run `./generate_ptx.py -j 20 --arch 80 -v` from the `build` directory.
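For readers without access to the script, a batch compile of this kind might look roughly like the sketch below. It is not the actual `generate_ptx.py`: only the `-j`, `--arch`, and `-v` flags come from the usage line above, while `--src`, `--out`, `-I`, and all function names are assumptions made for illustration. The generated CUTLASS sources will also need the appropriate CUTLASS include directories passed via `-I`.

```python
import argparse
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def parse_args(argv=None):
    p = argparse.ArgumentParser(description="Compile generated .cu files to PTX")
    p.add_argument("-j", "--jobs", type=int, default=1, help="parallel nvcc jobs")
    p.add_argument("--arch", default="80", help="SM architecture, e.g. 80")
    p.add_argument("-v", "--verbose", action="store_true")
    # Hypothetical options, not taken from the original script:
    p.add_argument("--src", default="tools/library/generated/gemm")
    p.add_argument("--out", default="ptx")
    p.add_argument("-I", dest="includes", action="append", default=[])
    return p.parse_args(argv)


def build_command(cu, args):
    """Assemble the nvcc invocation for one source file."""
    cmd = ["nvcc", "-ptx", f"-arch=sm_{args.arch}"]
    for inc in args.includes:
        cmd += ["-I", inc]
    out = Path(args.out) / (cu.stem + ".ptx")
    return cmd + [str(cu), "-o", str(out)]


def compile_one(cu, args):
    """Compile one file and return (source path, nvcc return code)."""
    cmd = build_command(cu, args)
    if args.verbose:
        print(" ".join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True)
    return cu, result.returncode


def main(argv=None):
    args = parse_args(argv)
    Path(args.out).mkdir(parents=True, exist_ok=True)
    sources = sorted(Path(args.src).rglob("*.cu"))
    # Threads are enough here: each worker just waits on an nvcc subprocess.
    with ThreadPoolExecutor(max_workers=args.jobs) as pool:
        results = list(pool.map(lambda cu: compile_one(cu, args), sources))
    failed = [str(cu) for cu, rc in results if rc != 0]
    print(f"{len(sources) - len(failed)} compiled, {len(failed)} failed")
    return 1 if failed else 0
```

Calling `main(["-j", "20", "--arch", "80", "-v"])` from the `build` directory would mirror the usage line above; adding the usual `if __name__ == "__main__":` guard turns the sketch into a standalone script.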