gradbench
Benchmarks for differentiable programming across languages and domains.
Science Score: 54.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: gradbench
- License: MIT
- Language: C++
- Default Branch: main
- Homepage: https://gradben.ch
- Size: 834 MB
Statistics
- Stars: 47
- Watchers: 5
- Forks: 8
- Open Issues: 120
- Releases: 0
Metadata Files
README.md
GradBench
GradBench is a benchmark suite for differentiable programming across languages and domains.
See https://gradben.ch for interactive performance charts generated from our latest nightly build. Here's a static preview of the overview table on the website, where rows are evals and columns are tools.
- A grey cell means the tool did not successfully complete that eval.
- A white cell means the tool is slow for that eval.
- A blue cell means the tool is fast for that eval.
Motivation
Automatic differentiation (often shortened as "AD" or "autodiff") and differentiable programming allow a programmer to write code to compute a mathematical function, and then automatically provide code to compute the derivative of that function. These techniques are currently ubiquitous in machine learning, but are broadly applicable in a much larger set of domains in scientific computing and beyond. Many autodiff tools exist, for many different programming languages, with varying feature sets and performance characteristics.
This project exists to facilitate quantitative comparison of the absolute and relative performance of different autodiff tools. There is some related work in this space:
- The 2016 paper "Efficient Implementation of a Higher-Order Language with Built-In AD" by Siskind and Pearlmutter links to two benchmarks implemented for a variety of tools, mostly in Scheme.
- ADBench was an autodiff benchmark suite, active around 2018-2019, but is now archived as of summer 2024.
- cmpad is an autodiff comparison package for C++ and Python.
The evals in GradBench are a strict superset of all those benchmarks. What really sets this project apart is the focus on supporting tools for many different programming languages in an easily extensible way. We achieve this by packaging each eval and tool into its own Docker image, and running benchmarks by having the eval and tool talk to each other over a common JSON-based protocol. We also make our benchmarks and data as easily accessible as possible, via nightly builds that publish our Docker images and run every eval against every tool to generate performance charts on the GradBench website.
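To make the JSON-based protocol concrete, here is a toy sketch (not the real GradBench implementation) of the eval/tool exchange: the eval emits one JSON message per line, and the tool answers each with a JSON response carrying the same `id`. The message kinds and field names below are assumptions modelled on the log excerpts shown later in this README.

```python
import json

def toy_tool(message):
    # A real tool would build a module on a "define" message and run the
    # requested function on an "evaluate" message; this one only acknowledges
    # each message by echoing its id.
    return {"id": message["id"]}

# Hypothetical messages an eval might send, one JSON object per line.
eval_messages = [
    {"id": 0, "kind": "start", "eval": "hello"},
    {"id": 1, "kind": "define", "module": "hello"},
    {"id": 2, "kind": "evaluate", "function": "square", "input": 1.0},
]

for msg in eval_messages:
    line = json.dumps(msg)             # what the eval writes on its stdout
    resp = toy_tool(json.loads(line))  # what the tool writes back
    assert resp["id"] == msg["id"]
```

Packaging both sides behind a line-oriented protocol like this is what lets evals and tools be written in entirely different languages.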
Usage
If you haven't already, take a look at the website! We generate daily charts visualizing the performance of all the different tools (columns) on all the different evals (rows). You can click on the name of a specific eval to see more detailed charts plotting the performance of each tool on that eval across a variety of different workload sizes.
To go beyond just the data that has already been generated, here are instructions on how to run the benchmarks yourself.
Running GradBench locally
If you'd like to run GradBench locally using this Git repository, first clone it; for instance, if you have the GitHub CLI installed:
```sh
gh repo clone gradbench/gradbench
cd gradbench
```
Make sure you have the following tools available on your system:
All the command-line scripts for working with GradBench are packaged into the
GradBench CLI, which you can run using the ./gradbench script
at the root of this repository. For example, you can use the following command
to run PyTorch on our simplest eval:
```sh
./gradbench repo run --eval hello --tool pytorch -o run
```
You should see a bunch of green and blue and magenta build output, followed by something like this:
```
running eval hello
with tool pytorch
[0] start hello (pytorch)
[1] def hello 1.948 s ✓
[2] eval hello::square 1.0 8ms ~ 2ms evaluate ✓
[4] eval hello::double 1.0 7ms ~ 6ms evaluate ✓
[6] eval hello::square 2.0 0ms ~ 0ms evaluate ✓
[8] eval hello::double 4.0 0ms ~ 0ms evaluate ✓
[10] eval hello::square 8.0 0ms ~ 0ms evaluate ✓
[12] eval hello::double 64.0 0ms ~ 0ms evaluate ✓
[14] eval hello::square 128.0 0ms ~ 0ms evaluate ✓
[16] eval hello::double 16384.0 0ms ~ 0ms evaluate ✓
outcome success
```
Congrats, this means everything worked correctly! The raw message log has been
stored in run/hello/pytorch.jsonl in the JSON Lines format, where each
line is a valid JSON object. The file consists of message/response pairs sent
from the eval and received from the tool, and can be analysed using other
scripts. Since a log file contains all inputs and outputs, it can be quite
large.
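Since the log is plain JSON Lines, loading it for analysis takes only a few lines of Python. This sketch makes no assumptions about the keys inside each object; it just collects the entries (the path is the hypothetical one from the run above):

```python
import json

def load_log(path):
    """Read a JSON Lines log file into a list of dicts, one per line."""
    entries = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                entries.append(json.loads(line))
    return entries

# entries = load_log("run/hello/pytorch.jsonl")
# print(len(entries), "log entries")
```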
Now you can try running other combinations from our set of available
evals and tools. For instance, here's an example running the
hello eval with all tools (except one which doesn't work on ARM; feel free
to include it if your machine is x86), putting the log files into the same
run/hello directory as before:
```sh
./gradbench repo run --eval hello --no-tool scilean -o run
```
This was just a quickstart summary; see CONTRIBUTING.md for
more details. You can also pass --help to any command or subcommand to see
other possible options:
```sh
./gradbench repo run --help
```
Without using Docker
The --eval and --tool options passed to the repo run subcommand use named
evals and tools in this repository by default, but they can also take arbitrary
shell commands when prefixed with a $, so the default use of Docker is merely
a convenience. It is possible to run GradBench without using Docker, although it
requires you to set up the necessary dependencies on your system. This section
describes how to do that.
While the dependencies required by the evals are relatively modest, tool
dependencies can be very diverse and difficult to install. Details are provided
below. If you use Nix or NixOS, then shell.nix provides an
easy way to install the dependencies needed for most evals and tools.
Running evals outside of Docker
As of this writing, all evals are written in Python, and depend on Python
packages that must be made available. Further, many evals perform validation by
comparing against the manual tool. Before running these evals, you must
compile manual, like so:
```sh
make -C cpp
make -C tools/manual
```
This requires you to have a functioning C++ compiler, but manual does not
otherwise have any dependencies.
Using uv
The easiest way to run GradBench's Python code is to install a sufficiently recent version of uv (0.6.8 works as of this writing), which is a Python package manager. Once this is done, an eval can be run with e.g.:
```sh
uv run python/gradbench/gradbench/evals/hello/run.py
```
You should see just one line of output:
```json
{ "id": 0, "kind": "start", "eval": "hello" }
```
At this point the eval will hang, as it waits for a response from the tool. Just
terminate it with Ctrl-c or Ctrl-d - if you see the above, then the eval
likely works.
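Instead of terminating the eval, you can also sanity-check the plumbing with a throwaway "tool" loop like the following. This is hypothetical and for plumbing checks only: it reads one JSON message per line and replies with a bare acknowledgement, so it shows the line-oriented shape of the exchange but will not pass validation, since a real GradBench tool must answer evaluation messages with actual outputs.

```python
import json
import sys

def ack_loop(inp=sys.stdin, out=sys.stdout):
    """Acknowledge each incoming JSON message by echoing its id."""
    for line in inp:
        message = json.loads(line)
        out.write(json.dumps({"id": message["id"]}) + "\n")
        out.flush()  # the eval is blocked waiting, so flush every response

# Calling ack_loop() with no arguments connects it to stdin/stdout so the
# script can be piped to an eval process.
```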
Not using uv
You can run Python code without uv by manually installing the dependencies (or
by using another package manager, such as pip). The file
pyproject.toml lists the dependencies required by all
tools, but evals need only a subset of these. Specifically, the following are
required:
- numpy
- pydantic
- dataclasses-json
You may want to install these in a virtualenv.
When not using uv, your PYTHONPATH must manually be set to include
python/gradbench. For example, we can run the hello eval manually as
follows:
```sh
PYTHONPATH=python/gradbench/:$PYTHONPATH python3 python/gradbench/gradbench/evals/hello/run.py
```
Running tools outside of Docker
Each tool README should document how to run that tool outside of Docker, which
may require installing dependencies or setting environment variables. For some
tools that can be quite challenging. However, there is also some commonality
between related tools. When the documentation is insufficient, you can always
look at the Dockerfile to see exactly what needs to be installed.
Running C++-based tools
Each C++ tool is structured with one executable per eval. They expect to find
their includes and libraries through standard mechanisms such as pkg-config or
by having environment variables such as CPATH/LD_LIBRARY_PATH/LIBRARY_PATH
set appropriately. Further, they expect some libraries to be available in the
cpp directory, which can be achieved with:
```sh
make -C cpp
```
The executable for a tool foo for eval bar is compiled with
```sh
make -C tools/foo bin/bar
```
However, you do not need to do this in advance - compilation is done by a Python
module cpp.py that implements the GradBench protocol and runs the executables
(except for manual, see above).
Specifically, to run tool foo we would do:
```sh
uv run python/gradbench/gradbench/cpp.py foo
```
This will seem to hang because it is waiting for a message from the eval. You
can use the command above as the --tool option to the gradbench CLI. In
fact, as of this writing cpp.py does not depend on any non-builtin Python
module, so you can run it without uv or fiddling with PYTHONPATH:
```sh
python3 python/gradbench/gradbench/cpp.py foo
```
Putting it all together, we can run the hello eval with the manual tool as
follows:
```sh
./gradbench repo run --eval "$ uv run python/gradbench/gradbench/evals/hello/run.py" --tool "$ python3 python/gradbench/gradbench/cpp.py manual"
```
Or without using uv:
```sh
PYTHONPATH=python/gradbench/:$PYTHONPATH ./gradbench repo run --eval "$ python3 python/gradbench/gradbench/evals/hello/run.py" --tool "$ python3 python/gradbench/gradbench/cpp.py manual"
```
You can also run the C++ executables completely separately from GradBench if you wish.
This does require you to first extract the raw input from a gradbench log
file.
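Extracting the raw inputs could be sketched like this. It is a hedged sketch: the "kind" and "input" field names are assumptions based on the message log format shown earlier, so check an actual log file before relying on them.

```python
import json

def extract_inputs(log_path):
    """Collect the input payload of each evaluation message in a JSONL log."""
    inputs = []
    with open(log_path) as f:
        for line in f:
            msg = json.loads(line)
            # Assumed shape: evaluation messages carry a "kind" of "evaluate"
            # and an "input" field holding the raw problem data.
            if msg.get("kind") == "evaluate" and "input" in msg:
                inputs.append(msg["input"])
    return inputs
```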
Multithreading
By default, tools use only a single thread. You can ask them to use multiple
(CPU) threads, if possible, by passing the option --multithreaded. Example:
```sh
./gradbench repo run --eval gmm --tool 'manual --multithreaded'
```
Multithreading is still a somewhat experimental feature. Many tools may still use only a single thread. Some tools may be able to multithread their primal code, but not their differentiated code. Some tools may fail to work entirely. An eval documents to which extent it can be parallelised in its Commentary, and a tool similarly documents which of its implementations have been parallelised in its Commentary.
The following tools have at least partial support for multithreaded execution of their differentiated functions:
Without cloning this repository
[!WARNING]
Only use this method if you have a specific reason not to use the primary method documented above.
It's also possible to install and run the GradBench CLI without cloning this
repository, if you'd prefer. In this case you don't need Python but you still
need Rust and Docker. Use cargo install with the --git flag (note that
this command only installs GradBench once; to update, you'll need to re-run it):
```sh
cargo install --locked gradbench --git https://github.com/gradbench/gradbench --branch nightly
```
Then, you can use the newly installed gradbench CLI to download and run our
nightly Docker images. For instance, if you have jq installed,
you can run these commands to grab the date of the most recent successful
nightly build, then download and run those images for the hello eval and the
pytorch tool:
```sh
DATE=$(curl https://raw.githubusercontent.com/gradbench/gradbench/refs/heads/ci/refs/heads/nightly/summary.json | jq --raw-output .date)
gradbench run --eval "gradbench eval hello --tag $DATE" --tool "gradbench tool pytorch --tag $DATE"
```
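If you don't have jq, the date lookup can also be done in a few lines of Python; this is a sketch of the same curl/jq pipeline using only the standard library, with the URL taken from the shell example above. Only the `.date` field is extracted.

```python
import json
from urllib.request import urlopen

SUMMARY_URL = (
    "https://raw.githubusercontent.com/gradbench/gradbench"
    "/refs/heads/ci/refs/heads/nightly/summary.json"
)

def latest_nightly_date(summary_text):
    """Pull the .date field out of the nightly summary JSON, like jq above."""
    return json.loads(summary_text)["date"]

# Requires network access; uncomment to fetch the live summary:
# print(latest_nightly_date(urlopen(SUMMARY_URL).read().decode()))
```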
Citing
GradBench is largely developed by academics, and we would appreciate a citation if you find it useful for published work. See the "Cite this repository" button in the About section of the GitHub sidebar, or view CITATION.cff directly.
License
GradBench is licensed under the MIT License. Some implementations are based on work used under other licenses - this is clearly noted at the top of a file, along with attribution, when applicable. All files are available under OSI-approved licenses.
Citation (CITATION.cff)
cff-version: 1.2.0
title: GradBench
message: >-
If you use this software, please cite it using the metadata from this file.
type: software
authors:
- given-names: Sam
family-names: Estep
email: sam@samestep.com
orcid: "https://orcid.org/0000-0002-7107-7043"
- given-names: Maggie
family-names: Hollis
email: mhollis@smith.edu
- given-names: Tomáš
family-names: Skřivan
email: skrivantomas@seznam.cz
- given-names: Troels
family-names: Henriksen
email: athas@sigkill.dk
orcid: "https://orcid.org/0000-0002-1195-9722"
- given-names: Valentin
family-names: Churavy
email: v.churavy@gmail.com
orcid: "https://orcid.org/0000-0002-9033-165X"
- given-names: Alexander
family-names: Fleming
email: alexander.fleming@rwth-aachen.de
orcid: "https://orcid.org/0009-0004-8591-2558"
- given-names: Max
family-names: Sagebaum
email: max.sagebaum@scicomp.uni-kl.de
repository-code: "https://github.com/gradbench/gradbench"
url: "https://gradben.ch"
abstract: >-
GradBench is a benchmark suite for differentiable programming across languages
and domains.
keywords:
- autodiff
- benchmarks
license: MIT
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 172
- Total pull requests: 258
- Average time to close issues: 24 days
- Average time to close pull requests: 4 days
- Total issue authors: 5
- Total pull request authors: 8
- Average comments per issue: 0.93
- Average comments per pull request: 2.09
- Merged pull requests: 198
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 172
- Pull requests: 258
- Average time to close issues: 24 days
- Average time to close pull requests: 4 days
- Issue authors: 5
- Pull request authors: 8
- Average comments per issue: 0.93
- Average comments per pull request: 2.09
- Merged pull requests: 198
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- athas (111)
- samestep (61)
- gdalle (3)
- aj-fleming (2)
- lecopivo (1)
Pull Request Authors
- samestep (180)
- athas (164)
- maggiehollis (28)
- lecopivo (9)
- aj-fleming (8)
- vchuravy (3)
- gdalle (3)
- MaxSagebaum (1)
- raph5 (1)
Dependencies
- actions/checkout v4 composite
- actions/deploy-pages v4 composite
- actions/upload-pages-artifact v3 composite
- snok/install-poetry v1 composite
- 201 dependencies
- @types/react ^18 development
- @types/react-dom ^18 development
- @typescript-eslint/eslint-plugin ^7 development
- @typescript-eslint/parser ^7 development
- @vitejs/plugin-react-swc ^3 development
- eslint ^8 development
- eslint-plugin-react-hooks ^4 development
- eslint-plugin-react-refresh ^0.4 development
- prettier ^3 development
- prettier-plugin-organize-imports ^3 development
- typescript >=5 development
- vite ^5 development
- react ^18
- react-dom ^18
- filelock 3.14.0
- fsspec 2024.3.1
- intel-openmp 2021.4.0
- jinja2 3.1.3
- markupsafe 2.1.5
- mkl 2021.4.0
- mpmath 1.3.0
- networkx 3.1
- numpy 1.24.4
- nvidia-cublas-cu12 12.1.3.1
- nvidia-cuda-cupti-cu12 12.1.105
- nvidia-cuda-nvrtc-cu12 12.1.105
- nvidia-cuda-runtime-cu12 12.1.105
- nvidia-cudnn-cu12 8.9.2.26
- nvidia-cufft-cu12 11.0.2.54
- nvidia-curand-cu12 10.3.2.106
- nvidia-cusolver-cu12 11.4.5.107
- nvidia-cusparse-cu12 12.1.0.106
- nvidia-nccl-cu12 2.20.5
- nvidia-nvjitlink-cu12 12.4.127
- nvidia-nvtx-cu12 12.1.105
- sympy 1.12
- tbb 2021.12.0
- torch 2.3.0
- torch 2.3.0+cpu
- triton 2.3.0
- typing-extensions 4.11.0
- numpy ^1
- python >=3.8,<3.11
- torch ^2 (source: pypi; markers: sys_platform == 'darwin')
- torch ^2 (source: pypi; markers: platform_machine == 'aarch64' and sys_platform != 'darwin')
- torch ^2 (source: pytorch_cpu; markers: platform_machine == 'x86_64' and sys_platform != 'darwin')
- anstream 0.6.14
- anstyle 1.0.7
- anstyle-parse 0.2.4
- anstyle-query 1.0.3
- anstyle-wincon 3.0.3
- clap 4.5.4
- clap_builder 4.5.2
- clap_derive 4.5.4
- clap_lex 0.7.0
- colorchoice 1.0.1
- heck 0.5.0
- is_terminal_polyfill 1.70.0
- proc-macro2 1.0.84
- quote 1.0.36
- strsim 0.11.1
- syn 2.0.66
- unicode-ident 1.0.12
- utf8parse 0.2.1
- windows-sys 0.52.0
- windows-targets 0.52.5
- windows_aarch64_gnullvm 0.52.5
- windows_aarch64_msvc 0.52.5
- windows_i686_gnu 0.52.5
- windows_i686_gnullvm 0.52.5
- windows_i686_msvc 0.52.5
- windows_x86_64_gnu 0.52.5
- windows_x86_64_gnullvm 0.52.5
- windows_x86_64_msvc 0.52.5