Science Score: 72.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references: not found
- ✓ Academic publication links: links to arxiv.org
- ✓ Committers with academic emails: 1 of 7 committers (14.3%) from academic institutions
- ✓ Institutional organization owner: organization spcl has institutional domain (spcl.inf.ethz.ch)
- ○ JOSS paper metadata: not found
- ○ Scientific vocabulary similarity: low similarity (10.9%) to scientific vocabulary
Keywords
Repository
A Data-Centric Compiler for Machine Learning
Basic Info
- Host: GitHub
- Owner: spcl
- License: bsd-3-clause
- Language: Python
- Default Branch: master
- Homepage: https://daceml.readthedocs.io
- Size: 4.39 MB
Statistics
- Stars: 84
- Watchers: 8
- Forks: 14
- Open Issues: 22
- Releases: 0
Topics
Metadata Files
README.md
DaCeML
Machine learning powered by data-centric parallel programming.
This project adds PyTorch and ONNX model-loading support to DaCe, and extends the SDFG IR with ONNX operator library nodes. With access to DaCe's rich transformation library and productive development environment, DaCeML can generate highly efficient implementations that can be executed on CPUs, GPUs, and FPGAs.
The white-box approach allows us to see computation at all levels of granularity: from coarse operators, through kernel implementations, down to every scalar operation and memory access.
DaCeML can be used to achieve state-of-the-art GPU performance on highly contested layers, such as BERT-fp16 or EfficientNet-B0. For more details and other performance results, please see our ICS'22 publication.

Read more: Library Nodes
Integration
Converting PyTorch modules is as easy as adding a decorator...

```python
import torch.nn as nn
import torch.nn.functional as F

from daceml.torch import dace_module

@dace_module
class Model(nn.Module):
    def __init__(self, kernel_size):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 4, kernel_size)
        self.conv2 = nn.Conv2d(4, 4, kernel_size)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))
```

... and ONNX models can also be directly imported using the model loader:

```python
model = onnx.load(model_path)
dace_model = ONNXModel("mymodel", model)
```
Read more: PyTorch Integration and Importing ONNX models.
Training
DaCeML modules support training using a symbolic automatic differentiation engine:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

from daceml.torch import dace_module

@dace_module(backward=True)
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 120)
        self.fc2 = nn.Linear(120, 32)
        self.fc3 = nn.Linear(32, 10)
        self.ls = nn.LogSoftmax(dim=-1)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        x = self.ls(x)
        return x

x = torch.randn(8, 784)
y = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7], dtype=torch.long)

model = Net()

criterion = nn.NLLLoss()
prediction = model(x)
loss = criterion(prediction, y)

# gradients can flow through model!
loss.backward()
```
Read more: Automatic Differentiation.
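The core idea behind reverse-mode differentiation, which `loss.backward()` relies on, can be sketched in a few lines of plain Python. This is a toy illustration of the technique, not DaCeML's actual engine (which operates symbolically on the SDFG IR):

```python
# Toy reverse-mode autodiff: each operation records its inputs together
# with the local derivative, and backward() pushes gradients through
# the recorded graph via the chain rule.
class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self.parents = parents  # pairs of (parent, local_gradient)

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Var(self.value * other.value,
                   parents=((self, other.value), (other, self.value)))

    def __add__(self, other):
        # d(a+b)/da = d(a+b)/db = 1
        return Var(self.value + other.value,
                   parents=((self, 1.0), (other, 1.0)))

    def backward(self, seed=1.0):
        # Accumulate the incoming gradient, then propagate to parents.
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x = Var(3.0)
y = Var(4.0)
loss = x * y + x   # d(loss)/dx = y + 1 = 5, d(loss)/dy = x = 3
loss.backward()
print(x.grad, y.grad)  # prints 5.0 3.0
```

A real engine replaces the recursive walk with a topological traversal so shared intermediate nodes are visited once, but the accumulation rule is the same.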
Library Nodes
DaCeML extends the DaCe IR with machine learning operators. The added nodes perform computation as specified by the ONNX specification. DaCeML leverages high-performance kernels from ONNXRuntime, as well as pure SDFG implementations that are introspectable and transformable with data-centric transformations.
The nodes can be used from the DaCe Python frontend.

```python
import dace
import daceml.onnx as donnx
import numpy as np

@dace.program
def conv_program(X_arr: dace.float32[5, 3, 10, 10],
                 W_arr: dace.float32[16, 3, 3, 3]):
    output = dace.define_local([5, 16, 4, 4], dace.float32)
    donnx.ONNXConv(X=X_arr, W=W_arr, Y=output, strides=[2, 2])
    return output

X = np.random.rand(5, 3, 10, 10).astype(np.float32)
W = np.random.rand(16, 3, 3, 3).astype(np.float32)

result = conv_program(X_arr=X, W_arr=W)
```
Setup
The easiest way to get started is to run

```
make install
```

This will set up DaCeML in a newly created virtual environment.
For more detailed instructions, including ONNXRuntime installation, see Installation.
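A roughly equivalent manual setup, sketched here under the assumption that the repository is an ordinary pip-installable Python package (the Makefile may perform additional steps, such as the ONNXRuntime setup mentioned above):

```shell
python -m venv venv          # create an isolated environment
. venv/bin/activate          # activate it
pip install --editable .     # install DaCeML from the repository checkout
```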
Development
Common development tasks are automated using the Makefile.
See Development for more information.
Citing
If you use DaCeML, please cite us:
```bibtex
@inproceedings{daceml,
  author    = {Rausch, Oliver and Ben-Nun, Tal and Dryden, Nikoli and Ivanov, Andrei and Li, Shigang and Hoefler, Torsten},
  title     = {{DaCeML}: A Data-Centric Optimization Framework for Machine Learning},
  year      = {2022},
  booktitle = {Proceedings of the 36th ACM International Conference on Supercomputing},
  series    = {ICS '22}
}
```
Owner
- Name: SPCL
- Login: spcl
- Kind: organization
- Website: https://spcl.inf.ethz.ch/
- Repositories: 100
- Profile: https://github.com/spcl
Citation (CITATION.cff)
cff-version: 1.2.0
title: "DaCeML - A Data-Centric Optimization Framework for Machine Learning"
message: "Please cite as"
authors:
  - family-names: Rausch
    given-names: Oliver
  - family-names: Ben-Nun
    given-names: Tal
  - family-names: Dryden
    given-names: Nikoli
  - family-names: Ivanov
    given-names: Andrei
  - family-names: Li
    given-names: Shigang
  - family-names: Hoefler
    given-names: Torsten
  - family-names: De Matteis
    given-names: Tiziano
  - family-names: Burger
    given-names: Manuel
preferred-citation:
  title: "DaCeML: A Data-Centric Optimization Framework for Machine Learning"
  doi: "10.1145/3524059.3532364"
  year: "2022"
  type: conference-paper
  collection-title: "Proceedings of the 36th ACM International Conference on Supercomputing"
  conference:
    name: "ICS '22"
  authors:
    - family-names: Rausch
      given-names: Oliver
    - family-names: Ben-Nun
      given-names: Tal
    - family-names: Dryden
      given-names: Nikoli
    - family-names: Ivanov
      given-names: Andrei
    - family-names: Li
      given-names: Shigang
    - family-names: Hoefler
      given-names: Torsten
GitHub Events
Total
- Watch event: 3
- Fork event: 1
Last Year
- Watch event: 3
- Fork event: 1
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Oliver Rausch | o****9@g****m | 538 |
| Tal Ben-Nun | t****n@g****m | 55 |
| Manuel Burger | b****m@s****h | 33 |
| Shigang Li | s****s@g****m | 19 |
| am-ivanov | a****v | 8 |
| Tiziano De Matteis | 5****s | 1 |
| Julia Bazińska | j****a@b****m | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 33
- Total pull requests: 99
- Average time to close issues: 8 months
- Average time to close pull requests: 27 days
- Total issue authors: 7
- Total pull request authors: 7
- Average comments per issue: 0.45
- Average comments per pull request: 1.19
- Merged pull requests: 71
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- orausch (17)
- tbennun (10)
- rohanrayan (2)
- Shbinging (1)
- HeyDavid633 (1)
- vselhakim1337 (1)
- ruck314 (1)
Pull Request Authors
- orausch (75)
- tbennun (11)
- manuelburger (7)
- lamyiowce (2)
- Shigangli (1)
- TizianoDeMatteis (1)
- and-ivanov (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- Sphinx ==3.2.1
- matplotlib ==3.4.2
- sphinx-autodoc-typehints ==1.11.1
- sphinx-gallery ==0.9.0
- sphinx-rtd-theme ==0.5.2
- dace *
- dataclasses *
- onnx *
- onnx-simplifier *
- protobuf *
- torch *
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/upload-artifact v2 composite
- actions/checkout v2 composite
- actions/upload-artifact v2 composite
- actions/checkout v2 composite
- actions/upload-artifact v2 composite
- actions/checkout v2 composite
- actions/checkout v2 composite