https://github.com/coreylammie/accelerating-stochastically-binarized-neural-networks-on-fpgas-using-opencl

2019 IEEE International Midwest Symposium on Circuits and Systems: Accelerating Stochastically Binarized Neural Networks on FPGAs using OpenCL

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found codemeta.json file)
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: arxiv.org)
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 8.5%, to scientific vocabulary)

Last synced: 7 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: coreylammie
Language: Python
Default Branch: master
Homepage:
Size: 8.82 MB

Statistics

Stars: 5
Watchers: 3
Forks: 1
Open Issues: 0
Releases: 0

Created about 7 years ago · Last pushed over 6 years ago

Metadata Files

Readme

Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL

GitHub repository detailing the network architectures and implementation details for 'Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL', available here, to be presented at the 62nd IEEE International Midwest Symposium on Circuits and Systems (MWSCAS), 2019.

Abstract

Recent technological advances have proliferated the available computing power, memory, and speed of modern Central Processing Units (CPUs), Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Consequently, the performance and complexity of Artificial Neural Networks (ANNs) are burgeoning. While GPU-accelerated Deep Neural Networks (DNNs) currently offer state-of-the-art performance, they consume large amounts of power. Training such networks on CPUs is inefficient, as data throughput and parallel computation are limited. FPGAs are considered a suitable candidate for performance-critical, low-power systems, e.g., Internet of Things (IoT) edge devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development environment, networks described using the high-level OpenCL framework can be accelerated on heterogeneous platforms. Moreover, the resource utilization and power consumption of DNNs can be further improved by utilizing regularization techniques that binarize network weights. In this paper, we introduce, to the best of our knowledge, the first FPGA-accelerated stochastically binarized DNN implementations, and compare them to implementations accelerated using both GPUs and FPGAs. Our developed networks are trained and benchmarked using the popular MNIST and CIFAR-10 datasets, and achieve near state-of-the-art performance, while offering a >16-fold improvement in power consumption, compared to conventional GPU-accelerated networks. Both our FPGA-accelerated deterministic and stochastic BNNs reduce inference times on MNIST and CIFAR-10 by >9.89x and >9.91x, respectively.

Network Architectures

Two distinct Neural Network (NN) architectures were implemented, each with deterministic, stochastic, and no regularization: a permutation-invariant fully connected (FC) DNN for MNIST, and the VGG-16 Convolutional Neural Network (CNN) for CIFAR-10.
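The README does not spell out the two binarization rules, so the following is only a minimal sketch under an assumption: it uses the formulation common in the BNN literature, where deterministic binarization takes the sign of each weight and stochastic binarization maps a weight to +1 with a hard-sigmoid probability. The function names (`hard_sigmoid`, `binarize_deterministic`, `binarize_stochastic`) are illustrative and are not taken from the repository.

```python
import random


def hard_sigmoid(x):
    """Clip (x + 1) / 2 to [0, 1]; read as the probability of mapping to +1."""
    return min(1.0, max(0.0, (x + 1.0) / 2.0))


def binarize_deterministic(weights):
    """Assumed rule: +1 for non-negative weights, -1 otherwise."""
    return [1.0 if w >= 0 else -1.0 for w in weights]


def binarize_stochastic(weights, rng):
    """Assumed rule: +1 with probability hard_sigmoid(w), -1 otherwise."""
    return [1.0 if rng.random() < hard_sigmoid(w) else -1.0 for w in weights]


weights = [-1.5, -0.2, 0.0, 0.4, 2.0]
rng = random.Random(0)  # fixed seed so repeated runs give the same draw

det = binarize_deterministic(weights)
sto = binarize_stochastic(weights, rng)

print(det)
print(sto)
```

Under this assumed rule, weights at or beyond -1 and +1 binarize the same way in both schemes, since their probability of mapping to +1 is exactly 0 or 1; only weights strictly inside (-1, 1) are randomized.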

Permutation-invariant FC DNN for MNIST

| Layer (type)                                 | Output Shape | Parameters |
|----------------------------------------------|--------------|------------|
| Linear(in_features=784, out_features=1000)   | [-1, 1000]   | 785,000    |
| BatchNorm1d(1000)                            | [-1, 1000]   | 2,000      |
| ReLU()                                       | [-1, 1000]   | 0          |
| Linear(in_features=1000, out_features=500)   | [-1, 500]    | 500,500    |
| BatchNorm1d(500)                             | [-1, 500]    | 1,000      |
| ReLU()                                       | [-1, 500]    | 0          |
| Linear(in_features=500, out_features=10)     | [-1, 10]     | 5,010      |
| Softmax()                                    | [-1, 10]     | 0          |
| Total Parameters: 1,293,510                  |              |            |
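The parameter counts in this table can be checked by hand. The sketch below assumes the standard counting rules: a Linear layer has one weight per input-output pair plus one bias per output, and a BatchNorm1d layer has one learnable scale and one learnable shift per feature. The helper names are illustrative, not part of the repository.

```python
def linear_params(in_features, out_features):
    """Weights (in x out) plus one bias per output feature."""
    return in_features * out_features + out_features


def batchnorm_params(num_features):
    """One learnable scale and one learnable shift per feature."""
    return 2 * num_features


layer_sizes = [784, 1000, 500, 10]  # input, two hidden layers, output

total = 0
for i, (n_in, n_out) in enumerate(zip(layer_sizes, layer_sizes[1:])):
    total += linear_params(n_in, n_out)
    if i < len(layer_sizes) - 2:
        # Only the hidden layers are followed by BatchNorm1d in the table.
        total += batchnorm_params(n_out)

print(f"{total:,}")  # 1,293,510
```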

VGG CNN for CIFAR-10

| Layer (type)                                               | Output Shape      | Parameters |
|------------------------------------------------------------|-------------------|------------|
| Conv2d(in_channels=3, out_channels=128, kernel_size=3)     | [-1, 128, 32, 32] | 3,584      |
| BatchNorm2d(128)                                           | [-1, 128, 32, 32] | 256        |
| ReLU()                                                     | [-1, 128, 32, 32] | 0          |
| Conv2d(in_channels=128, out_channels=128, kernel_size=3)   | [-1, 128, 32, 32] | 147,584    |
| MaxPool2d(kernel_size=2, stride=2)                         | [-1, 128, 16, 16] | 0          |
| BatchNorm2d(128)                                           | [-1, 128, 16, 16] | 256        |
| ReLU()                                                     | [-1, 128, 16, 16] | 0          |
| Conv2d(in_channels=128, out_channels=256, kernel_size=3)   | [-1, 256, 16, 16] | 295,168    |
| BatchNorm2d(256)                                           | [-1, 256, 16, 16] | 512        |
| ReLU()                                                     | [-1, 256, 16, 16] | 0          |
| Conv2d(in_channels=256, out_channels=256, kernel_size=3)   | [-1, 256, 16, 16] | 590,080    |
| MaxPool2d(kernel_size=2, stride=2)                         | [-1, 256, 8, 8]   | 0          |
| BatchNorm2d(256)                                           | [-1, 256, 8, 8]   | 512        |
| ReLU()                                                     | [-1, 256, 8, 8]   | 0          |
| Conv2d(in_channels=256, out_channels=512, kernel_size=3)   | [-1, 512, 8, 8]   | 1,180,160  |
| BatchNorm2d(512)                                           | [-1, 512, 8, 8]   | 1,024      |
| ReLU()                                                     | [-1, 512, 8, 8]   | 0          |
| Conv2d(in_channels=512, out_channels=512, kernel_size=3)   | [-1, 512, 8, 8]   | 2,359,808  |
| MaxPool2d(kernel_size=2, stride=2)                         | [-1, 512, 4, 4]   | 0          |
| BatchNorm2d(512)                                           | [-1, 512, 4, 4]   | 1,024      |
| ReLU()                                                     | [-1, 512, 4, 4]   | 0          |
| Linear(in_features=8192, out_features=1024)                | [-1, 1024]        | 8,389,632  |
| BatchNorm1d(1024)                                          | [-1, 1024]        | 2,048      |
| ReLU()                                                     | [-1, 1024]        | 0          |
| Linear(in_features=1024, out_features=1024)                | [-1, 1024]        | 1,049,600  |
| BatchNorm1d(1024)                                          | [-1, 1024]        | 2,048      |
| ReLU()                                                     | [-1, 1024]        | 0          |
| Linear(in_features=1024, out_features=10)                  | [-1, 10]          | 10,250     |
| Softmax()                                                  | [-1, 10]          | 0          |
| Total Parameters: 14,033,546                               |                   |            |
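As a cross-check of the convolutional network, the sketch below recomputes the flattened feature size and the total parameter count. It assumes that each 3x3 convolution preserves the spatial size (i.e. a padding of 1, which the README does not state explicitly) and that each of the three 2x2, stride-2 pooling stages halves it, so a 32x32 CIFAR-10 image ends at 4x4 with 512 channels, giving the 8192 input features of the first Linear layer. Helper names are illustrative.

```python
def conv2d_params(c_in, c_out, k=3):
    """k x k kernel per input-output channel pair, plus one bias per output channel."""
    return c_in * c_out * k * k + c_out


def linear_params(n_in, n_out):
    """Weights (in x out) plus one bias per output feature."""
    return n_in * n_out + n_out


def batchnorm_params(n):
    """One learnable scale and one learnable shift per channel or feature."""
    return 2 * n


# (in_channels, out_channels, followed_by_max_pool), in table order.
conv_cfg = [
    (3, 128, False), (128, 128, True),
    (128, 256, False), (256, 256, True),
    (256, 512, False), (512, 512, True),
]

size = 32  # CIFAR-10 images are 32 x 32
total = 0
for c_in, c_out, pooled in conv_cfg:
    total += conv2d_params(c_in, c_out)  # assumed padding=1 keeps `size` unchanged
    if pooled:
        size //= 2                       # 2x2 max pooling with stride 2
    total += batchnorm_params(c_out)     # BatchNorm2d after every conv (and pool)

flat_features = 512 * size * size        # input width of the first Linear layer

fc_sizes = [flat_features, 1024, 1024, 10]
for i, (n_in, n_out) in enumerate(zip(fc_sizes, fc_sizes[1:])):
    total += linear_params(n_in, n_out)
    if i < len(fc_sizes) - 2:
        total += batchnorm_params(n_out)  # BatchNorm1d after the hidden FC layers

print(flat_features)   # 8192
print(f"{total:,}")    # 14,033,546
```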

Implementations

We provide the exported parameters of all GPU-trained BNNs to reproduce our results using the PyTorch library. All dependencies can be installed using:

~~~~
pip install -r requirements.txt
~~~~

where requirements.txt is available here.

~~~~
python Test.py --batchsize 256 --dataset MNIST --trainedmodel "Trained Models/MNISTStochastic.pt"
python Test.py --batchsize 256 --dataset MNIST --trainedmodel "Trained Models/MNISTDeterministic.pt"

wget https://www.coreylammie.me/mwscas2019/CIFAR-10Stochastic.pt
wget https://www.coreylammie.me/mwscas2019/CIFAR-10Deterministic.pt
python Test.py --batchsize 256 --dataset CIFAR-10 --trainedmodel "CIFAR10Stochastic.pt"
python Test.py --batchsize 256 --dataset CIFAR-10 --trainedmodel "CIFAR10Deterministic.pt"
~~~~

Citation

To cite the paper, kindly use the following BibTeX entry:

@article{DBLP:journals/corr/abs-1905-06105,
  author        = {Corey Lammie and Wei Xiang and Mostafa Rahimi Azghadi},
  title         = {Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL},
  journal       = {CoRR},
  volume        = {abs/1905.06105},
  year          = {2019},
  url           = {http://arxiv.org/abs/1905.06105},
  archivePrefix = {arXiv},
  eprint        = {1905.06105},
  timestamp     = {Tue, 28 May 2019 12:48:08 +0200},
  biburl        = {https://dblp.org/rec/bib/journals/corr/abs-1905-06105},
  bibsource     = {dblp computer science bibliography, https://dblp.org}
}

License

All code is licensed under the GNU General Public License v3.0. Details pertaining to this are available at: https://www.gnu.org/licenses/gpl-3.0.en.html

Owner

Name: Corey Lammie
Login: coreylammie
Kind: user
Location: Zürich, Switzerland

Website: coreylammie.me
Twitter: coreylammie
Repositories: 3
Profile: https://github.com/coreylammie

Electrical Engineer & Computer Engineering PhD Candidate 👨‍🎓

Issues and Pull Requests

Last synced: 12 months ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Dependencies

requirements.txt pypi

numpy >=1.15.4
pandas >=0.23.4
torch >=1.0.0
torchnet >=0.0.4
torchvision >=0.2.1
