https://github.com/alixunxing/model-compression
Model compression based on pytorch: (1) quantization: 16/8/4/2 bits (dorefa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), ternary/binary values (twn/bnn/xnor-net); (2) pruning: normal, regular, and group-convolutional channel pruning; (3) group convolution structure; (4) batch-normalization folding for quantization.
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 2.3%, to scientific vocabulary)
Repository
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
https://github.com/alixunxing/model-compression/blob/master/
# model-compression

## Overview

**Model compression based on pytorch**

- Quantization: 16/8/4/2 bits, ternary/binary
- Pruning
- Group convolution structure
- BN folding for activations (A)
- BN folding for (bits) quantization

## Features

- Quantization of W and A: W (FP32/16/8/4/2 bits, ternary/binary) and A (FP32/16/8/4/2 bits, binary)
- Tricks for ternary/binary quantization: W/grad handling (`ste`, `saturate_ste`, `soft_ste`); for quantized A, the B-A-C-P layer order versus C-B-A-P and its effect on acc
- Pruning, including regular pruning, where each layer of the model keeps a number of filters that is a multiple of N (8, 16)
- Batch normalization folding, with a test of the model before and after folding: BN > conv w and b; for activations (A), BN > conv b; BN folding for (bits) quantization

## Changelog

- **2019.12.4**
- **12.8**: (A) (* 0.1)
- **12.11**
- 12.12
- 12.14: (1) BN (W); (2) BN (conv) (bias)
- **12.17**
- 12.20: cpu, gpu
- **12.27**
- 12.29: 8 bits, 10 bits, 16 bits
- **2020.2.17**: (1) W; (2) W
- **2.18**: BN for activations (A): BN gamma
- **2.24**: `WbWtAb`: `models/util_w_t_b_conv.py`, `Conv2d_Q`, `nin_gc.py`
- **3.1**: (1) google (bits); (2) BN
- **3.2, 3.3**: dorefa: `models/util_xx.py`, `Conv2d_Q`, `nin_gc.py`
- 3.4: `WbWtAb/bn_folding`: BN for activations (A)
- 3.11: `WqAq/IAO`: BN momentum (0.1 > 0.01), batch, acc 1%
- 3.13
- 4.6: `W_clip` (`models/util_xx.py`)

## Requirements

- python >= 3.5
- torch >= 1.1.0
- torchvision >= 0.3.0
- numpy

## Usage

### Quantization

#### W (FP32/ternary/binary), A (FP32/binary)

`--W` and `--A` set the quantization values of W and A. In the commands below, `--W 2` selects binary W (Wb), `--W 3` ternary W (Wt), `--A 2` binary A (Ab), and `--A 32` FP32 A (A32).

```
cd quantization/WbWtAb
```

- WbAb

```
python main.py --W 2 --A 2
```

- WbA32

```
python main.py --W 2 --A 32
```

- WtAb

```
python main.py --W 3 --A 2
```

- WtA32

```
python main.py --W 3 --A 32
```

#### W (FP32/16/8/4/2 bits), A (FP32/16/8/4/2 bits)

`--Wbits` and `--Abits` set the bit widths of W and A.

```
cd quantization/WqAq
cd quantization/IAO
```

##### dorefa

- W16A16

```
python main.py --Wbits 16 --Abits 16
```

- W8A8

```
python main.py --Wbits 8 --Abits 8
```

- W4A4

```
python main.py --Wbits 4 --Abits 4
```

- Other bit widths follow the same pattern.

##### IAO

`--q_type`: quantization type; `--bn_fold`: BN folding flag.

- `--q_type 0`, without BN folding

```
python main.py --q_type 0 --bn_fold 0
```

- `--q_type 1`, with BN folding

```
python main.py --q_type 1 --bn_fold 1
```

### Pruning

*Sparsity training > pruning > fine-tuning*

```
cd pruning
```

#### Normal training

```
python main.py
```

#### Sparsity training

`-sr`: sparsity flag; `--s`: sparsity rate (adjust it for the dataset and model).

- nin

```
python main.py -sr --s 0.0001
```

- nin_gc (group convolution)

```
python main.py -sr --s 0.001
```

#### Pruning

`--percent`: pruning ratio; `--normal_regular`: normal or regular pruning (when set to N, each layer keeps a number of filters that is a multiple of N); `--model`: path of the model to prune; `--save`: path for saving the pruned model.

- Normal pruning

```
python normal_regular_prune.py --percent 0.5 --model models_save/nin_preprune.pth --save models_save/nin_prune.pth
```

- Regular pruning

```
python normal_regular_prune.py --percent 0.5 --normal_regular 8 --model models_save/nin_preprune.pth --save models_save/nin_prune.pth
```

```
python normal_regular_prune.py --percent 0.5 --normal_regular 16 --model models_save/nin_preprune.pth --save models_save/nin_prune.pth
```

- Group convolution pruning

```
python gc_prune.py --percent 0.4 --model models_save/nin_gc_preprune.pth
```

#### Fine-tuning

`--refine`: path of the pruned model.

```
python main.py --refine models_save/nin_prune.pth
```

### Pruning > quantization

#### Pruning > quantization (16/8/4/2 bits)

```
cd quantization/WqAq
cd quantization/IAO
```

##### W8A8

- nin

```
python main.py --Wbits 8 --Abits 8 --refine ../../../prune/models_save/nin_refine.pth
```

- nin_gc (group convolution)

```
python main.py --Wbits 8 --Abits 8 --refine ../../../prune/models_save/nin_gc_refine.pth
```

Other bit widths follow the same pattern.

#### Pruning > quantization (ternary/binary)

```
cd quantization/WbWtAb
```

##### WbAb

- nin

```
python main.py --W 2 --A 2 --refine ../../prune/models_save/nin_refine.pth
```

- nin_gc (group convolution)

```
python main.py --W 2 --A 2 --refine ../../prune/models_save/nin_gc_refine.pth
```

### BN folding

```
cd quantization/WbWtAb/bn_folding
```

`--W`: quantization value of W (FP32/ternary/binary).

#### Folding the model

- Wb

```
python bn_folding.py --W 2
```

- Wt

```
python bn_folding.py --W 3
```

#### Testing the folded model

```
python bn_folding_model_test.py
```

### Device selection

*cpu and gpu (single or multiple) are supported.*

`--cpu`: use the cpu; `--gpu_id`: select the gpu(s) to use.

- cpu

```
python main.py --cpu
```

- Single gpu

```
python main.py --gpu_id 0
```

```
python main.py --gpu_id 1
```

- Multiple gpus

```
python main.py --gpu_id 0,1
```

```
python main.py --gpu_id 0,1,2
```

## Results

*Tested on cifar10.*

| Model | Acc | GFLOPs | Para(M) | Size(MB) | Compression | Acc loss | Notes |
| :-------------------------: | :----: | :----: | :-----: | :------: | :----: | :---: | :------------------------------------------------: |
| nin | 91.01% | 0.15 | 0.67 | 2.68 | *** | *** | |
| Group convolution (nin_gc) | 90.88% | 0.15 | 0.58 | 2.32 | 13.43% | 0.13% | |
| Pruning | 90.26% | 0.09 | 0.32 | 1.28 | 44.83% | 0.62% | |
| Quantization (W/A) | 90.02% | *** | *** | 0.18 | 92.21% | 0.86% | W |
| Quantization (W/A) | 87.68% | *** | *** | 0.26 | 88.79% | 3.20% | W |
| Pruning + quantization (W/A) | 86.13% | *** | *** | 0.19 | 91.81% | 4.75% | W |
| Group convolution + pruning + quantization (W/A) | 86.13% | *** | *** | 0.19 | 92.91% | 4.88% | W |

*The compression and accuracy-loss figures for nin_gc and for the last row are relative to nin; those for the other rows are relative to nin_gc.*

## References

### Quantization

#### Binary

- [Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1](https://arxiv.org/abs/1602.02830)
- [XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks](https://arxiv.org/abs/1603.05279)
- [An Empirical Study of Binary Neural Networks Optimisation](https://openreview.net/forum?id=rJfUCoR5KX)
- [A Review of Binarized Neural Networks](https://www.semanticscholar.org/paper/A-Review-of-Binarized-Neural-Networks-Simons-Lee/0332fdf00d7ff988c5b66c47afd49431eafa6cd1)

#### Ternary

- [Ternary weight networks](https://arxiv.org/abs/1605.04711)

#### 16/8/4/2 bits

- [DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients](https://arxiv.org/abs/1606.06160)
- [Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference](https://arxiv.org/abs/1712.05877)
- [Quantizing deep convolutional networks for efficient inference: A whitepaper](https://arxiv.org/abs/1806.08342)

### Pruning

- [Learning Efficient Convolutional Networks through Network Slimming](https://arxiv.org/abs/1708.06519)
- [Rethinking the Value of Network Pruning](https://arxiv.org/abs/1810.05270)

### Neuromorphic computing

- [Convolutional Networks for Fast, Energy-Efficient Neuromorphic Computing](https://arxiv.org/abs/1603.08270)

## Further work

- Nvidia
- (1) 4 bits/ternary/binary; (2) DL frameworks (MNN, NCNN, TensorRT)
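As a rough illustration of the batch-normalization folding that the README's `bn_folding` step refers to, here is a minimal sketch of the generic inference-time fold, in which the BN statistics of each output channel are absorbed into the preceding convolution's weights and bias. It is not taken from the repository, and it does not cover the ternary/binary-specific variants. The helper names `fold_bn`, `conv1x1`, and `batch_norm` are hypothetical, and the convolution is simplified to a 1x1 kernel applied to a single pixel so that the example needs only the standard library.

```python
import math


def fold_bn(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Fold per-output-channel BN parameters into conv weights and biases.

    weights: one flat list of weights per output channel.
    biases, gamma, beta, mean, var: one float per output channel.
    """
    folded_w, folded_b = [], []
    for w, b, g, bt, m, v in zip(weights, biases, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)          # BN scale for this channel
        folded_w.append([wi * scale for wi in w])
        folded_b.append((b - m) * scale + bt)   # shifted and rescaled bias
    return folded_w, folded_b


def conv1x1(x, weights, biases):
    """1x1 convolution on one pixel: a dot product per output channel."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch normalization using the running statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]


# Illustrative values only: 3 input channels, 2 output channels.
x = [0.5, -1.0, 2.0]
weights = [[0.2, -0.4, 0.1], [1.0, 0.3, -0.7]]
biases = [0.05, -0.1]
gamma, beta = [1.5, 0.8], [0.1, -0.2]
mean, var = [0.3, -0.5], [0.9, 1.7]

# Reference path: convolution followed by BN.
reference = batch_norm(conv1x1(x, weights, biases), gamma, beta, mean, var)

# Folded path: a single convolution with the adjusted parameters.
fw, fb = fold_bn(weights, biases, gamma, beta, mean, var)
folded = conv1x1(x, fw, fb)

max_diff = max(abs(a - b) for a, b in zip(reference, folded))
print("max difference between the two paths:", max_diff)
```

The two paths should agree up to floating-point rounding, which is the kind of check a before/after comparison script would perform on the real model.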
Owner
- Login: alixunxing
- Kind: user
- Repositories: 18
- Profile: https://github.com/alixunxing