https://github.com/alixunxing/model-compression

Model compression based on pytorch: (1) quantization: 16/8/4/2 bits (dorefa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), ternary/binary values (twn/bnn/xnor-net); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization folding for quantization.

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (2.3%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: alixunxing
  • Default Branch: master
  • Homepage:
  • Size: 1.48 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of 666DZY666/micronet
Created almost 6 years ago · Last pushed about 6 years ago

https://github.com/alixunxing/model-compression/blob/master/

# model-compression



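## Overview

**Model compression based on pytorch**

- (1) Quantization: 16/8/4/2 bits, ternary/binary values
- (2) Pruning: normal, regular, and group convolutional channel pruning
- (3) BN folding for binary activations (A)
- (4) BN folding for multi-bit (bits) quantization
- (5) Group convolution structure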


## Features

- 1
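- (2) Quantization of weights (W) and activations (A): W (FP32/16/8/4/2 bits, ternary/binary) and A (FP32/16/8/4/2 bits, binary)
- (3) Tricks for ternary/binary quantization: W, W/grad (ste, saturate_ste, soft_ste); for A, the B-A-C-P block order versus C-B-A-P (acc)
- (4) Pruning, including regular pruning in which each layer of the model keeps a number of filters that is a multiple of N (8, 16)
- (5) Batch normalization folding and model testing: ordinary BN folding (BN parameters > conv w and b), BN folding for binary activations (A) (BN parameters > conv b), and BN folding for multi-bit (bits) quantization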


## Code structure

![img1](https://github.com/666DZY666/model-compression/blob/master/readme_imgs/code_structure.jpg)


## Changelog

- **2019.12.4**
- **12.8**: (A) (* 0.1)
- **12.11**
- 12.12
- 12.14: (1) BN (W/), W/; (2) BN (conv) (bias)
- **12.17**
- 12.20: (cpu, gpu)
- **12.27**
- 12.29: 8 bits, 10 bits, 16 bits
- **2020.2.17**: (1) W/; (2) W
- **2.18**: (A) BN: BN gamma, BN
- **2.24**: ./WbWtAb/models/util_w_t_b_conv.py, Conv2d_Q, nin_gc.py
- **3.1**: (1) google (bits); (2) BN
- **3.2, 3.3**: (dorefa), models/util_xx.py, Conv2d_Q, nin_gc.py
- 3.4: WbWtAb/bn_folding, (A) BN, BN
- 3.11: WqAq/IAO, BN momentum (0.1 > 0.01), batch, acc 1%
- 3.13
- 4.6: W_clip() (models/util_xx.py)

## Requirements

- python >= 3.5
- torch >= 1.1.0
- torchvision >= 0.3.0
- numpy


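## Usage

### Quantization

#### W (FP32/ternary/binary), A (FP32/binary)

`--W` and `--A` set the quantization values of the weights (W) and activations (A): 2 for binary, 3 for ternary (W only), 32 for FP32.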

```
cd quantization/WbWtAb
```

- WbAb

```
python main.py --W 2 --A 2
```

- WbA32

```
python main.py --W 2 --A 32
```

- WtAb

```
python main.py --W 3 --A 2
```

- WtA32

```
python main.py --W 3 --A 32
```
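
As an illustration of what binary and ternary weights mean, here is a minimal, framework-free sketch. It follows the formulas in the XNOR-Net and Ternary Weight Networks papers listed in the references, not necessarily this repository's code; the function names `binarize` and `ternarize` and the sample weights are hypothetical.

```python
def binarize(weights):
    """Binary weights in the XNOR-Net style: sign(w) with one scale factor."""
    alpha = sum(abs(w) for w in weights) / len(weights)  # scale = mean |w|
    signs = [1.0 if w >= 0 else -1.0 for w in weights]
    return alpha, signs


def ternarize(weights, delta_factor=0.7):
    """Ternary weights in the TWN style: values in {-1, 0, +1} plus a scale."""
    # Threshold: 0.7 * mean |w| (the approximation proposed in the TWN paper).
    delta = delta_factor * sum(abs(w) for w in weights) / len(weights)
    codes = [0.0 if abs(w) <= delta else (1.0 if w > 0 else -1.0) for w in weights]
    # Scale: mean |w| over the weights that were not set to zero.
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0.0]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return alpha, codes


weights = [0.9, -0.05, 0.4, -0.7, 0.02, -0.3]  # made-up example values
alpha_b, signs = binarize(weights)
alpha_t, codes = ternarize(weights)
print("binary :", alpha_b, signs)
print("ternary:", alpha_t, codes)
```

During training, the rounding step is usually bypassed in the backward pass with a straight-through estimator, which this sketch leaves out.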

#### W (FP32/16/8/4/2 bits), A (FP32/16/8/4/2 bits)

`--Wbits` and `--Abits` set the bit widths of the weights (W) and activations (A).

```
cd quantization/WqAq
# or
cd quantization/IAO
```

##### dorefa

- W16A16

```
python main.py --Wbits 16 --Abits 16
```

- W8A8

```
python main.py --Wbits 8 --Abits 8
```

- W4A4

```
python main.py --Wbits 4 --Abits 4
```

- Other bit widths follow the same pattern
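
For reference, the k-bit quantizers from the DoReFa-Net paper can be sketched in a few lines of plain Python. This is a sketch of the paper's forward formulas only, under the assumption that `--Wbits`/`--Abits` play the role of `k`; the function names are hypothetical and the repository's own layers may differ.

```python
import math


def quantize_k(x, k):
    """Uniformly quantize x in [0, 1] to k bits (round to nearest level)."""
    levels = 2 ** k - 1
    return math.floor(x * levels + 0.5) / levels


def dorefa_weights(weights, k):
    """Map weights to k-bit values in [-1, 1] via tanh normalization."""
    t = [math.tanh(w) for w in weights]
    max_abs = max(abs(v) for v in t)
    if max_abs == 0.0:  # all-zero weights: nothing to normalize
        max_abs = 1.0
    return [2.0 * quantize_k(v / (2.0 * max_abs) + 0.5, k) - 1.0 for v in t]


def dorefa_activation(x, k):
    """Clip an activation to [0, 1], then quantize it to k bits."""
    return quantize_k(min(max(x, 0.0), 1.0), k)


print(dorefa_weights([0.5, -1.0, 0.0], k=2))
print(dorefa_activation(0.4, k=2))
```

In training, gradients pass through `quantize_k` unchanged (straight-through estimator); only the forward computation is shown here.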

##### IAO

`--q_type` selects the quantization type; `--bn_fold` selects whether BN is folded.

- quantization type 0, BN not folded

```
python main.py --q_type 0 --bn_fold 0
```

- quantization type 1, BN folded

```
python main.py --q_type 1 --bn_fold 1
```
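
The integer-arithmetic-only paper cited in the references maps a real value `r` to an integer `q` through `r = S * (q - Z)`, with scale `S` and zero point `Z`. The sketch below illustrates that affine scheme in plain Python; the function names are hypothetical, and how `--q_type` selects between schemes in this repository is not shown here.

```python
def affine_params(r_min, r_max, bits=8):
    """Derive scale S and zero point Z for an unsigned `bits`-bit range."""
    q_max = 2 ** bits - 1
    # The real range must contain 0 so that 0.0 is exactly representable.
    r_min, r_max = min(r_min, 0.0), max(r_max, 0.0)
    scale = (r_max - r_min) / q_max
    if scale == 0.0:  # degenerate range: avoid division by zero
        scale = 1.0
    zero_point = int(round(-r_min / scale))
    return scale, zero_point


def quantize(r, scale, zero_point, bits=8):
    """Real value -> integer code, clamped to [0, 2**bits - 1]."""
    q = int(round(r / scale)) + zero_point
    return min(max(q, 0), 2 ** bits - 1)


def dequantize(q, scale, zero_point):
    """Integer code -> approximate real value, r = S * (q - Z)."""
    return scale * (q - zero_point)


scale, zero_point = affine_params(-1.0, 3.0)
for r in (-1.0, 0.0, 0.5, 3.0):
    q = quantize(r, scale, zero_point)
    print(r, q, dequantize(q, scale, zero_point))
```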

### Pruning

*Sparsity training > pruning > fine-tuning*

```
cd pruning
```

#### Normal training

```
python main.py
```

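#### Sparsity training

`-sr` enables sparsity training; `--s` sets the sparsity rate (adjust it to the dataset and model).

- nin (normal convolution structure)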

```
python main.py -sr --s 0.0001
```

- nin_gc (group convolution structure)

```
python main.py -sr --s 0.001
```
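
In the Network Slimming paper listed in the references, sparsity training adds an L1 penalty `s * sum(|gamma|)` on the BN scale factors, which amounts to adding `s * sign(gamma)` to each gamma's gradient. The sketch below shows that update in plain Python, assuming `--s` corresponds to `s`; the helper names are hypothetical and the task gradient is set to zero only to isolate the penalty's effect.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def add_l1_subgradient(gammas, grads, s):
    """Add the L1 penalty subgradient s * sign(gamma) to each gradient."""
    return [g + s * sign(gamma) for gamma, g in zip(gammas, grads)]


def sgd_step(gammas, grads, lr):
    """One plain SGD update."""
    return [gamma - lr * g for gamma, g in zip(gammas, grads)]


gammas = [0.5, -0.2, 0.0]      # made-up BN scale factors
task_grads = [0.0, 0.0, 0.0]   # task-loss gradient, zeroed for illustration
for _ in range(100):
    grads = add_l1_subgradient(gammas, task_grads, s=0.001)
    gammas = sgd_step(gammas, grads, lr=0.1)
print(gammas)  # every nonzero gamma has moved toward zero
```

A larger `s` pushes more scale factors toward zero, which makes more channels prunable but can cost accuracy.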

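#### Pruning

`--percent` sets the pruning ratio; `--normal_regular` selects normal or regular pruning (with a value N, each layer keeps a number of filters that is a multiple of N); `--model` is the path of the model to prune; `--save` is the path where the pruned model is saved.

- Normal pruning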

```
python normal_regular_prune.py --percent 0.5 --model models_save/nin_preprune.pth --save models_save/nin_prune.pth
```

- Regular pruning

```
python normal_regular_prune.py --percent 0.5 --normal_regular 8 --model models_save/nin_preprune.pth --save models_save/nin_prune.pth
```



```
python normal_regular_prune.py --percent 0.5 --normal_regular 16 --model models_save/nin_preprune.pth --save models_save/nin_prune.pth
```

- Group convolution pruning

```
python gc_prune.py --percent 0.4 --model models_save/nin_gc_preprune.pth
```
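
The difference between normal and regular pruning can be sketched as a channel-selection step. The version below uses the global-threshold rule from the Network Slimming paper and, as an assumption, rounds each layer's kept-channel count up to a multiple of N for regular pruning; the function name `select_channels` and the rounding direction are hypothetical and may not match `normal_regular_prune.py`.

```python
import math


def select_channels(layer_gammas, percent, multiple=0):
    """Return, per layer, the sorted indices of the channels to keep."""
    # Global threshold: the |gamma| value at the `percent` quantile.
    all_abs = sorted(abs(g) for layer in layer_gammas for g in layer)
    index = min(int(len(all_abs) * percent), len(all_abs) - 1)
    threshold = all_abs[index]

    kept = []
    for layer in layer_gammas:
        # Channels ranked from most to least important.
        order = sorted(range(len(layer)), key=lambda i: abs(layer[i]), reverse=True)
        count = sum(1 for g in layer if abs(g) > threshold)
        if multiple > 0:
            # Regular pruning (assumed rule): round up to a multiple of N.
            count = min(len(layer), math.ceil(max(count, 1) / multiple) * multiple)
        kept.append(sorted(order[:count]))
    return kept


layers = [[0.9, 0.01, 0.5, 0.02],
          [0.03, 0.8, 0.7, 0.04, 0.6, 0.05, 0.4, 0.3]]  # made-up gammas
print(select_channels(layers, percent=0.5))              # normal pruning
print(select_channels(layers, percent=0.5, multiple=4))  # regular, N = 4
```

Keeping filter counts at multiples of 8 or 16 tends to suit hardware that processes channels in fixed-width groups.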

#### Fine-tuning

`--refine` gives the path of the pruned model to fine-tune.

```
python main.py --refine models_save/nin_prune.pth
```

### Pruning > quantization

#### Pruning > 16/8/4/2 bits quantization

```
cd quantization/WqAq
# or
cd quantization/IAO
```

##### W8A8

- nin (normal convolution structure)

```
python main.py --Wbits 8 --Abits 8 --refine ../../../prune/models_save/nin_refine.pth
```

- nin_gc (group convolution structure)

```
python main.py --Wbits 8 --Abits 8 --refine ../../../prune/models_save/nin_gc_refine.pth
```

##### Other bit widths

#### Pruning > ternary/binary quantization

```
cd quantization/WbWtAb
```

##### WbAb

- nin (normal convolution structure)

```
python main.py --W 2 --A 2 --refine ../../prune/models_save/nin_refine.pth
```

- nin_gc (group convolution structure)

```
python main.py --W 2 --A 2 --refine ../../prune/models_save/nin_gc_refine.pth
```


### BN folding

```
cd quantization/WbWtAb/bn_folding
```

`--W` sets the weight (W) quantization value (W: FP32/ternary/binary).

#### Fold BN and save the model

- Wb
  
```
python bn_folding.py --W 2
```

- Wt

```
python bn_folding.py --W 3
```

#### Test the model after folding

```
python bn_folding_model_test.py
```
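
For readers unfamiliar with BN folding, the sketch below shows the standard inference-time identity for a single output channel: BN parameters are absorbed into the convolution weight and bias so that the BN layer can be dropped. It is a generic illustration in plain Python with hypothetical names, not the logic of `bn_folding.py`, whose handling of quantized weights and binary activations may differ.

```python
import math


def fold_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BN(gamma, beta, mean, var) into one channel's weights and bias."""
    scale = gamma / math.sqrt(var + eps)
    folded_weights = [w * scale for w in weights]
    folded_bias = beta + (bias - mean) * scale
    return folded_weights, folded_bias


def conv(x, weights, bias):
    """One output channel at one position: a dot product plus bias."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch normalization with running statistics."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


x = [0.2, -1.0, 0.7]                    # made-up input patch
w, b = [0.5, -0.3, 0.8], 0.1            # made-up conv parameters
bn = dict(gamma=1.5, beta=-0.2, mean=0.4, var=0.25)

reference = batch_norm(conv(x, w, b), **bn)
w_f, b_f = fold_bn(w, b, **bn)
folded = conv(x, w_f, b_f)
print(reference, folded)  # the two values agree up to rounding error
```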

### Device selection

*CPU or GPU (single or multiple cards)*

`--cpu` runs on the CPU; `--gpu_id` selects which GPU(s) to use.

- CPU

```
python main.py --cpu
```

- Single GPU

```
python main.py --gpu_id 0
```



```
python main.py --gpu_id 1
```

- Multiple GPUs

```
python main.py --gpu_id 0,1
```



```
python main.py --gpu_id 0,1,2
```
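
One common way to wire up flags like these is sketched below with `argparse`: `--cpu` forces CPU execution, and `--gpu_id` is exported as `CUDA_VISIBLE_DEVICES` before any CUDA work starts. The function name `parse_device` and the default value are assumptions for illustration; the repository's `main.py` may handle the flags differently.

```python
import argparse
import os


def parse_device(argv):
    """Parse --cpu / --gpu_id and return (device_type, gpu_ids)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cpu", action="store_true", help="run on the CPU")
    parser.add_argument("--gpu_id", default="0", help="comma-separated GPU ids")
    args = parser.parse_args(argv)

    if args.cpu:
        return "cpu", []

    gpu_ids = [int(i) for i in args.gpu_id.split(",")]
    # Must be set before the framework initializes CUDA to take effect.
    os.environ["CUDA_VISIBLE_DEVICES"] = args.gpu_id
    return "cuda", gpu_ids


print(parse_device(["--cpu"]))
print(parse_device(["--gpu_id", "0,1"]))
```

With more than one id, a training script would typically wrap the model for multi-GPU execution after this step.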



## Results

*cifar10*

|                       Type                       |  Acc   | GFLOPs | Para(M) | Size(MB) | Compression ratio | Acc loss | Notes |
| :----------------------------------------------: | :----: | :----: | :-----: | :------: | :---------------: | :------: | :---: |
|                       nin                        | 91.01% |  0.15  |  0.67   |   2.68   |        ***        |   ***    |       |
|       Group convolution structure (nin_gc)       | 90.88% |  0.15  |  0.58   |   2.32   |      13.43%       |  0.13%   |       |
|                     Pruning                      | 90.26% |  0.09  |  0.32   |   1.28   |      44.83%       |  0.62%   |       |
|                Quantization (W/A)                | 90.02% |  ***   |   ***   |   0.18   |      92.21%       |  0.86%   |   W   |
|                Quantization (W/A)                | 87.68% |  ***   |   ***   |   0.26   |      88.79%       |  3.20%   |  W,   |
|           Pruning + quantization (W/A)           | 86.13% |  ***   |   ***   |   0.19   |      91.81%       |  4.75%   |  W,   |
| Group convolution + pruning + quantization (W/A) | 86.13% |  ***   |   ***   |   0.19   |      92.91%       |  4.88%   |  W,   |

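*Note: the compression ratio and accuracy loss of the nin_gc row and of the last row are relative to nin; those of the other rows are relative to nin_gc.*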
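
The two percentage columns of the table can be checked with simple arithmetic. The sketch below assumes the size column is in MB and that the sixth and seventh columns are the size reduction and the accuracy drop against a baseline row; the function names are hypothetical. Under that reading, the published figures are consistent with nin as the baseline for the nin_gc row and the last row, and nin_gc as the baseline for the rows in between.

```python
def compression_ratio(size_mb, baseline_mb):
    """Percentage of model size removed relative to the baseline."""
    return (1.0 - size_mb / baseline_mb) * 100.0


def accuracy_loss(acc, baseline_acc):
    """Accuracy drop in percentage points relative to the baseline."""
    return baseline_acc - acc


NIN_SIZE, NIN_ACC = 2.68, 91.01        # values from the nin row
NIN_GC_SIZE, NIN_GC_ACC = 2.32, 90.88  # values from the nin_gc row

# nin_gc row, measured against nin
print(round(compression_ratio(NIN_GC_SIZE, NIN_SIZE), 2),
      round(accuracy_loss(NIN_GC_ACC, NIN_ACC), 2))
# third row (size 1.28 MB, acc 90.26%), measured against nin_gc
print(round(compression_ratio(1.28, NIN_GC_SIZE), 2),
      round(accuracy_loss(90.26, NIN_GC_ACC), 2))
# last row (size 0.19 MB, acc 86.13%), measured against nin
print(round(compression_ratio(0.19, NIN_SIZE), 2),
      round(accuracy_loss(86.13, NIN_ACC), 2))
```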


## References

### Quantization

#### Binary

- [Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1](https://arxiv.org/abs/1602.02830)

- [XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks](https://arxiv.org/abs/1603.05279)

- [AN EMPIRICAL STUDY OF BINARY NEURAL NETWORKS OPTIMISATION](https://openreview.net/forum?id=rJfUCoR5KX)

- [A Review of Binarized Neural Networks](https://www.semanticscholar.org/paper/A-Review-of-Binarized-Neural-Networks-Simons-Lee/0332fdf00d7ff988c5b66c47afd49431eafa6cd1)

#### Ternary

- [Ternary weight networks](https://arxiv.org/abs/1605.04711)

#### Multi-bit (16/8/4/2 bits)

- [DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients](https://arxiv.org/abs/1606.06160)
- [Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference](https://arxiv.org/abs/1712.05877)
- [Quantizing deep convolutional networks for efficient inference: A whitepaper](https://arxiv.org/abs/1806.08342)

### Pruning

- [Learning Efficient Convolutional Networks through Network Slimming](https://arxiv.org/abs/1708.06519)
- [RETHINKING THE VALUE OF NETWORK PRUNING](https://arxiv.org/abs/1810.05270)

### Group convolution structure

- [Convolutional Networks for Fast, Energy-Efficient Neuromorphic Computing](https://arxiv.org/abs/1603.08270)



## 

- 1Nvidia
- 2
- 314bits//2DLMNNNCNNTensorRT

Owner

  • Login: alixunxing
  • Kind: user
