paper-of-quantization

PTQ vs QAT

c1. Post-Training Quantization (PTQ)

ZeroQ, EasyQuant, LAPQ, ACIQ, DFQ

c2. Quantization-Aware Training (QAT)


S&Q

3. Binary, Ternary, 3-4-5-bit, Flexible

c1. Binary

c2. Ternary

c3. 3-4-5-bit

c4. Flexible


2020

ECCV2020

  1. PWLQ: Post-Training Piecewise Linear Quantization for Deep Neural Networks. Samsung

CVPR2020

  1. ZeroQ: A Novel Zero-Shot Quantization Framework. Berkeley, Peking University
  2. AdaBits: Neural Network Quantization with Adaptive Bit-Widths. ByteDance. Quantizes a model to adaptive bit-widths with three schemes: direct adaptation, progressive training, and joint training (see the sketch after this list).
  3. BiDet: An Efficient Binarized Object Detector. CVPR2020
  4. APQ: Joint Search for Network Architecture, Pruning and Quantization Policy. CVPR2020 MIT. A NAS method that combines architecture search, channel pruning, and HAQ; it is indeed more practical to apply, and the motivation is sound.
  5. IR-Net: Forward and Backward Information Retention for Accurate Binary Neural Networks. CVPR2020
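
For readers unfamiliar with adaptive bit-widths (item 2 above), here is a minimal sketch, not the authors' implementation, of a layer whose weights are fake-quantized to a bit-width chosen at call time; the symmetric per-tensor quantizer and all names are illustrative assumptions.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform fake-quantization of a weight tensor to the given bit-width.

    Returns the de-quantized tensor, so the same FP32 graph can be evaluated
    at any supported bit-width.
    """
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for 8-bit, 7 for 4-bit
    scale = np.abs(w).max() / qmax             # per-tensor scale (an assumption)
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale

class SwitchableQuantLinear:
    """A linear layer that can run at any of several candidate bit-widths."""

    def __init__(self, weight, candidate_bits=(8, 6, 4)):
        self.weight = weight
        self.candidate_bits = candidate_bits

    def forward(self, x, bits):
        assert bits in self.candidate_bits
        w_q = quantize_uniform(self.weight, bits)
        return x @ w_q.T

# Usage: the same layer evaluated at 8 and at 4 bits.
rng = np.random.default_rng(0)
layer = SwitchableQuantLinear(rng.standard_normal((16, 32)).astype(np.float32))
x = rng.standard_normal((1, 32)).astype(np.float32)
y8 = layer.forward(x, bits=8)
y4 = layer.forward(x, bits=4)
```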

ICLR2020

  1. LSQ: Learned Step Size Quantization ICLR2020
  2. Mixed Precision DNNs: All you need is a good parametrization. ICLR2020 Sony
  3. SAT: Rethinking neural network quantization (Scale-Adjusted Training). ICLR2020 rejected paper
  4. LLSQ: Learned Symmetric Quantization of Neural Networks for Low-precision Integer Hardware. ICLR2020 ICT

Other

  1. HAWQv2: Hessian Aware trace-Weighted Quantization of Neural Networks
  2. SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of DNNs with Ultra-High Efficiency. IBM Watson Lab
  3. BBG: Balanced Binary Neural Networks with Gated Residual.
  4. EasyQuant: Post-training Quantization via Scale Optimization. Optimizes each layer's quantization scales against the cosine distance between the quantized and full-precision layer outputs (a minimal sketch follows this list); the released code includes speed-up experiments.
  5. LAPQ: Loss Aware Post-training Quantization. code, Intel AIPG. An evolution of ACIQ: the method is more concise, the implementation more direct, and the results better.
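
To make the EasyQuant entry above concrete, here is a minimal sketch of per-layer scale optimization against cosine similarity of layer outputs; the simple grid search and per-tensor symmetric quantizer are simplifying assumptions rather than the paper's exact alternating weight/activation optimization.

```python
import numpy as np

def fake_quant(x, scale, bits=8):
    """Quantize-dequantize with a given scale (symmetric, per-tensor)."""
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def cosine_similarity(a, b):
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def search_weight_scale(weight, activation, bits=8, num_candidates=100):
    """Pick the weight scale that maximizes cosine similarity between the
    full-precision and quantized layer outputs on calibration data."""
    ref_out = activation @ weight.T                       # FP32 reference output
    base = np.abs(weight).max() / (2 ** (bits - 1) - 1)   # max-based starting scale
    best_scale, best_sim = base, -1.0
    for alpha in np.linspace(0.5, 1.2, num_candidates):   # candidate scales around the start
        scale = alpha * base
        out = activation @ fake_quant(weight, scale, bits).T
        sim = cosine_similarity(ref_out, out)
        if sim > best_sim:
            best_scale, best_sim = scale, sim
    return best_scale, best_sim

# Usage with random stand-ins for one layer's weights and calibration activations.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 128)).astype(np.float32)
a = rng.standard_normal((32, 128)).astype(np.float32)
scale, sim = search_weight_scale(w, a)
```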

2019

ICLR2019

  1. ACIQ (preliminary version): Analytical clipping for integer quantization of neural networks. ICLR2019 rejected, Intel AIPG
  2. Per-Tensor Fixed-point quantization of the back-propagation algorithm. ICLR2019
  3. RQ: Relaxed Quantization for discretized NNs. ICLR2019

NIPS2019

  1. ACIQ: Post-training 4-bit quantization of convolution networks for rapid deployment. NIPS2019 Intel AIPG

ICCV2019

  1. DSQ: Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks. ICCV2019 SenseTime, Beihang
  2. DFQ: Data-Free Quantization through Weight Equalization and Bias Correction. Qualcomm

CVPR2019

  1. FQN: Fully Quantized Network for Object Detection. CVPR2019
  2. QIL: Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss CVPR2019

Other

  1. SAWB: Accurate and efficient 2-bit quantized neural networks. SysML2019
  2. SQuantizer: Simultaneous Learning for Both Sparse and Low-precision Neural Networks. 2019 AIPG, Intel
  3. Distributed Low Precision Training Without Mixed Precision. Oxford snowcloud.ai
  4. Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers. ICT, Cambricon. Uses int8 for weights and activations and int16 for most of the gradients, speeding up the training process through quantization.
  5. WAGEUBN: Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers. Essentially a follow-up to WAGE.

2018

ICLR2018

  1. VNQ: Variational network quantization. ICLR2018
  2. WAGE: Training and Inference with Integers in Deep Neural Networks. ICLR2018 oral, Tsinghua. Quantizes not only weights and activations but also errors and gradients (see the quantizer sketch after this list).
  3. Alternating multi-bit quantization for recurrent neural networks. ICLR2018 Alibaba
  4. Mixed Precision Training. FP16 training. ICLR2018 Baidu
  5. Model Compression via distillation and quantization. ICLR2018 Google
  6. Quantized back-propagation: training binarized neural networks with quantized gradients. ICLR2018
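
Several entries above (WAGE, quantized back-propagation) quantize errors and gradients in addition to weights and activations. Below is a minimal sketch of the kind of k-bit uniform quantizer such integer-training schemes build on, assuming inputs already lie in [-1, 1]; each paper adds its own shift and scaling rules on top of this primitive.

```python
import numpy as np

def quantize_k_bits(x, k):
    """Uniform k-bit quantization of a tensor assumed to lie in [-1, 1].

    This follows the common Q(x, k) = clip(sigma * round(x / sigma), -1 + sigma, 1 - sigma)
    form with sigma = 2^(1 - k); individual papers (e.g. WAGE) layer their own
    scaling and stochastic-rounding rules on top of a step like this.
    """
    sigma = 2.0 ** (1 - k)                            # quantization step
    return np.clip(sigma * np.round(x / sigma), -1.0 + sigma, 1.0 - sigma)

# In an integer-training setup, the same primitive is applied to weights,
# activations, back-propagated errors, and gradients, each with its own bit-width.
rng = np.random.default_rng(0)
weights = np.tanh(rng.standard_normal(1000))          # toy values in (-1, 1)
gradients = 0.01 * rng.standard_normal(1000)

w_q = quantize_k_bits(weights, k=2)                   # e.g. 2-bit (ternary) weights
g_q = quantize_k_bits(np.clip(gradients, -1, 1), k=8) # e.g. 8-bit gradients
```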

CVPR2018

  1. Clip-Q: Deep network compression learning by In-Parallel Pruning Quantization. CVPR2018 SFU. Performs quantization and pruning in parallel to reach a better compression result: it first uses Bayesian optimization to search the layer-wise (p, q), then fine-tunes to compress the weights as much as possible; quantization and compression of activations are not covered (a toy prune-and-quantize sketch follows this list).
  2. ELQ: Explicit loss-error-aware quantization for low-bit deep neural networks. CVPR2018 Intel, Tsinghua
  3. Quantization and training of neural networks for efficient integer-arithmetic-only inference. CVPR2018 Google
  4. TSQ: two-step quantization for low-bit neural networks. CVPR2018
  5. SYQ: learning symmetric quantization for efficient deep neural networks. CVPR2018 Xilinx
  6. Towards Effective Low-bitwidth Convolutional Neural Networks. CVPR2018
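
As a toy illustration of the Clip-Q entry above, the sketch below applies a layer-wise prune-then-quantize transform parameterized by a pruning rate p and a bit-width q; the Bayesian-optimization search over (p, q) is not shown, and the symmetric uniform quantizer stands in for whatever codebook the paper actually learns.

```python
import numpy as np

def clip_q_transform(weight, p, q):
    """Prune the smallest-magnitude fraction p of weights, then quantize the
    survivors to q bits (symmetric uniform). Returns the transformed tensor."""
    flat = np.abs(weight).ravel()
    threshold = np.quantile(flat, p)               # magnitude threshold for pruning
    mask = np.abs(weight) > threshold              # surviving weights
    qmax = 2 ** (q - 1) - 1
    scale = (np.abs(weight[mask]).max() / qmax) if mask.any() else 1.0
    quantized = np.clip(np.round(weight / scale), -qmax, qmax) * scale
    return quantized * mask                        # pruned weights stay exactly zero

# Usage: a single layer with 60% of weights pruned and survivors at 4 bits.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
w_pq = clip_q_transform(w, p=0.6, q=4)
```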

ECCV2018

  1. LQ-NETs: learned quantization for highly accurate and compact deep neural networks. ECCV2018 Microsoft
  2. Bi-Real Net: Enhancing the performance of 1-bit CNNs with improved Representational capability and advanced training algorithm. ECCV2018 HKU
  3. V-Quant: Value-aware quantization for training and inference of neural networks. ECCV2018 Facebook

NIPS2018

  1. Heterogeneous Bitwidth Binarization in Convolutional Neural Networks. NIPS2018 Microsoft
  2. HAQ: Hardware-Aware Automated Quantization. NIPS 2018 workshop, MIT
  3. Scalable methods for 8-bit training of neural networks. NIPS2018 Intel

AAAI2018

  1. From Hashing to CNNs: Training Binary Weights via Hashing. AAAI2018 NLPR

Other

  1. Synergy: Algorithm-hardware co-design for convnet accelerators on embedded FPGAs. 2018 UC Berkeley
  2. Efficient Non-uniform quantizer for quantized neural network targeting Re-configurable hardware. 2018
  3. HALP: High-Accuracy Low-Precision Training. 2018 Stanford
  4. PACT: parameterized clipping activation for quantized neural networks. 2018 IBM
  5. QUENN: Quantization engine for low-power neural networks. ACM Computing Frontiers 2018
  6. UNIQ: Uniform noise injection for non-uniform quantization of neural networks. 2018
  7. Training competitive binary neural networks from scratch. 2018
  8. A white-paper: Quantizing deep convolutional networks for efficient inference. 2018 google

2017

  1. Flexpoint: an adaptive numerical format for efficient training of deep neural networks. 2017 Intel
  2. INQ: Incremental network quantization, towards lossless CNNs with low-precision weights. ICLR2017 Intel Labs China
  3. TTQ: Trained ternary quantization. ICLR2017 Stanford
  4. WRPN: wide reduced-precision networks. 2017 Accelerator Architecture Lab, Intel
  5. HWGQ: Deep Learning with Low Precision by Half-wave Gaussian Quantization. CVPR2017
  6. A Survey of Model Compression and Acceleration for Deep Neural Networks. 2017
  7. LP-SGD: Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent. ISCA2017
  8. How to Train a Compact Binary Neural Network with High Accuracy? NLPR, Microsoft

2015-2016

  1. Deep learning with limited numerical precision. 2015 IBM
  2. DoReFa-Net: Training low bit-width convolutional neural networks with low bit-width gradients. 2016
  3. BNN: Binarized Neural Networks. NIPS2016
  4. TWNs: Ternary weight networks. NIPS2016 UCAS
  5. XNOR-Net: ImageNet Classification using binary convolutional neural networks. ECCV2016 Washington
  6. Hardware-oriented approximation of convolutional neural networks. ICLR2016
  7. Quantized convolutional neural networks for mobile devices. CVPR2016 NLPR

Other

  1. Fixed point quantization of deep convolutional networks. 2016
  2. Training a binary weight object detector by knowledge transfer for autonomous driving. 2018
  3. Low-bit Quantization of Neural Networks for Efficient Inference. 2019 Huawei
