
NIPQ: Noise proxy-based Integrated Pseudo-Quantization

2024-07-06


Keywords: #noise #activationdistribution


1. Introduction

  • STE: The quantization operator is not differentiable, so the STE backpropagates through it by approximating the rounding function with the identity (linear) function.
    • STE-based QAT schemes can quantize networks to 4-bit w/o accuracy loss.
    • However, STE-based QAT propagates an approximated gradient, not the true gradient. → Incurs instability and bias during training.
  • Pseudo-quantization training (PQT): Based on pseudo-quantization noise (PQN)
    • The behavior of the quantization operator is simulated via PQN.
    • Learnable parameters are updated based on this noise proxy of quantization.
  • Truncation contributes significantly to reducing quantization errors.
  • Proposal
    1. NIPQ is the first PQT scheme that integrates truncation. → Not only reduces weight quantization error, but also enables PQT for activation quantization.
    2. NIPQ optimizes network into mixed-precision with awareness of the given resource constraint w/o human intervention.
    3. Theoretical analysis showing that NIPQ updates parameters toward minimizing the quantization error.
    4. Experiments w/ NIPQ
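As a rough sketch of the STE idea above (a minimal illustration, not code from the paper; all names are ours), uniform fake quantization pairs a rounding forward pass with an identity backward pass:

```python
import numpy as np

def fake_quantize(x, step, n_levels):
    """Uniform quantize-dequantize: round to the nearest level, then rescale."""
    q = np.clip(np.round(x / step), 0, n_levels - 1)
    return q * step

def ste_backward(grad_output):
    # STE: the forward pass uses the quantized value, but the backward pass
    # pretends d(round)/dx = 1, so the incoming gradient passes through unchanged.
    return grad_output

x = np.array([0.12, 0.48, 0.91])
print(fake_quantize(x, step=0.25, n_levels=4))
```

The identity backward pass is exactly the "linear approximation" of the rounding function that makes training possible, and also the source of the gradient mismatch criticized above.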

4. Motivation

4.1 Limitation of STE-based Quantization

  • Biggest problem of STE: Parameters never converge to their target values; instead, they oscillate near the rounding boundary between two adjacent quantization levels.
  • For more info: Overcoming Oscillations in Quantization-Aware Training
  • Oscillation near the rounding boundary becomes the major source of large quantization errors.
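A toy demonstration of this oscillation (ours, not from the paper): a scalar weight trained with STE toward a target that lies between two quantization levels never settles, because the STE gradient flips sign each time the latent weight crosses the rounding boundary.

```python
def ste_step(w, target=0.5, lr=0.1):
    """One SGD step on loss (q(w) - target)^2 with an STE backward pass."""
    q = round(w)               # forward: hard rounding to levels 0 and 1
    grad = 2.0 * (q - target)  # backward: STE passes the gradient straight to w
    return w - lr * grad

w = 0.45
trajectory = []
for _ in range(6):
    w = ste_step(w)
    trajectory.append(round(w, 2))
print(trajectory)  # → [0.55, 0.45, 0.55, 0.45, 0.55, 0.45]
```

The latent weight bounces across the boundary at 0.5 forever: whenever w rounds to 0 the gradient pushes it up, and whenever it rounds to 1 the gradient pushes it back down.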

4.2 Pros and Cons of Previous PQN-based PQT

  • Pro
    • PQN-based PQT is expected to yield lower quantization error after training. (How? Not very clear in the paper.)
  • Con
    • Existing studies do not provide a theoretical integration of truncation on top of a PQN-based PQT framework. (Again, how? Not very clear.)
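A minimal sketch of what integrating truncation into PQN-based pseudo-quantization can look like (our reading, with invented names; the paper's exact formulation may differ): the truncation at a learnable level alpha is applied exactly, and only the rounding step is replaced by additive uniform noise of the matching step size, so the gradient w.r.t. alpha stays well-defined.

```python
import numpy as np

rng = np.random.default_rng(0)

def pseudo_quantize(x, alpha, n_bits):
    """Truncate exactly at alpha, then model rounding error with uniform noise.

    The step size is the one an n_bits uniform quantizer with range [0, alpha]
    would use; noise ~ U(-step/2, step/2) is the proxy for the rounding error.
    """
    step = alpha / (2 ** n_bits - 1)
    x_trunc = np.clip(x, 0.0, alpha)          # truncation applied exactly
    noise = rng.uniform(-0.5, 0.5, x.shape)   # pseudo-quantization noise
    return x_trunc + step * noise
```

Because only the rounding is replaced by noise while clipping stays exact, the output never deviates from the truncated input by more than half a quantization step, and alpha receives a true (non-approximated) gradient through the clip.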

5. NIPQ: Noise Proxy-based Integrated Pseudo-Quantization