RAVN: Reinforcement Aided Adaptive Vector Quantization of Deep Neural Networks

Abstract

In the expanding field of deep learning, deploying deep neural networks (DNNs) in resource-constrained environments is challenging because of model complexity. Existing methods reduce this complexity by quantizing the DNNs, and adaptive quantization (AQ) is one such technique. Current AQ techniques, however, suffer from limited adaptability to different datasets and models, suboptimal codebook generation, high computational complexity, and limited generalization to unseen scenarios. To address these issues, we propose an AQ methodology that combines vector quantization (VQ) of weights and Quantization-Aware Training (QAT) with reinforcement learning (RL). This approach dynamically allocates the quantization parameters of DNN models, thereby reducing complexity and power utilization and easing deployment on edge devices. We evaluated the proposed approach on three publicly available benchmark datasets, namely CIFAR-10, CIFAR-100 and ImageNet, using state-of-the-art floating-point DNN architectures, and showed a boost of up to 4% in their respective quantized counterparts. The source code of the proposed approach will be made available upon acceptance of the work.
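To make the idea of vector quantization of weights concrete, here is a minimal sketch of generic codebook-based VQ using plain k-means in NumPy. It is not the RAVN method: the abstract does not specify the codebook construction, and the RL-driven parameter allocation and QAT steps are omitted entirely. The function names `vq_quantize` and `vq_dequantize`, and the parameters `block_size`, `codebook_size`, `iters` and `seed`, are hypothetical choices made for illustration only.

```python
import numpy as np

def vq_quantize(weights, block_size=4, codebook_size=16, iters=10, seed=0):
    """Illustrative k-means VQ of a weight tensor (not the paper's algorithm).

    Assumes weights.size is divisible by block_size and that there are
    at least codebook_size sub-vectors.
    """
    rng = np.random.default_rng(seed)
    # Split the weights into sub-vectors of length block_size.
    flat = weights.reshape(-1, block_size)
    # Initialise the codebook from randomly chosen sub-vectors.
    init = rng.choice(len(flat), size=codebook_size, replace=False)
    codebook = flat[init].copy()
    for _ in range(iters):
        # Squared distance from every sub-vector to every codeword.
        dist = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        # Move each codeword to the mean of its assigned sub-vectors.
        for k in range(codebook_size):
            members = flat[assign == k]
            if len(members) > 0:
                codebook[k] = members.mean(axis=0)
    # Final assignment against the updated codebook.
    dist = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    assign = dist.argmin(axis=1)
    return codebook, assign

def vq_dequantize(codebook, assign, shape):
    """Rebuild an approximate weight tensor from codeword indices."""
    return codebook[assign].reshape(shape)
```

In this sketch each group of `block_size` weights is stored as a single index into a shared codebook, so `codebook_size` and `block_size` together set the compression rate. Under the abstract's description, parameters of this kind are what an adaptive scheme would allocate dynamically rather than fix by hand.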

Cite

Text

Jha et al. "RAVN: Reinforcement Aided Adaptive Vector Quantization of Deep Neural Networks." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00225

Markdown

[Jha et al. "RAVN: Reinforcement Aided Adaptive Vector Quantization of Deep Neural Networks." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/jha2024cvprw-ravn/) doi:10.1109/CVPRW63382.2024.00225

BibTeX

@inproceedings{jha2024cvprw-ravn,
  title     = {{RAVN: Reinforcement Aided Adaptive Vector Quantization of Deep Neural Networks}},
  author    = {Jha, Anamika and Chattopadhyay, Aratrik and Banerji, Mrinal and Jain, Disha},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2024},
  pages     = {2200-2209},
  doi       = {10.1109/CVPRW63382.2024.00225},
  url       = {https://mlanthology.org/cvprw/2024/jha2024cvprw-ravn/}
}