∇QDARTS: Quantization as an Elastic Dimension to Differentiable NAS
Abstract
Differentiable Neural Architecture Search (NAS) methods efficiently find high-accuracy architectures using gradient-based optimization in a continuous domain, saving computational resources. Mixed-precision search optimizes precision within a fixed architecture. However, applying it to a NAS-generated network does not guarantee optimal performance, because the best quantized architecture may not be one that a standalone NAS method would produce. This paper therefore introduces ∇QDARTS, a novel approach that combines differentiable NAS with mixed-precision search for both weights and activations. ∇QDARTS aims to identify the mixed-precision neural architecture that achieves high accuracy at minimal computational cost in a single-shot, end-to-end differentiable framework, obviating the need for pretraining and proxy methods. On CIFAR10 with (2,4)-bit precision, ∇QDARTS reduces bit operations by 160× relative to fp32 with a 1.57% accuracy drop. Increasing the capacity enables ∇QDARTS to match fp32 accuracy while reducing bit operations by 18×. On ImageNet, with just (2,4)-bit precision, ∇QDARTS outperforms state-of-the-art methods such as APQ, SPOS, OQA, and MNAS by 2.3%, 2.9%, 0.3%, and 2.7% in accuracy, respectively. With (2,4,8)-bit precision, ∇QDARTS narrows the accuracy drop to 1% relative to fp32 while reducing bit operations by 17× and memory footprint by 2.6×. At similar accuracy, ∇QDARTS improves on APQ, SPOS, OQA, and MNAS in bit operations (memory footprint) by 2.3× (12×), 2.4× (3×), 13% (6.2×), and 3.4× (37%), respectively. ∇QDARTS also improves overall search and training efficiency by 3.1× and 1.54× over APQ and OQA, respectively.
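The abstract reports reductions in bit operations but does not define the metric. The sketch below assumes a common convention, in which a layer's bit operations equal its multiply-accumulate count times the weight bit width times the activation bit width. The function names, layer sizes, and per-layer bit assignments are hypothetical, chosen only to show how mixed precision drives the ratio; they do not reproduce the paper's networks or its reported numbers.

```python
# Assumed convention (not taken from the abstract):
#   BitOps(layer) = MACs * weight_bits * activation_bits

def bit_ops(macs: int, w_bits: int, a_bits: int) -> int:
    """Bit operations for one layer under the assumed convention."""
    return macs * w_bits * a_bits


def total_bit_ops(layers) -> int:
    """Sum bit operations over (macs, w_bits, a_bits) tuples."""
    return sum(bit_ops(m, w, a) for m, w, a in layers)


# Hypothetical three-layer network: MAC counts per layer.
macs = [1_000_000, 2_000_000, 1_000_000]

# Baseline: every layer at 32-bit weights and activations.
fp32 = total_bit_ops([(m, 32, 32) for m in macs])

# Hypothetical mixed-precision assignment drawn from {2, 4} bits.
mixed = total_bit_ops([
    (macs[0], 4, 4),  # keep the first layer at higher precision
    (macs[1], 2, 2),  # push the heaviest layer to the lowest precision
    (macs[2], 2, 4),  # 2-bit weights, 4-bit activations
])

reduction = fp32 / mixed
print(f"fp32: {fp32}, mixed: {mixed}, reduction: {reduction:.0f}x")
```

Under this convention, a uniform 4-bit network would give a 64× reduction over fp32 and a uniform 2-bit network 256×, so a search over {2, 4} bits lands between those bounds depending on which layers receive which precision.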
Cite
Text
Behnam et al. "∇QDARTS: Quantization as an Elastic Dimension to Differentiable NAS." Transactions on Machine Learning Research, 2025.
Markdown
[Behnam et al. "∇QDARTS: Quantization as an Elastic Dimension to Differentiable NAS." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/behnam2025tmlr-qdarts/)
BibTeX
@article{behnam2025tmlr-qdarts,
title = {{∇QDARTS: Quantization as an Elastic Dimension to Differentiable NAS}},
author = {Behnam, Payman and Kamal, Uday and Ganesh, Sanjana Vijay and Li, Zhaoyi and Jurado, Michael Andrew and Khare, Alind and Fedorov, Igor and Liu, Gaowen and Tumanov, Alexey},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/behnam2025tmlr-qdarts/}
}