POCA: Post-Training Quantization with Temporal Alignment for Codec Avatars

Abstract

Real-time decoding generates the high-quality assets needed to render photorealistic Codec Avatars for immersive social telepresence on AR/VR devices. However, high-quality avatar decoding incurs heavy computation and memory consumption, which necessitates decoder compression (e.g., quantization). Although quantization has been widely studied, quantization of avatar decoders remains an urgent yet under-explored need. Furthermore, the requirement of fast “User-Avatar” deployment favors post-training quantization (PTQ) over time-consuming quantization-aware training (QAT). As the first work in this area, we reveal the sensitivity of avatar decoding quality to low precision: state-of-the-art (SoTA) QAT and PTQ algorithms introduce massive temporal noise into the rendered avatars, even at the well-established 8-bit precision. To resolve these issues, a novel PTQ algorithm is proposed for quantizing the avatar decoder with low-precision weights and activations (8-bit and 6-bit) without introducing temporal noise to the rendered avatar. Furthermore, the proposed method needs only 10% of the activations of each layer to calibrate the quantization parameters, without any distribution manipulation or extensive boundary search. Evaluated on various face avatars with different facial characteristics, the proposed method compresses the decoder model by 5.3× while recovering quality on par with the full-precision baseline. In addition to avatar rendering, POCA is also applicable to image resolution enhancement, achieving new SoTA image quality. Project page: https://mengjian0502.github.io/poca.github.io/
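As context for the calibration step described above (only 10% of each layer's activations, with no distribution manipulation or boundary search), the sketch below shows what a generic min/max PTQ calibration over a 10% activation subsample might look like. This is not the POCA algorithm itself; the function names, the min/max range estimate, and the random subsampling are illustrative assumptions.

```python
# Illustrative PTQ calibration sketch (hypothetical names, not the POCA method):
# estimate per-layer affine quantization parameters from a 10% activation sample.
import numpy as np

def calibrate_scale(activations: np.ndarray, n_bits: int = 8, sample_frac: float = 0.10):
    """Compute scale/zero-point from a random fraction of one layer's activations."""
    flat = activations.reshape(-1)
    n = max(1, int(sample_frac * flat.size))
    sample = np.random.default_rng(0).choice(flat, size=n, replace=False)
    lo, hi = sample.min(), sample.max()          # plain min/max range, no boundary search
    qmax = 2 ** n_bits - 1
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = np.round(-lo / scale)
    return scale, zero_point

def fake_quantize(x: np.ndarray, scale: float, zero_point: float, n_bits: int = 8):
    """Round to the integer grid, clip, then map back to real values."""
    qmax = 2 ** n_bits - 1
    q = np.clip(np.round(x / scale + zero_point), 0, qmax)
    return (q - zero_point) * scale

# Usage: calibrate once on a layer's activations, then quantize at inference time.
acts = np.random.randn(4, 256, 32, 32).astype(np.float32)   # stand-in decoder activations
scale, zp = calibrate_scale(acts, n_bits=8, sample_frac=0.10)
acts_q = fake_quantize(acts, scale, zp, n_bits=8)
print("max abs quantization error:", np.abs(acts - acts_q).max())
```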

Cite

Text

Meng et al. "POCA: Post-Training Quantization with Temporal Alignment for Codec Avatars." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73661-2_13

Markdown

[Meng et al. "POCA: Post-Training Quantization with Temporal Alignment for Codec Avatars." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/meng2024eccv-poca/) doi:10.1007/978-3-031-73661-2_13

BibTeX

@inproceedings{meng2024eccv-poca,
  title     = {{POCA: Post-Training Quantization with Temporal Alignment for Codec Avatars}},
  author    = {Meng, Jian and Li, Yuecheng and Li, Leo and Sarwar, Syed Shakib and Wang, Dilin and Seo, Jae-sun},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73661-2_13},
  url       = {https://mlanthology.org/eccv/2024/meng2024eccv-poca/}
}