Compression-Aware Computing for Scalable and Sustainable AI

Abstract

This talk explores the challenge of customizing large-scale AI models, particularly generative AI models, on cost-effective devices with limited memory and energy resources. Modern AI models demand substantial computational power, often relying on specialized hardware such as GPUs. To address this, the talk introduces compression-aware computing, a framework that enables AI models to recognize and adapt to their compressed states while preserving performance. Compression-aware computing integrates compression techniques such as sparsification, quantization, and low-rank decomposition to enhance the efficiency and accuracy of AI models, broadening their accessibility across diverse devices. Additionally, the talk highlights one application of scalable and sustainable AI in advancing Alzheimer's research: facilitating the analysis of large single-cell transcriptomics datasets for gene–gene interaction discovery.
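The three compression techniques named in the abstract can be illustrated on a toy weight matrix. The sketch below is not from the talk; it is a minimal, generic NumPy illustration of magnitude-based sparsification, symmetric 8-bit quantization, and rank-1 SVD truncation, with all array names chosen here for illustration.

```python
import numpy as np

# A small random matrix standing in for one layer's weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)).astype(np.float32)

# Sparsification: magnitude pruning — zero out the smallest 50% of weights.
threshold = np.quantile(np.abs(W), 0.5)
W_sparse = np.where(np.abs(W) >= threshold, W, 0.0)

# Quantization: symmetric 8-bit — map floats to int8 with a single scale factor.
scale = np.abs(W).max() / 127.0
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_dq = W_q.astype(np.float32) * scale  # dequantized approximation of W

# Low-rank decomposition: keep only the top singular component via SVD.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
W_lowrank = (U[:, :1] * S[:1]) @ Vt[:1, :]
```

Each step trades a bounded amount of accuracy for storage and compute savings: pruning removes half the entries, int8 storage quarters the memory of float32, and the rank-1 factorization replaces a 4×4 matrix with two length-4 vectors.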

Cite

Text

Xu. "Compression-Aware Computing for Scalable and Sustainable AI." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I27.35126

Markdown

[Xu. "Compression-Aware Computing for Scalable and Sustainable AI." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/xu2025aaai-compression/) doi:10.1609/AAAI.V39I27.35126

BibTeX

@inproceedings{xu2025aaai-compression,
  title     = {{Compression-Aware Computing for Scalable and Sustainable AI}},
  author    = {Xu, Zhaozhuo},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {28733},
  doi       = {10.1609/AAAI.V39I27.35126},
  url       = {https://mlanthology.org/aaai/2025/xu2025aaai-compression/}
}