Training Block-Wise Sparse Models Using Kronecker Product Decomposition
Abstract
Large-scale machine learning (ML) models are increasingly used in critical domains such as education, lending, recruitment, healthcare, and criminal justice. However, training, deploying, and using these models demands substantial computational resources. To reduce computation and memory costs, models with sparse weight matrices are widely used in the literature. Among sparse models, those with special sparsity structures (e.g., block-wise sparse weight matrices) generally map better onto hardware accelerators and can reduce memory and computation costs during inference. Unfortunately, while weight matrices with special sparsity patterns make models efficient at inference time, there is no efficient method for training them: existing training methods for block-wise sparse models start from full, dense models, leading to an inefficient training process. In this work, we focus on training models with block-wise sparse weight matrices and propose an efficient training algorithm that reduces both computation and memory costs during training. Our extensive empirical and theoretical analyses show that the proposed algorithms significantly reduce computation and memory costs without a performance drop compared to baselines.
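The connection between Kronecker products and block-wise sparsity can be illustrated with a minimal sketch (this is an illustration of the underlying algebra, not the paper's training algorithm): if a weight matrix is factored as W = A ⊗ B, then W consists of blocks A[i, j] · B, so any zero entry in the small factor A zeroes out an entire block of W.

```python
import numpy as np

# Illustrative sketch (not the paper's method): a Kronecker product
# W = kron(A, B) is built from blocks A[i, j] * B. A zero entry in the
# small factor A therefore zeroes an entire block of W, so sparsifying A
# yields a block-wise sparse weight matrix while storing only the factors.
rng = np.random.default_rng(0)

A = rng.standard_normal((3, 3))
A[0, 1] = A[2, 0] = 0.0          # zero out two entries of the small factor
B = rng.standard_normal((4, 4))  # each block of W is a scaled copy of B

W = np.kron(A, B)                # W has shape (3*4, 3*4) = (12, 12)

# The (0, 1) block of W (rows 0..3, cols 4..7) equals A[0, 1] * B, i.e. zero.
block_01 = W[0:4, 4:8]
print(np.allclose(block_01, 0))  # True

# Storage: the factors cost 9 + 16 = 25 numbers vs. 144 for dense W.
print(A.size + B.size, W.size)   # 25 144
```

This also hints at the training-cost argument: gradients can be taken with respect to the small factors A and B rather than the full matrix W, so both memory and computation scale with the factor sizes.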
Cite
Text

Zhu et al. "Training Block-Wise Sparse Models Using Kronecker Product Decomposition." NeurIPS 2024 Workshops: Compression, 2024.

Markdown

[Zhu et al. "Training Block-Wise Sparse Models Using Kronecker Product Decomposition." NeurIPS 2024 Workshops: Compression, 2024.](https://mlanthology.org/neuripsw/2024/zhu2024neuripsw-training/)

BibTeX
@inproceedings{zhu2024neuripsw-training,
title = {{Training Block-Wise Sparse Models Using Kronecker Product Decomposition}},
author = {Zhu, Ding and Zuo, Zhiqun and Khalili, Mohammad Mahdi},
booktitle = {NeurIPS 2024 Workshops: Compression},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/zhu2024neuripsw-training/}
}