PAC-Bayes Bounds for Multivariate Linear Regression and Linear Autoencoders
Abstract
Linear Autoencoders (LAEs) have shown strong performance in state-of-the-art recommender systems. However, this success remains largely empirical, with limited theoretical understanding. In this paper, we investigate the generalizability -- a theoretical measure of model performance in statistical learning -- of multivariate linear regression and LAEs. We first propose a PAC-Bayes bound for multivariate linear regression, extending the earlier bound for single-output linear regression by Shalaeva et al., and establish sufficient conditions for its convergence. We then show that LAEs, when evaluated under a relaxed mean squared error, can be interpreted as constrained multivariate linear regression models on bounded data, to which our bound adapts. Furthermore, we develop theoretical methods to improve the computational efficiency of optimizing the LAE bound, enabling its practical evaluation on large models and real-world datasets. Experimental results demonstrate that our bound is tight and correlates well with practical ranking metrics such as Recall@K and NDCG@K.
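The abstract's view of an LAE as a constrained multivariate linear regression can be made concrete with a small sketch. Assuming an EASE-style zero-diagonal constraint on the item-to-item weight matrix (a common LAE formulation in recommender systems; the constraint name and closed form are illustrative here, not taken from this paper), the regression admits a closed-form ridge solution:

```python
import numpy as np

def fit_lae(X, lam=1.0):
    """Fit a linear autoencoder X ~= X @ B with a zero-diagonal
    constraint on B, illustrating the interpretation of an LAE as a
    constrained multivariate (multi-output) linear regression.
    X: (users x items) interaction matrix, lam: ridge strength."""
    n_items = X.shape[1]
    # Ridge-regularized Gram matrix of the item columns.
    G = X.T @ X + lam * np.eye(n_items)
    P = np.linalg.inv(G)
    # Closed-form solution with diag(B) = 0: B = I - P * diag(1/diag(P)).
    B = np.eye(n_items) - P / np.diag(P)
    np.fill_diagonal(B, 0.0)  # enforce the constraint exactly
    return B
```

The zero diagonal prevents the trivial identity solution B = I, so the model must reconstruct each item from the other items, which is what makes the regression "constrained" rather than unconstrained least squares.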
Cite
Text
Guo et al. "PAC-Bayes Bounds for Multivariate Linear Regression and Linear Autoencoders." Advances in Neural Information Processing Systems, 2025.
BibTeX
@inproceedings{guo2025neurips-pacbayes,
title = {{PAC-Bayes Bounds for Multivariate Linear Regression and Linear Autoencoders}},
author = {Guo, Ruixin and Jin, Ruoming and Li, Xinyu and Zhou, Yang},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/guo2025neurips-pacbayes/}
}