LassoNet: Neural Networks with Feature Sparsity

Abstract

Much work has been done recently to make neural networks more interpretable, and one approach is to arrange for the network to use only a subset of the available features. In linear models, Lasso (or $\ell_1$-regularized) regression assigns zero weights to the most irrelevant or redundant features, and is widely used in data science. However, the Lasso applies only to linear models. Here we introduce LassoNet, a neural network framework with global feature selection. Our approach achieves feature sparsity by allowing a feature to participate in a hidden unit only if its linear representative is active. Unlike other approaches to feature selection for neural networks, our method uses a modified objective function with constraints, and so integrates feature selection directly with parameter learning. As a result, it delivers an entire regularization path of solutions with a range of feature sparsity. In experiments with real and simulated data, LassoNet significantly outperforms state-of-the-art methods for feature selection and regression. The LassoNet method uses projected proximal gradient descent, and generalizes directly to deep networks. It can be implemented by adding just a few lines of code to a standard neural network.
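To make the abstract's mechanism concrete, below is a minimal PyTorch sketch (hypothetical code, not the authors' reference implementation) of the core idea: a linear skip connection $\theta^\top x$ is added to a standard feed-forward network, and after each gradient step a projected proximal update soft-thresholds $\theta$ and clamps the first-layer weights so that feature $j$ can enter the hidden layer only when its linear representative $\theta_j$ is nonzero. The class and method names are illustrative, and the proximal step shown is a simplified stand-in for the paper's exact hierarchical proximal operator.

import torch
import torch.nn as nn
import torch.nn.functional as F


class LassoNetSketch(nn.Module):
    """One-hidden-layer network with a linear skip connection (sketch only)."""

    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.skip = nn.Linear(in_dim, out_dim, bias=False)  # linear representatives theta
        self.fc1 = nn.Linear(in_dim, hidden_dim)             # first layer, constrained by theta
        self.fc2 = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        # Residual structure: linear skip term plus nonlinear network.
        return self.skip(x) + self.fc2(F.relu(self.fc1(x)))

    @torch.no_grad()
    def prox(self, lam, lr, M=10.0):
        """Simplified projected proximal step applied after each gradient update."""
        theta = self.skip.weight   # shape (out_dim, in_dim)
        W1 = self.fc1.weight       # shape (hidden_dim, in_dim)
        # Soft-threshold the skip weights (the l1 / Lasso penalty on theta).
        theta.copy_(torch.sign(theta) * torch.clamp(theta.abs() - lr * lam, min=0.0))
        # Hierarchy constraint: bound each feature's first-layer weights by M * |theta_j|,
        # so a feature participates in hidden units only if its linear term is active.
        bound = M * theta.abs().amax(dim=0)                  # per-feature bound, shape (in_dim,)
        W1.copy_(torch.clamp(W1, min=-bound, max=bound))


# Hypothetical usage: one training step at penalty level lam.
model = LassoNetSketch(in_dim=20, hidden_dim=32, out_dim=1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(64, 20), torch.randn(64, 1)
loss = F.mse_loss(model(x), y)
opt.zero_grad()
loss.backward()
opt.step()
model.prox(lam=1e-2, lr=1e-2)

Sweeping lam from small to large values would trace out the regularization path mentioned in the abstract, with more features dropping out as the penalty grows.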

Cite

Text

Lemhadri et al. "LassoNet: Neural Networks with Feature Sparsity." Artificial Intelligence and Statistics, 2021.

Markdown

[Lemhadri et al. "LassoNet: Neural Networks with Feature Sparsity." Artificial Intelligence and Statistics, 2021.](https://mlanthology.org/aistats/2021/lemhadri2021aistats-lassonet/)

BibTeX

@inproceedings{lemhadri2021aistats-lassonet,
  title     = {{LassoNet: Neural Networks with Feature Sparsity}},
  author    = {Lemhadri, Ismael and Ruan, Feng and Tibshirani, Rob},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2021},
  pages     = {10-18},
  volume    = {130},
  url       = {https://mlanthology.org/aistats/2021/lemhadri2021aistats-lassonet/}
}