Field-Wise Learning for Multi-Field Categorical Data

Abstract

We propose a new method for learning with multi-field categorical data. Multi-field categorical data are usually collected over many heterogeneous groups, and these groups can be reflected in the categories under a field. Existing methods try to learn a universal model that fits all data, which is challenging and inevitably results in a complex model. In contrast, we propose a field-wise learning method that leverages the natural structure of the data to learn simple yet efficient one-to-one field-focused models with appropriate constraints. In doing so, the models can be fitted to each category and can thus better capture the underlying differences in the data. We present a model that uses linear models with variance and low-rank constraints to help it generalize better and to reduce the number of parameters. The model is also interpretable in a field-wise manner. Because the dimensionality of multi-field categorical data can be very high, models applied to such data are mostly over-parameterized. Our theoretical analysis can potentially explain the effect of over-parameterization on the generalization of our model. It also supports the variance constraints in the learning objective. Experimental results on two large-scale datasets show the superior performance of our model, the trend of the generalization error bound, and the interpretability of the learning outcomes. Our code is available at https://github.com/lzb5600/Field-wise-Learning.
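
To make the idea of per-category linear models with low-rank and variance constraints more concrete, the following is a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that) and rests on several assumptions that the abstract does not state: each field keeps one linear model per category, the per-field weight matrix is factorized as `U @ V.T`, the prediction sums the logits of the models selected by a sample's categories, and the variance constraint is approximated by a penalty on the variance of weights across categories. All function and variable names are hypothetical.

```python
import numpy as np


def make_params(field_sizes, rank, seed=0):
    """Create one low-rank parameter set (U, V, b) per field (assumed layout)."""
    rng = np.random.default_rng(seed)
    d = sum(field_sizes)  # total number of one-hot features across all fields
    params = []
    for n_cat in field_sizes:
        U = 0.01 * rng.standard_normal((d, rank))      # shared factor for the field
        V = 0.01 * rng.standard_normal((n_cat, rank))  # one row per category
        b = np.zeros(n_cat)                            # one bias per category
        params.append((U, V, b))
    return params


def one_hot_concat(sample, field_sizes):
    """Concatenate the one-hot encodings of each field's category index."""
    x = np.zeros(sum(field_sizes))
    offset = 0
    for cat, n_cat in zip(sample, field_sizes):
        x[offset + cat] = 1.0
        offset += n_cat
    return x


def predict_logit(sample, field_sizes, params):
    """Sum the logits of the linear models selected by the sample's categories."""
    x = one_hot_concat(sample, field_sizes)
    logit = 0.0
    for cat, (U, V, b) in zip(sample, params):
        w = U @ V[cat]  # weight vector of the model for this category, shape (d,)
        logit += w @ x + b[cat]
    return logit


def variance_penalty(params):
    """Assumed regularizer: variance of weights and biases across categories."""
    total = 0.0
    for U, V, b in params:
        W = U @ V.T  # shape (d, n_cat): one column per category model
        total += np.sum(np.var(W, axis=1)) + np.var(b)
    return total
```

In this sketch a sample is a list of category indices, one per field, for example `predict_logit([2, 0, 1], [3, 4, 2], make_params([3, 4, 2], rank=2))`. Under these assumptions the factorization stores `(d + n_cat) * rank` weights per field instead of `d * n_cat`, and the penalty shrinks to zero when all category models in a field coincide, which is one way to read the role of the two constraints described above.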

Cite

Text

Li et al. "Field-Wise Learning for Multi-Field Categorical Data." Neural Information Processing Systems, 2020.

Markdown

[Li et al. "Field-Wise Learning for Multi-Field Categorical Data." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/li2020neurips-fieldwise/)

BibTeX

@inproceedings{li2020neurips-fieldwise,
  title     = {{Field-Wise Learning for Multi-Field Categorical Data}},
  author    = {Li, Zhibin and Zhang, Jian and Gong, Yongshun and Yao, Yazhou and Wu, Qiang},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/li2020neurips-fieldwise/}
}