Towards Automatic Feature Construction for Supervised Classification

Abstract

We suggest an approach to automate variable construction for supervised learning, especially in the multi-relational setting. Domain knowledge is specified by describing the structure of the data by means of variables, tables and links across tables, and by choosing construction rules. The space of variables that can be constructed is virtually infinite, which raises both combinatorial and over-fitting problems. We introduce a prior distribution over all the constructed variables, as well as an effective algorithm to draw samples of constructed variables from this distribution. Experiments show that the approach is robust and efficient.

Cite

Text

Boullé. "Towards Automatic Feature Construction for Supervised Classification." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2014. doi:10.1007/978-3-662-44848-9_12

Markdown

[Boullé. "Towards Automatic Feature Construction for Supervised Classification." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2014.](https://mlanthology.org/ecmlpkdd/2014/boulle2014ecmlpkdd-automatic/) doi:10.1007/978-3-662-44848-9_12

BibTeX

@inproceedings{boulle2014ecmlpkdd-automatic,
  title     = {{Towards Automatic Feature Construction for Supervised Classification}},
  author    = {Boullé, Marc},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2014},
  pages     = {181--196},
  doi       = {10.1007/978-3-662-44848-9_12},
  url       = {https://mlanthology.org/ecmlpkdd/2014/boulle2014ecmlpkdd-automatic/}
}