Dataset Evolver: An Interactive Feature Engineering Notebook
Abstract
We present DATASET EVOLVER, an interactive Jupyter notebook-based tool that helps data scientists perform feature engineering for classification tasks. It suggests new features to construct, based on automated feature engineering algorithms. Users can navigate the suggestions in different ways, validate their impact, and selectively accept them. DATASET EVOLVER is a pluggable feature engineering framework to which multiple exploration strategies can be added; it currently includes meta-learning based exploration and reinforcement learning based exploration. The suggested features are constructed using well-defined mathematical functions and are easily interpretable. The result is a mixed-initiative system in which an automated agent assists the user in solving the complex problem of feature engineering efficiently and effectively. It reduces the effort of a data scientist from hours to minutes.
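The suggest, validate, and accept loop described in the abstract can be illustrated with a minimal sketch. This is not the tool's implementation: exhaustive enumeration stands in for the meta-learning and reinforcement learning exploration strategies, a single-threshold rule stands in for a real classifier, and every name here (`UNARY`, `BINARY`, `suggest_features`, `stump_accuracy`, `rank_suggestions`) is hypothetical.

```python
import math
from itertools import combinations

# Hypothetical transformation library: simple, interpretable math functions.
UNARY = {
    "log1p_abs": lambda x: math.log1p(abs(x)),
    "square": lambda x: x * x,
}
BINARY = {
    "product": lambda a, b: a * b,
    "difference": lambda a, b: a - b,
}


def suggest_features(columns):
    """Enumerate candidate features from every column and every column pair."""
    candidates = {}
    for name, values in columns.items():
        for fn_name, fn in UNARY.items():
            candidates[f"{fn_name}({name})"] = [fn(v) for v in values]
    for a, b in combinations(columns, 2):
        for fn_name, fn in BINARY.items():
            candidates[f"{fn_name}({a}, {b})"] = [
                fn(x, y) for x, y in zip(columns[a], columns[b])
            ]
    return candidates


def stump_accuracy(values, labels):
    """Best accuracy of a one-threshold rule on a single feature (labels 0/1)."""
    n = len(labels)
    best = 0.0
    for t in sorted(set(values)):
        correct = sum((v >= t) == bool(y) for v, y in zip(values, labels))
        # The rule may predict either class above the threshold.
        best = max(best, correct / n, 1 - correct / n)
    return best


def rank_suggestions(columns, labels):
    """Score each candidate by its gain over the best original feature."""
    baseline = max(stump_accuracy(v, labels) for v in columns.values())
    scored = [
        (stump_accuracy(v, labels) - baseline, name)
        for name, v in suggest_features(columns).items()
    ]
    return baseline, sorted(scored, reverse=True)


# Toy data: the class depends on the sign of x1 * x2, so neither raw column
# separates the classes on its own.
columns = {
    "x1": [1, -1, 1, -1, 2, -2, 2, -2],
    "x2": [1, -1, -1, 1, 2, -2, -2, 2],
}
labels = [1, 1, 0, 0, 1, 1, 0, 0]

baseline, ranked = rank_suggestions(columns, labels)

# The "user" selectively accepts suggestions; here, only the top-ranked one.
gain, accepted_name = ranked[0]
columns[accepted_name] = suggest_features(columns)[accepted_name]
```

In this toy setting the ranking surfaces `product(x1, x2)` as the most useful suggestion, and its name makes the construction readable to the user. In an interactive notebook the final acceptance step would be a user decision rather than an automatic pick.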
Cite
Text
Nargesian et al. "Dataset Evolver: An Interactive Feature Engineering Notebook." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.11369
Markdown
[Nargesian et al. "Dataset Evolver: An Interactive Feature Engineering Notebook." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/nargesian2018aaai-dataset/) doi:10.1609/AAAI.V32I1.11369
BibTeX
@inproceedings{nargesian2018aaai-dataset,
title = {{Dataset Evolver: An Interactive Feature Engineering Notebook}},
author = {Nargesian, Fatemeh and Khurana, Udayan and Pedapati, Tejaswini and Samulowitz, Horst and Turaga, Deepak S.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2018},
pages = {8212--8213},
doi = {10.1609/AAAI.V32I1.11369},
url = {https://mlanthology.org/aaai/2018/nargesian2018aaai-dataset/}
}