Leveraging Common Structure to Improve Prediction Across Related Datasets
Abstract
In many applications, training data comes as a collection of related datasets obtained from several sources, each of which typically affects the sample distribution. Classification models learned from such data are expected to perform well on similar data from new sources, but they often suffer from bias introduced by what we call 'spurious' samples -- samples that reflect source-specific characteristics and are not representative of any other part of the data. Because standard outlier detection and robust classification usually fall short of identifying groups of spurious samples, we propose a procedure that identifies the common structure across datasets by minimizing a multi-dataset divergence metric, increasing accuracy on new datasets.
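The abstract does not specify the divergence metric or the optimization used in the paper; as a rough, hypothetical sketch of the idea (the linear-kernel MMD choice and all function names below are ours, not the authors'), one could score each sample by how much its removal reduces a pairwise divergence summed over all datasets, and flag the highest-scoring samples as candidate spurious points:

```python
import numpy as np

def mmd2(X, Y):
    # Squared maximum mean discrepancy with a linear kernel:
    # the distance between the two datasets' mean feature vectors.
    d = X.mean(axis=0) - Y.mean(axis=0)
    return float(d @ d)

def total_divergence(datasets):
    # Sum of pairwise divergences over all dataset pairs.
    return sum(mmd2(datasets[i], datasets[j])
               for i in range(len(datasets))
               for j in range(i + 1, len(datasets)))

def spurious_scores(datasets):
    # Score each sample by how much its removal reduces the
    # multi-dataset divergence; high scores suggest spurious samples.
    base = total_divergence(datasets)
    scores = []
    for di, X in enumerate(datasets):
        for si in range(len(X)):
            trial = datasets[:di] + [np.delete(X, si, axis=0)] + datasets[di + 1:]
            scores.append((base - total_divergence(trial), di, si))
    return sorted(scores, reverse=True)

# Two sources drawn from the same region, one contaminated with a spurious point.
D1 = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]])
D2 = np.array([[0.0, 0.1], [0.1, 0.0], [5.0, 5.0]])  # last row is spurious
gain, d_idx, s_idx = spurious_scores([D1, D2])[0]
print(d_idx, s_idx)  # top-ranked sample: row 2 of dataset 1
```

This toy version is brute-force (it re-evaluates the divergence once per sample) and is meant only to illustrate the objective, not the authors' algorithm.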
Cite
Text
Barnes et al. "Leveraging Common Structure to Improve Prediction Across Related Datasets." AAAI Conference on Artificial Intelligence, 2015. doi:10.1609/AAAI.V29I1.9746
Markdown
[Barnes et al. "Leveraging Common Structure to Improve Prediction Across Related Datasets." AAAI Conference on Artificial Intelligence, 2015.](https://mlanthology.org/aaai/2015/barnes2015aaai-leveraging/) doi:10.1609/AAAI.V29I1.9746
BibTeX
@inproceedings{barnes2015aaai-leveraging,
title = {{Leveraging Common Structure to Improve Prediction Across Related Datasets}},
author = {Barnes, Matt and Gisolfi, Nick and Fiterau, Madalina and Dubrawski, Artur},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2015},
pages = {4144-4145},
doi = {10.1609/AAAI.V29I1.9746},
url = {https://mlanthology.org/aaai/2015/barnes2015aaai-leveraging/}
}