Active Learning for Sampling in Time-Series Experiments with Application to Gene Expression Analysis

Abstract

Many time-series experiments seek to estimate some signal as a continuous function of time. In this paper, we address the sampling problem for such experiments: determining which time-points should be sampled in order to minimize the cost of data collection. We restrict our attention to a growing class of experiments that measure multiple signals at each time-point and in which raw materials or observations are archived first and selectively analyzed later, with the analysis being the more expensive step. We present an active learning algorithm for iteratively choosing time-points to sample, using the uncertainty in the quality of the currently estimated time-dependent curve as the objective function. Using simulated data as well as gene expression data, we show that our algorithm performs well and can significantly reduce experimental cost without loss of information.
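
The iterative loop described in the abstract (analyze a few archived time-points, estimate the curve, then pick the next time-point where the estimate is least certain) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the uncertainty score below (gap to the nearest analyzed time-point, weighted by how much the signals change across that gap) is a stand-in heuristic chosen for brevity, and the names `choose_next`, `active_sample`, `demo_measure`, and `budget` are all hypothetical.

```python
import bisect
import math


def choose_next(sampled, candidates):
    """Return the candidate time-point with the largest heuristic uncertainty.

    sampled: dict mapping an analyzed time-point to its list of signal values.
    candidates: archived time-points that have not been analyzed yet.
    The score is an assumed proxy for uncertainty, not the paper's objective.
    """
    times = sorted(sampled)
    best_t, best_score = None, -1.0
    for t in candidates:
        i = bisect.bisect_left(times, t)
        if i == 0 or i == len(times):
            continue  # outside the analyzed range; this sketch never extrapolates
        left, right = times[i - 1], times[i]
        # Mean absolute change across all signals between the bracketing samples.
        change = sum(abs(a - b) for a, b in zip(sampled[left], sampled[right]))
        change /= len(sampled[left])
        # Distance to the nearer analyzed neighbor, weighted by local change.
        score = min(t - left, right - t) * (change + 1e-9)
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def active_sample(archive_times, measure, budget):
    """Analyze at most `budget` archived time-points, chosen one at a time.

    measure(t) stands for the expensive analysis step and returns a list of
    signal values. Both endpoints are always analyzed first, so at least
    min(2, number of archived time-points) measurements are made.
    """
    times = sorted(archive_times)
    sampled = {t: measure(t) for t in {times[0], times[-1]}}
    while len(sampled) < budget:
        remaining = [t for t in times if t not in sampled]
        t = choose_next(sampled, remaining)
        if t is None:
            break  # every archived time-point has been analyzed
        sampled[t] = measure(t)
    return sampled


def demo_measure(t):
    """Toy stand-in for an experiment that reports two signals per time-point."""
    return [math.sin(t), math.cos(t)]


# Thirteen archived time-points from 0.0 to 6.0; analyze only six of them.
chosen = active_sample([i * 0.5 for i in range(13)], demo_measure, budget=6)
```

Here `chosen` maps each analyzed time-point to its measured signals; the remaining archived time-points are never passed to `demo_measure`, which is where the cost saving would come from in a real experiment.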

Cite

Text

Singh et al. "Active Learning for Sampling in Time-Series Experiments with Application to Gene Expression Analysis." International Conference on Machine Learning, 2005. doi:10.1145/1102351.1102456

Markdown

[Singh et al. "Active Learning for Sampling in Time-Series Experiments with Application to Gene Expression Analysis." International Conference on Machine Learning, 2005.](https://mlanthology.org/icml/2005/singh2005icml-active/) doi:10.1145/1102351.1102456

BibTeX

@inproceedings{singh2005icml-active,
  title     = {{Active Learning for Sampling in Time-Series Experiments with Application to Gene Expression Analysis}},
  author    = {Singh, Rohit and Palmer, Nathan P. and Gifford, David K. and Berger, Bonnie and Bar-Joseph, Ziv},
  booktitle = {International Conference on Machine Learning},
  year      = {2005},
  pages     = {832--839},
  doi       = {10.1145/1102351.1102456},
  url       = {https://mlanthology.org/icml/2005/singh2005icml-active/}
}