Quantile Regression for Large-Scale Applications

Abstract

Quantile regression is a method to estimate the quantiles of the conditional distribution of a response variable, and as such it permits a much more accurate portrayal of the relationship between the response variable and observed covariates than methods such as least-squares or least absolute deviations regression. It can be expressed as a linear program, and interior-point methods can be used to find a solution for moderately large problems. Dealing with very large problems, e.g., involving data up to and beyond the terabyte regime, remains a challenge. Here, we present a randomized algorithm that runs in time that is nearly linear in the size of the input and that, with constant probability, computes a (1+ε)-approximate solution to an arbitrary quantile regression problem. Our algorithm computes a low-distortion subspace-preserving embedding with respect to the loss function of quantile regression. Our empirical evaluation illustrates that our algorithm is competitive with the best previous work on small to medium-sized problems, and that it can be implemented in MapReduce-like environments and applied to terabyte-sized problems.
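
To illustrate the statement that quantile regression can be expressed as a linear program, here is a minimal sketch of the standard textbook formulation, solved exactly with SciPy's `linprog`. It is not the randomized algorithm of the paper, and it is only practical for small inputs. The function names `pinball_loss` and `quantile_regression_lp` and the residual split into `u` and `v` are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import linprog


def pinball_loss(residuals, tau):
    """Quantile (pinball) loss: tau * r for r >= 0, (tau - 1) * r otherwise, summed."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(np.where(r >= 0, tau * r, (tau - 1) * r)))


def quantile_regression_lp(X, y, tau):
    """Solve min_beta sum_i rho_tau(y_i - x_i^T beta) as a linear program.

    Each residual is split as y - X beta = u - v with u, v >= 0, giving
        minimize   tau * sum(u) + (1 - tau) * sum(v)
        subject to X beta + u - v = y.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = X.shape

    # Variable order: [beta (d, free), u (n, >= 0), v (n, >= 0)].
    c = np.concatenate([np.zeros(d), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * d + [(0, None)] * (2 * n)

    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:d]


if __name__ == "__main__":
    # Intercept-only model: the tau = 0.5 solution is the sample median.
    y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    X = np.ones((5, 1))
    beta = quantile_regression_lp(X, y, tau=0.5)
    print(beta, pinball_loss(y - X @ beta, 0.5))
```

The dense equality matrix here has n rows and d + 2n columns, so memory grows quadratically with the number of observations. That cost is one reason exact LP solvers are limited to moderately large problems.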

Cite

Text

Yang et al. "Quantile Regression for Large-Scale Applications." International Conference on Machine Learning, 2013.

Markdown

[Yang et al. "Quantile Regression for Large-Scale Applications." International Conference on Machine Learning, 2013.](https://mlanthology.org/icml/2013/yang2013icml-quantile/)

BibTeX

@inproceedings{yang2013icml-quantile,
  title     = {{Quantile Regression for Large-Scale Applications}},
  author    = {Yang, Jiyan and Meng, Xiangrui and Mahoney, Michael},
  booktitle = {International Conference on Machine Learning},
  year      = {2013},
  pages     = {881-887},
  volume    = {28},
  url       = {https://mlanthology.org/icml/2013/yang2013icml-quantile/}
}