Nonlinear Estimators and Tail Bounds for Dimension Reduction in L1 Using Cauchy Random Projections
Abstract
For dimension reduction in l_1, one can multiply a data matrix A ∈ ℝ^{n×D} by R ∈ ℝ^{D×k} (k ≪ D) whose entries are i.i.d. samples of a Cauchy distribution. A known impossibility result shows that one cannot recover the pairwise l_1 distances in A from B = AR ∈ ℝ^{n×k} using linear estimators. However, nonlinear estimators are still useful for certain applications in data stream computations, information retrieval, learning, and data mining. We propose three types of nonlinear estimators: the bias-corrected sample median estimator, the bias-corrected geometric mean estimator, and the bias-corrected maximum likelihood estimator. We derive tail bounds for the geometric mean estimator and establish that $k = O\left(\frac{\log n}{\epsilon^2}\right)$ suffices, with the constants given explicitly. Asymptotically (as k → ∞), both the sample median estimator and the geometric mean estimator are about 80% as efficient as the maximum likelihood estimator (MLE). We analyze the moments of the MLE and propose approximating its distribution by an inverse Gaussian.
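The two simplest estimators from the abstract can be sketched in a few lines of NumPy. The key facts are that each projected coordinate of a difference vector is Cauchy with scale equal to the true l_1 distance d, that the median of |Cauchy(0, d)| is d itself, and that E[|x|^{1/k}] = d^{1/k}/cos(π/(2k)), which gives the geometric mean its bias-correction factor cos^k(π/(2k)). The dimensions and data below are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

D, k = 10_000, 100          # original and projected dimensions (illustrative)
a = rng.standard_normal(D)
b = rng.standard_normal(D)
d_true = np.abs(a - b).sum()  # true l1 distance between the two rows

# Cauchy random projection: R has i.i.d. standard Cauchy entries,
# so each coordinate of (a - b) @ R is Cauchy with scale d_true.
R = rng.standard_cauchy((D, k))
x = (a - b) @ R

# Bias-corrected geometric mean estimator:
# E[prod |x_j|^(1/k)] = d / cos^k(pi/(2k)), so multiply by cos^k(pi/(2k)).
d_gm = np.cos(np.pi / (2 * k)) ** k * np.exp(np.mean(np.log(np.abs(x))))

# Sample median estimator: the median of |Cauchy(0, d)| equals d,
# so the sample median of |x_j| is consistent for d as k grows.
d_med = np.median(np.abs(x))

print(d_true, d_gm, d_med)
```

With k = 100 samples both estimates typically land within a few percent of `d_true`; the tail bounds in the paper quantify how large k must be to guarantee a given relative error ε for all pairs simultaneously.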
Cite
Text
Li et al. "Nonlinear Estimators and Tail Bounds for Dimension Reduction in L1 Using Cauchy Random Projections." Annual Conference on Computational Learning Theory, 2007. doi:10.1007/978-3-540-72927-3_37
Markdown
[Li et al. "Nonlinear Estimators and Tail Bounds for Dimension Reduction in L1 Using Cauchy Random Projections." Annual Conference on Computational Learning Theory, 2007.](https://mlanthology.org/colt/2007/li2007colt-nonlinear/) doi:10.1007/978-3-540-72927-3_37
BibTeX
@inproceedings{li2007colt-nonlinear,
title = {{Nonlinear Estimators and Tail Bounds for Dimension Reduction in L1 Using Cauchy Random Projections}},
author = {Li, Ping and Hastie, Trevor and Church, Kenneth Ward},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2007},
pages = {514-529},
doi = {10.1007/978-3-540-72927-3_37},
url = {https://mlanthology.org/colt/2007/li2007colt-nonlinear/}
}