New Algorithms for Trace-Ratio Problem with Application to High-Dimension and Large-Sample Data Dimensionality Reduction
Abstract
Learning from large-scale, high-dimensional data sets is a central concern in research areas such as machine learning, visual recognition, and information retrieval. In many practical applications, including image, video, audio, and text processing, one must deal with high-dimension and large-sample data problems. The trace-ratio problem is a key problem in feature extraction and dimensionality reduction, where it is used to avoid working directly in the high-dimensional space. However, it has long been believed that this problem has no closed-form solution and must be solved by inner-outer iterative algorithms, which are very time-consuming. Consequently, efficient algorithms for high-dimension and large-sample trace-ratio problems are still lacking, especially for dense data. In this work, we present a closed-form solution for the trace-ratio problem and propose two algorithms for computing it. First, based on this formula and the randomized singular value decomposition, we propose a randomized algorithm for high-dimension and large-sample dense trace-ratio problems. Second, for high-dimension and large-sample sparse trace-ratio problems, we propose an algorithm based on the closed-form solution and the solution of some consistent under-determined linear systems. Theoretical results are established to show the rationality and efficiency of the proposed methods. Numerical experiments on real-world data sets illustrate the superiority of the proposed algorithms over many state-of-the-art algorithms for high-dimension and large-sample dimensionality reduction problems.
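For context, the inner-outer iteration that the abstract describes as time-consuming can be sketched as follows. This is a minimal illustration of the classical scheme, not the closed-form solution or either algorithm proposed in the paper. It assumes the common formulation, which the abstract does not spell out: maximize tr(VᵀAV) / tr(VᵀBV) over matrices V with orthonormal columns, where A is symmetric and B is symmetric positive definite. The names `trace_ratio_iterative`, `A`, `B`, and `k` are hypothetical and chosen only for this example.

```python
import numpy as np

def trace_ratio_iterative(A, B, k, tol=1e-10, max_iter=100):
    """Classical iteration for max tr(V^T A V) / tr(V^T B V) with V^T V = I.

    Assumes A is symmetric and B is symmetric positive definite
    (an assumption of this sketch, not a statement from the abstract).
    """
    n = A.shape[0]
    V = np.eye(n)[:, :k]  # arbitrary orthonormal starting basis
    rho = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
    for _ in range(max_iter):
        # Inner step: a full symmetric eigendecomposition of A - rho * B.
        # This is the expensive part for high-dimensional data.
        _, vecs = np.linalg.eigh(A - rho * B)
        V = vecs[:, -k:]  # eigenvectors of the k largest eigenvalues
        # Outer step: update the ratio with the new basis.
        rho_new = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
        converged = abs(rho_new - rho) <= tol * max(1.0, abs(rho_new))
        rho = rho_new
        if converged:
            break
    return V, rho

# Small synthetic usage example with random symmetric matrices.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 8))
Y = rng.standard_normal((8, 8))
A_demo = X @ X.T
B_demo = Y @ Y.T + np.eye(8)  # shifted to be positive definite
V_demo, rho_demo = trace_ratio_iterative(A_demo, B_demo, k=3)
```

Each pass requires an eigendecomposition of an n-by-n matrix, which is why such iterations become costly when both the dimension and the sample size are large.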
Cite
Text
Shi and Wu. "New Algorithms for Trace-Ratio Problem with Application to High-Dimension and Large-Sample Data Dimensionality Reduction." Machine Learning, 2024. doi:10.1007/S10994-020-05937-W
Markdown
[Shi and Wu. "New Algorithms for Trace-Ratio Problem with Application to High-Dimension and Large-Sample Data Dimensionality Reduction." Machine Learning, 2024.](https://mlanthology.org/mlj/2024/shi2024mlj-new/) doi:10.1007/S10994-020-05937-W
BibTeX
@article{shi2024mlj-new,
title = {{New Algorithms for Trace-Ratio Problem with Application to High-Dimension and Large-Sample Data Dimensionality Reduction}},
author = {Shi, Wenya and Wu, Gang},
journal = {Machine Learning},
year = {2024},
pages = {3889-3916},
doi = {10.1007/S10994-020-05937-W},
volume = {113},
url = {https://mlanthology.org/mlj/2024/shi2024mlj-new/}
}