ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Abstract
The k-nearest neighbor (kNN) query is a cornerstone of similarity-based applications across various domains. While prior work has enhanced kNN search efficiency, it typically focuses on approximate methods for high-dimensional data or exact methods for low-dimensional data, often assuming static query and data distributions. This creates a significant gap in accelerating exact kNN search for low-to-medium dimensional data with dynamic query distributions. To fill this gap, we propose App2Exa, a cache-guided framework that integrates approximate and exact kNN search. App2Exa utilizes a dynamically maintained cache graph index to retrieve approximate results, which subsequently guide exact search using a VP-Tree with a best-first strategy. A benefit-driven caching mechanism further optimizes performance by prioritizing vectors based on frequency, recency, and computational cost. Experimental results demonstrate that App2Exa significantly boosts efficiency, providing a robust and scalable solution for evolving query patterns and enabling exact kNN search to support higher dimensionality more effectively.
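The abstract's central idea, approximate results guiding an exact best-first VP-Tree search, can be illustrated with a minimal sketch. This is not the App2Exa implementation: the cache graph index and the benefit-driven caching policy are omitted, and every name here (`VPNode`, `build_vp_tree`, `exact_knn`, the `approx` parameter) is hypothetical. It assumes Euclidean distance and that the approximate candidates are points drawn from the indexed dataset, so their k-th smallest distance is a valid upper bound on the true k-th neighbor distance.

```python
import heapq
import itertools
import math


class VPNode:
    """One vantage point plus the median radius that splits its subtree."""

    def __init__(self, point, radius, inside, outside):
        self.point = point
        self.radius = radius
        self.inside = inside    # points with dist(vantage, p) <= radius
        self.outside = outside  # points with dist(vantage, p) > radius


def build_vp_tree(points):
    """Build a simple VP-tree from a list of coordinate tuples."""
    if not points:
        return None
    vantage, rest = points[0], points[1:]
    if not rest:
        return VPNode(vantage, 0.0, None, None)
    dists = [math.dist(vantage, p) for p in rest]
    radius = sorted(dists)[len(dists) // 2]
    inside = [p for p, d in zip(rest, dists) if d <= radius]
    outside = [p for p, d in zip(rest, dists) if d > radius]
    return VPNode(vantage, radius, build_vp_tree(inside), build_vp_tree(outside))


def exact_knn(root, query, k, approx=()):
    """Exact kNN via best-first traversal, optionally seeded by `approx`.

    `approx` is assumed to hold points from the indexed dataset (for
    example, a cached approximate answer). Only their distances are used,
    to tighten the pruning bound before the traversal starts.
    """
    seed = math.inf
    candidates = set(approx)  # de-duplicate so the bound stays valid
    if len(candidates) >= k:
        seed = sorted(math.dist(query, p) for p in candidates)[k - 1]

    best = []  # max-heap of the k closest points seen, via negated distance
    counter = itertools.count()  # tie-breaker so nodes are never compared
    frontier = [(0.0, next(counter), root)] if root is not None else []
    while frontier:
        bound, _, node = heapq.heappop(frontier)
        kth = -best[0][0] if len(best) == k else math.inf
        if bound > min(seed, kth):
            break  # every remaining subtree is at least this far away
        d = math.dist(query, node.point)
        if len(best) < k:
            heapq.heappush(best, (-d, node.point))
        elif d < kth:
            heapq.heapreplace(best, (-d, node.point))
        # Triangle-inequality lower bounds for each child subtree.
        if node.inside is not None:
            lower = max(bound, d - node.radius)
            heapq.heappush(frontier, (lower, next(counter), node.inside))
        if node.outside is not None:
            lower = max(bound, node.radius - d)
            heapq.heappush(frontier, (lower, next(counter), node.outside))
    return sorted((-neg, p) for neg, p in best)
```

Calling `exact_knn(root, query, k)` without `approx` runs a plain best-first search; passing a good approximate answer only shrinks the initial pruning radius, so the returned neighbors stay exact while fewer subtrees tend to be expanded.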
Cite
Text
Baechler et al. "ScreenAI: A Vision-Language Model for UI and Infographics Understanding." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/339
Markdown
[Baechler et al. "ScreenAI: A Vision-Language Model for UI and Infographics Understanding." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/baechler2024ijcai-screenai/) doi:10.24963/ijcai.2024/339
BibTeX
@inproceedings{baechler2024ijcai-screenai,
title = {{ScreenAI: A Vision-Language Model for UI and Infographics Understanding}},
author = {Baechler, Gilles and Sunkara, Srinivas and Wang, Maria and Zubach, Fedir and Mansoor, Hassan and Etter, Vincent and Carbune, Victor and Lin, Jason and Chen, Jindong and Sharma, Abhanshu},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2024},
pages = {3058-3068},
doi = {10.24963/ijcai.2024/339},
url = {https://mlanthology.org/ijcai/2024/baechler2024ijcai-screenai/}
}