HAVANA: Hierarchical Stochastic Neighbor Embedding for Accelerated Video ANnotAtions
Abstract
Video annotation is a critical and time-consuming task in computer vision research and applications. This paper presents a novel annotation pipeline that uses pre-extracted features and dimensionality reduction to accelerate the temporal video annotation process. Our approach uses Hierarchical Stochastic Neighbor Embedding (HSNE) to create a multi-scale representation of video features, allowing annotators to efficiently explore and label large video datasets. We demonstrate significant improvements in annotation effort compared to traditional linear methods, achieving more than a 10x reduction in clicks required for annotating over 12 h of video. Our experiments on multiple datasets show the effectiveness and robustness of our pipeline across various scenarios. Moreover, we investigate the optimal configuration of HSNE parameters for different datasets. Our work provides a promising direction for scaling up video annotation efforts in the era of video understanding.
Cite
Text
Bobe and van Gemert. "HAVANA: Hierarchical Stochastic Neighbor Embedding for Accelerated Video ANnotAtions." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-92591-7_9
Markdown
[Bobe and van Gemert. "HAVANA: Hierarchical Stochastic Neighbor Embedding for Accelerated Video ANnotAtions." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/bobe2024eccvw-havana/) doi:10.1007/978-3-031-92591-7_9
BibTeX
@inproceedings{bobe2024eccvw-havana,
title = {{HAVANA: Hierarchical Stochastic Neighbor Embedding for Accelerated Video ANnotAtions}},
author = {Bobe, Alexandru and van Gemert, Jan C.},
booktitle = {European Conference on Computer Vision Workshops},
year = {2024},
pages = {134--150},
doi = {10.1007/978-3-031-92591-7_9},
url = {https://mlanthology.org/eccvw/2024/bobe2024eccvw-havana/}
}