Vo, Khoa
6 publications
NeurIPS
2024
HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model
WACV
2024
ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection