ShARc: Shape and Appearance Recognition for Person Identification In-the-Wild
Abstract
Identifying individuals in unconstrained video settings is a valuable yet challenging task in biometric analysis due to variations in appearances, environments, degradations, and occlusions. In this paper, we present ShARc, a multimodal approach for video-based person identification in uncontrolled environments that emphasizes 3-D body shape, pose, and appearance. We introduce two encoders: a Pose and Shape Encoder (PSE) and an Aggregated Appearance Encoder (AAE). PSE encodes the body shape via binarized silhouettes, skeleton motions, and 3-D body shape, while AAE provides two levels of temporal appearance feature aggregation: attention-based feature aggregation and averaging aggregation. For attention-based feature aggregation, we employ spatial and temporal attention to focus on key areas for person distinction. For averaging aggregation, we introduce a novel flattening layer after averaging to extract more distinguishable information and reduce overfitting of attention. We utilize centroid feature averaging for gallery registration. We demonstrate significant improvements over existing state-of-the-art methods on public datasets, including CCVID, MEVID, and BRIAR.
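The centroid feature averaging used for gallery registration can be illustrated with a minimal sketch. The abstract does not specify the normalization or the matching metric, so L2 normalization and cosine similarity are assumptions here, and the function names (`l2_normalize`, `register_gallery`, `identify`) are hypothetical rather than taken from the authors' code. The toy 2-D vectors stand in for the learned features.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def register_gallery(features, labels):
    """Average the normalized features of each identity into one centroid.

    Assumption: features are L2-normalized before and after averaging.
    """
    grouped = defaultdict(list)
    for feat, label in zip(features, labels):
        grouped[label].append(l2_normalize(feat))
    gallery = {}
    for label, feats in grouped.items():
        # Column-wise mean over all feature vectors of this identity.
        mean = [sum(col) / len(feats) for col in zip(*feats)]
        gallery[label] = l2_normalize(mean)
    return gallery


def identify(probe, gallery):
    """Return the identity whose centroid is most similar to the probe.

    Assumption: cosine similarity (dot product of unit vectors) is the metric.
    """
    probe = l2_normalize(probe)
    scores = {
        label: sum(p * c for p, c in zip(probe, centroid))
        for label, centroid in gallery.items()
    }
    return max(scores, key=scores.get)


# Toy example: two identities, two gallery features each.
features = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
labels = ["id_a", "id_a", "id_b", "id_b"]
gallery = register_gallery(features, labels)
print(identify([0.8, 0.3], gallery))
```

Storing one centroid per identity keeps the gallery size independent of the number of registered clips, so each probe is compared against a single vector per person.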
Cite
Text
Zhu et al. "ShARc: Shape and Appearance Recognition for Person Identification In-the-Wild." Winter Conference on Applications of Computer Vision, 2024.
Markdown
[Zhu et al. "ShARc: Shape and Appearance Recognition for Person Identification In-the-Wild." Winter Conference on Applications of Computer Vision, 2024.](https://mlanthology.org/wacv/2024/zhu2024wacv-sharc/)
BibTeX
@inproceedings{zhu2024wacv-sharc,
title = {{ShARc: Shape and Appearance Recognition for Person Identification In-the-Wild}},
author = {Zhu, Haidong and Zheng, Wanrong and Zheng, Zhaoheng and Nevatia, Ram},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2024},
pages = {6290--6300},
url = {https://mlanthology.org/wacv/2024/zhu2024wacv-sharc/}
}