"Knock! Knock! Who Is It?" Probabilistic Person Identification in TV-Series

Abstract

We describe a probabilistic method for identifying characters in TV series or movies. We aim to label every character appearance, not only those where a face can be detected. Consequently, our basic unit of appearance is a person track (as opposed to a face track). We model each TV series episode as a Markov Random Field, integrating face recognition, clothing appearance, speaker recognition and contextual constraints in a probabilistic manner. The identification task is then formulated as an energy minimization problem. In order to identify tracks without faces, we learn clothing models by adapting available face recognition results. Within a scene, as indicated by prior analysis of the temporal structure of the TV series, clothing features are combined by agglomerative clustering. We evaluate our approach on the first 6 episodes of The Big Bang Theory and achieve an absolute improvement of 20% for person identification and 12% for face recognition.
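
To make the energy minimization formulation concrete, here is a minimal Python sketch. It is not the authors' implementation: the function name `minimize_energy`, the toy costs, and the specific contextual constraint (two person tracks that overlap in time should not receive the same identity) are assumptions chosen for illustration, and exhaustive search stands in for whatever inference method the paper uses.

```python
from itertools import product


def minimize_energy(unary, conflicts, penalty, labels):
    """Exhaustively find the label assignment with the lowest total energy.

    unary[i][label] : cost of giving track i that label (lower = more likely);
                      in a real system this would combine face, clothing and
                      speaker evidence (hypothetical values here).
    conflicts       : pairs (i, j) of tracks assumed to overlap in time.
    penalty         : cost added when two conflicting tracks share a label.
    """
    n = len(unary)
    best_assignment, best_energy = None, float("inf")
    for assignment in product(labels, repeat=n):
        # Unary terms: per-track evidence.
        energy = sum(unary[i][assignment[i]] for i in range(n))
        # Pairwise terms: penalize identical labels on conflicting tracks.
        energy += sum(penalty for i, j in conflicts
                      if assignment[i] == assignment[j])
        if energy < best_energy:
            best_assignment, best_energy = assignment, energy
    return best_assignment, best_energy


# Toy example with two tracks and two candidate identities.
unary = [
    {"A": 1.0, "B": 4.0},  # track 0: face evidence clearly favors "A"
    {"A": 2.0, "B": 3.0},  # track 1: no face, weak evidence leans toward "A"
]
conflicts = [(0, 1)]       # both tracks are visible at the same time

assignment, energy = minimize_energy(unary, conflicts, penalty=5.0,
                                     labels=["A", "B"])
print(assignment, energy)
```

Taken alone, track 1 would be labeled "A", but the pairwise penalty makes the joint assignment ("A", "B") cheaper. This is the kind of effect a joint model can have on tracks without a detectable face. Exhaustive search is only feasible for toy problems, since the number of assignments grows exponentially with the number of tracks.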

Cite

Text

Tapaswi et al. "'Knock! Knock! Who Is It?' Probabilistic Person Identification in TV-Series." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012. doi:10.1109/CVPR.2012.6247986

Markdown

[Tapaswi et al. "'Knock! Knock! Who Is It?' Probabilistic Person Identification in TV-Series." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.](https://mlanthology.org/cvpr/2012/tapaswi2012cvpr-knock/) doi:10.1109/CVPR.2012.6247986

BibTeX

@inproceedings{tapaswi2012cvpr-knock,
  title     = {{"Knock! Knock! Who Is It?" Probabilistic Person Identification in TV-Series}},
  author    = {Tapaswi, Makarand and Bäuml, Martin and Stiefelhagen, Rainer},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2012},
  pages     = {2658-2665},
  doi       = {10.1109/CVPR.2012.6247986},
  url       = {https://mlanthology.org/cvpr/2012/tapaswi2012cvpr-knock/}
}