BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Digital Behavioural Change

Abstract

Ambivalence and hesitancy (A/H), closely related constructs, are the primary reasons why individuals delay, avoid, or abandon health behaviour changes. They are subtle and conflicting emotions that set a person in a state between positive and negative orientations, or between acceptance and refusal to do something. They manifest as a discord in affect between multiple modalities or within a modality, such as facial and vocal expressions, and body language. Although experts can be trained to recognize A/H, as is done for in-person interactions, integrating them into digital health interventions is costly and less effective. Automatic A/H recognition is therefore critical for the personalization and cost-effectiveness of digital behaviour change interventions. However, no datasets currently exist for the design of machine learning models to recognize A/H. This paper introduces the Behavioural Ambivalence/Hesitancy (BAH) dataset collected for multimodal recognition of A/H in videos. It contains 1,427 videos with a total duration of 10.60 hours, captured from 300 participants across Canada, answering predefined questions to elicit A/H. It is intended to mirror real-world digital behaviour change interventions delivered online. BAH is annotated by three experts to provide timestamps that indicate where A/H occurs, and frame- and video-level annotations with A/H cues. Video transcripts, cropped and aligned faces, and participant metadata are also provided. Since A and H manifest similarly in practice, we provide a binary annotation indicating the presence or absence of A/H. Additionally, this paper includes benchmarking results using baseline models on BAH for frame- and video-level recognition, zero-shot prediction, and personalization with source-free domain adaptation methods. The limited performance highlights the need for adapted multimodal and spatio-temporal models for A/H recognition. Results obtained with specialized fusion methods are shown to assess the presence of conflicts between modalities; additionally, temporal modelling of within-modality conflicts is essential for more discriminant A/H recognition. The data, code, and pretrained weights are publicly available: https://github.com/LIVIAETS/bah-dataset.

Cite

Text

González-González et al. "BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Digital Behavioural Change." International Conference on Learning Representations, 2026.

Markdown

[González-González et al. "BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Digital Behavioural Change." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/gonzalezgonzalez2026iclr-bah/)

BibTeX

@inproceedings{gonzalezgonzalez2026iclr-bah,
  title     = {{BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Digital Behavioural Change}},
  author    = {González-González, Manuela and Belharbi, Soufiane and Zeeshan, Muhammad Osama and Sharafi, Masoumeh and Aslam, Muhammad Haseeb and Pedersoli, Marco and Koerich, Alessandro Lameiras and Bacon, Simon L and Granger, Eric},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/gonzalezgonzalez2026iclr-bah/}
}