Promoting the Responsible Development of Speech Datasets for Mental Health and Neurological Disorders Research

Mancini, Eleonora; Tanevska, Ana; Galassi, Andrea; Galatolo, Alessio; Ruggeri, Federico; Torroni, Paolo

doi:10.1613/JAIR.1.16406

JAIR 2025 pp. 937-972

Abstract

Current research in machine learning and artificial intelligence is largely centered on modeling and performance evaluation, less so on data collection. However, recent research has demonstrated that limitations and biases in data may negatively impact trustworthiness and reliability. These aspects are particularly impactful in sensitive domains such as mental health and neurological disorders, where speech data are used to develop AI applications for patients and healthcare providers. In this paper, we chart the landscape of available speech datasets for this domain to highlight possible pitfalls and opportunities for improvement and to promote fairness and diversity. We present a comprehensive list of desiderata for building speech datasets for mental health and neurological disorders and distill it into an actionable checklist focused on ethical concerns to foster more responsible research.

Cite

Text

Mancini et al. "Promoting the Responsible Development of Speech Datasets for Mental Health and Neurological Disorders Research." Journal of Artificial Intelligence Research, 2025. doi:10.1613/JAIR.1.16406

Markdown

[Mancini et al. "Promoting the Responsible Development of Speech Datasets for Mental Health and Neurological Disorders Research." Journal of Artificial Intelligence Research, 2025.](https://mlanthology.org/jair/2025/mancini2025jair-promoting/) doi:10.1613/JAIR.1.16406

BibTeX

@article{mancini2025jair-promoting,
  title     = {{Promoting the Responsible Development of Speech Datasets for Mental Health and Neurological Disorders Research}},
  author    = {Mancini, Eleonora and Tanevska, Ana and Galassi, Andrea and Galatolo, Alessio and Ruggeri, Federico and Torroni, Paolo},
  journal   = {Journal of Artificial Intelligence Research},
  year      = {2025},
  pages     = {937--972},
  doi       = {10.1613/JAIR.1.16406},
  volume    = {82},
  url       = {https://mlanthology.org/jair/2025/mancini2025jair-promoting/}
}