BeyondGender: A Multifaceted Bilingual Dataset for Practical Sexism Detection

Luo, Xuan; Yang, Li; Zhang, Han; Tu, Geng; Wang, Qianlong; Ding, Keyang; Fan, Chuang; Li, Jing; Xu, Ruifeng

doi:10.1609/AAAI.V39I23.34656

AAAI 2025 pp. 24750-24758

Abstract

Sexism affects both women and men, yet research often overlooks misandry and suffers from overly broad annotations that limit AI applications. To address this, we introduce BeyondGender, a dataset meticulously annotated according to the latest definitions of misogyny and misandry. It features innovative multifaceted labels encompassing aspects of sexism, gender, phrasing, misogyny, and misandry. The dataset includes 6K English and 1.7K Chinese sexism instances, alongside 13K non-sexism examples. Our evaluations of masked language models and large language models reveal that they detect misogyny in English and misandry in Chinese more effectively, with F1-scores of 0.87 and 0.62, respectively. However, they frequently misclassify hostile and mild comments, underscoring the complexity of sexism detection. Parallel corpus experiments suggest promising data augmentation strategies to enhance AI systems for nuanced sexism detection, and our dataset can be leveraged to improve value alignment in large language models.

Cite

Text

Luo et al. "BeyondGender: A Multifaceted Bilingual Dataset for Practical Sexism Detection." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I23.34656

Markdown

[Luo et al. "BeyondGender: A Multifaceted Bilingual Dataset for Practical Sexism Detection." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/luo2025aaai-beyondgender/) doi:10.1609/AAAI.V39I23.34656

BibTeX

@inproceedings{luo2025aaai-beyondgender,
  title     = {{BeyondGender: A Multifaceted Bilingual Dataset for Practical Sexism Detection}},
  author    = {Luo, Xuan and Yang, Li and Zhang, Han and Tu, Geng and Wang, Qianlong and Ding, Keyang and Fan, Chuang and Li, Jing and Xu, Ruifeng},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {24750--24758},
  doi       = {10.1609/AAAI.V39I23.34656},
  url       = {https://mlanthology.org/aaai/2025/luo2025aaai-beyondgender/}
}