Int*-Match: Balancing Intra-Class Compactness and Inter-Class Discrepancy for Semi-Supervised Speaker Recognition

Abstract

Open-set speaker recognition aims to determine whether given utterances come from the same speaker. One challenge of speaker recognition is collecting large amounts of high-quality data. Given the promising results of semi-supervised learning (SSL) in image classification, an intuitive solution is to apply SSL, which uses confidence thresholds to assign pseudo labels to unlabeled data. However, we empirically demonstrate that applying SSL methods to speaker recognition is non-trivial. These methods rely solely on inter-class discrepancy as the threshold for selecting pseudo labels and overlook intra-class compactness, which is particularly important for open-set speaker recognition. Motivated by this, we propose Int*-Match, a semi-supervised speaker recognition method that selects reliable pseudo labels using both intra-class compactness and inter-class discrepancy. In particular, we use the inter-class discrepancy of labeled data as the threshold for pseudo-label selection and adjust this threshold dynamically and adaptively based on the intra-class compactness of the pseudo labels. Our systematic experiments demonstrate the superiority of Int*-Match, which achieves an Equal Error Rate (EER) of 1.00% on the VoxCeleb1 original test set, only 0.06% short of the performance achieved by fully supervised learning.
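
For readers unfamiliar with the metric, the Equal Error Rate is the operating point at which the false acceptance rate (different-speaker trials accepted) equals the false rejection rate (same-speaker trials rejected). The following is a minimal sketch of how EER is commonly estimated from a list of verification trial scores. It is not taken from the paper: the function name `equal_error_rate`, the input format, and the choice to average the two error rates at the closest crossing point are illustrative assumptions.

```python
def equal_error_rate(scores, labels):
    """Estimate the EER and its threshold from verification trials.

    scores: similarity scores; higher means "more likely the same speaker".
    labels: 1 for same-speaker (target) trials, 0 for different-speaker trials.
    Assumption: trials with score >= threshold are accepted.
    """
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")

    # Visit candidate thresholds from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    accepted_targets = 0
    accepted_nontargets = 0
    best_gap, eer, threshold = float("inf"), 1.0, float("inf")

    i = 0
    while i < len(order):
        current = scores[order[i]]
        # Accept every trial tied at the current score in one step.
        while i < len(order) and scores[order[i]] == current:
            if labels[order[i]] == 1:
                accepted_targets += 1
            else:
                accepted_nontargets += 1
            i += 1
        far = accepted_nontargets / n_nontarget   # false acceptance rate
        frr = 1.0 - accepted_targets / n_target   # false rejection rate
        gap = abs(far - frr)
        if gap < best_gap:
            # Closest crossing so far; report the mean of the two rates.
            best_gap, eer, threshold = gap, (far + frr) / 2.0, current
    return eer, threshold


# Toy trial list (made-up scores, not data from the paper).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_eer, toy_threshold = equal_error_rate(toy_scores, toy_labels)
```

On the toy trials, the two error rates meet at a threshold of 0.7, where one of three non-target trials is accepted and one of three target trials is rejected. Published evaluations typically interpolate along the ROC or DET curve rather than picking the nearest discrete point, so results from this sketch may differ slightly from toolkit-reported values.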

Cite

Text

Wang et al. "Int*-Match: Balancing Intra-Class Compactness and Inter-Class Discrepancy for Semi-Supervised Speaker Recognition." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I24.34729

Markdown

[Wang et al. "Int*-Match: Balancing Intra-Class Compactness and Inter-Class Discrepancy for Semi-Supervised Speaker Recognition." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/wang2025aaai-int/) doi:10.1609/AAAI.V39I24.34729

BibTeX

@inproceedings{wang2025aaai-int,
  title     = {{Int*-Match: Balancing Intra-Class Compactness and Inter-Class Discrepancy for Semi-Supervised Speaker Recognition}},
  author    = {Wang, Xingmei and Liu, Jinghan and Meng, Jiaxiang and Li, Boquan and Liu, Zijian},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {25407-25415},
  doi       = {10.1609/AAAI.V39I24.34729},
  url       = {https://mlanthology.org/aaai/2025/wang2025aaai-int/}
}