ListenNet: A Lightweight Spatio-Temporal Enhancement Nested Network for Auditory Attention Detection

Fan, Cunhang; Yang, Xiaoke; Zhang, Hongyu; Chen, Ying; Li, Lu; Zhou, Jian; Lv, Zhao

doi:10.24963/IJCAI.2025/461

IJCAI 2025 pp. 4137-4145

Abstract

Auditory attention detection (AAD) aims to identify the direction of the attended speaker in multi-speaker environments from brain signals, such as electroencephalography (EEG) signals. However, existing EEG-based AAD methods overlook the spatio-temporal dependencies of EEG signals, limiting their decoding and generalization abilities. To address these issues, this paper proposes a Lightweight Spatio-Temporal Enhancement Nested Network (ListenNet) for AAD. ListenNet has three key components: a Spatio-temporal Dependency Encoder (STDE), Multi-scale Temporal Enhancement (MSTE), and Cross-Nested Attention (CNA). The STDE reconstructs dependencies between consecutive time windows across channels, improving the robustness of dynamic pattern extraction. The MSTE captures temporal features at multiple scales to represent both fine-grained and long-range temporal patterns. In addition, the CNA integrates hierarchical features more effectively through novel dynamic attention mechanisms to capture deep spatio-temporal correlations. Experimental results on three public datasets demonstrate the superiority of ListenNet over state-of-the-art methods in both subject-dependent and challenging subject-independent settings, while reducing the trainable parameter count by approximately 7 times. Code is available at: https://github.com/fchest/ListenNet.
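
To make the idea of temporal features at multiple scales concrete, here is a minimal sketch in plain Python. It is not the authors' MSTE module: it swaps in fixed moving-average kernels for whatever learned layers the paper uses, and the function names (`moving_average`, `multi_scale_features`), the kernel widths, and the toy input are all assumptions made for illustration. The actual implementation is in the repository linked in the abstract.

```python
from typing import List, Sequence


def moving_average(signal: Sequence[float], width: int) -> List[float]:
    """Same-length moving average; the window is truncated at the edges."""
    n = len(signal)
    half = width // 2
    out = []
    for t in range(n):
        lo = max(0, t - half)
        hi = min(n, t + half + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


def multi_scale_features(
    eeg: Sequence[Sequence[float]],
    widths: Sequence[int] = (1, 3, 7),  # hypothetical kernel widths
) -> List[List[float]]:
    """Stack one smoothed copy of every channel per kernel width.

    eeg is laid out as channels x time. The result has
    len(eeg) * len(widths) rows, each with the original number of samples.
    Narrow kernels keep fine-grained detail; wide kernels keep slower trends.
    """
    features = []
    for width in widths:
        for channel in eeg:
            features.append(moving_average(channel, width))
    return features


# Toy input: 2 channels, 6 time samples (made-up values, not real EEG).
eeg = [
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 4.0, 1.0, 1.0],
]
features = multi_scale_features(eeg)
print(len(features), len(features[0]))  # rows = channels * scales, columns = time
```

With width 1 the channel passes through unchanged, while wider kernels spread the spike in the second channel over its neighbours, which is the fine-grained versus long-range contrast the abstract describes.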

Cite

Text

Fan et al. "ListenNet: A Lightweight Spatio-Temporal Enhancement Nested Network for Auditory Attention Detection." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/461

Markdown

[Fan et al. "ListenNet: A Lightweight Spatio-Temporal Enhancement Nested Network for Auditory Attention Detection." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/fan2025ijcai-listennet/) doi:10.24963/IJCAI.2025/461

BibTeX

@inproceedings{fan2025ijcai-listennet,
  title     = {{ListenNet: A Lightweight Spatio-Temporal Enhancement Nested Network for Auditory Attention Detection}},
  author    = {Fan, Cunhang and Yang, Xiaoke and Zhang, Hongyu and Chen, Ying and Li, Lu and Zhou, Jian and Lv, Zhao},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {4137--4145},
  doi       = {10.24963/IJCAI.2025/461},
  url       = {https://mlanthology.org/ijcai/2025/fan2025ijcai-listennet/}
}