ListenNet: A Lightweight Spatio-Temporal Enhancement Nested Network for Auditory Attention Detection
Abstract
Auditory attention detection (AAD) aims to identify the direction of the attended speaker in multi-speaker environments from brain signals, such as electroencephalography (EEG) signals. However, existing EEG-based AAD methods overlook the spatio-temporal dependencies of EEG signals, limiting their decoding and generalization abilities. To address these issues, this paper proposes a Lightweight Spatio-Temporal Enhancement Nested Network (ListenNet) for AAD. ListenNet has three key components: a Spatio-temporal Dependency Encoder (STDE), Multi-scale Temporal Enhancement (MSTE), and Cross-Nested Attention (CNA). The STDE reconstructs dependencies between consecutive time windows across channels, improving the robustness of dynamic pattern extraction. The MSTE captures temporal features at multiple scales to represent both fine-grained and long-range temporal patterns. In addition, the CNA integrates hierarchical features more effectively through novel dynamic attention mechanisms to capture deep spatio-temporal correlations. Experimental results on three public datasets demonstrate the superiority of ListenNet over state-of-the-art methods in both subject-dependent and challenging subject-independent settings, while reducing the trainable parameter count by a factor of approximately 7. Code is available at: https://github.com/fchest/ListenNet.
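To make the idea of extracting temporal features at multiple scales concrete, the following is a minimal sketch in plain Python. It is not the authors' MSTE module: the abstract does not specify the layers, so fixed moving-average kernels stand in for learned temporal filters, and the names `moving_average`, `multi_scale_temporal_features`, and the kernel sizes are hypothetical choices made for illustration only.

```python
def moving_average(signal, k):
    """Centered moving average with zero padding; k is assumed to be odd.

    The output has the same length as the input.
    """
    half = k // 2
    n = len(signal)
    out = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        # Dividing by k (not by hi - lo) is equivalent to zero padding at the edges.
        out.append(sum(signal[lo:hi]) / k)
    return out


def multi_scale_temporal_features(eeg, kernel_sizes=(3, 7, 15)):
    """Apply one temporal filter per scale to every channel.

    eeg: list of channels, each a list of samples (channels x time).
    Returns a nested list indexed as [scale][channel][time].
    NOTE: fixed averaging kernels are an illustrative stand-in for the
    learned multi-scale layers described in the abstract.
    """
    return [[moving_average(channel, k) for channel in eeg] for k in kernel_sizes]


# Hypothetical 2-channel, 8-sample EEG window.
window = [
    [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0],
    [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0],
]
features = multi_scale_temporal_features(window, kernel_sizes=(1, 3, 5))
print(len(features), len(features[0]), len(features[0][0]))
```

Short kernels preserve fast fluctuations while longer kernels retain only slower trends, which is the intuition behind combining fine-grained and long-range temporal patterns; a trained network would learn these filters rather than fix them.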
Cite
Text
Fan et al. "ListenNet: A Lightweight Spatio-Temporal Enhancement Nested Network for Auditory Attention Detection." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/461
Markdown
[Fan et al. "ListenNet: A Lightweight Spatio-Temporal Enhancement Nested Network for Auditory Attention Detection." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/fan2025ijcai-listennet/) doi:10.24963/IJCAI.2025/461
BibTeX
@inproceedings{fan2025ijcai-listennet,
title = {{ListenNet: A Lightweight Spatio-Temporal Enhancement Nested Network for Auditory Attention Detection}},
author = {Fan, Cunhang and Yang, Xiaoke and Zhang, Hongyu and Chen, Ying and Li, Lu and Zhou, Jian and Lv, Zhao},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2025},
pages = {4137-4145},
doi = {10.24963/IJCAI.2025/461},
url = {https://mlanthology.org/ijcai/2025/fan2025ijcai-listennet/}
}