CognitionCapturer: Decoding Visual Stimuli from Human EEG Signal with Multimodal Information

Zhang, Kaifan; He, Lihuo; Jiang, Xin; Lu, Wen; Wang, Di; Gao, Xinbo

doi:10.1609/AAAI.V39I13.33587

AAAI 2025 pp. 14486-14493

Abstract

Electroencephalogram (EEG) signals have attracted significant attention from researchers due to their non-invasive nature and high temporal sensitivity in decoding visual stimuli. However, most recent studies have focused solely on the relationship between EEG and image data pairs, neglecting the valuable "beyond-image-modality" information embedded in EEG signals. This results in the loss of critical multimodal information in EEG. To address this limitation, this paper proposes CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals. Specifically, CognitionCapturer trains a modality expert encoder for each modality to extract cross-modal information from the EEG modality. It then introduces a diffusion prior to map the EEG embedding space to the CLIP embedding space. Using a pretrained generative model, the proposed framework can then reconstruct visual stimuli with high semantic and structural fidelity. Notably, the framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities. Through extensive experiments, we demonstrate that CognitionCapturer outperforms state-of-the-art methods both qualitatively and quantitatively.
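
The data flow described in the abstract (one expert encoder per modality, then a prior that maps each EEG embedding toward a target embedding space) might be sketched at the shape level as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the modality names, all dimensions, the `LinearExpert` class, and the `prior_map` function are hypothetical, and a plain linear map stands in for the diffusion prior. The pretrained generative model that would consume these embeddings is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# All of the values below are hypothetical and chosen only for illustration.
EEG_DIM = 16 * 50                        # 16 channels x 50 time samples, flattened
EMB_DIM = 32                             # embedding width
MODALITIES = ("image", "text", "depth")  # assumed modality set, not from the abstract


def l2_normalize(x):
    """Scale each row to unit length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


class LinearExpert:
    """Stand-in for one modality expert encoder (a real one would be a trained network)."""

    def __init__(self, in_dim, out_dim):
        self.weight = rng.normal(scale=in_dim ** -0.5, size=(in_dim, out_dim))

    def __call__(self, eeg):
        return l2_normalize(eeg @ self.weight)


def prior_map(eeg_emb, weight):
    """Placeholder for the diffusion prior: a linear map into the target space."""
    return l2_normalize(eeg_emb @ weight)


def encode_eeg(eeg_batch, experts, prior_weights):
    """Run every expert on the same EEG batch, then map each result with its prior."""
    return {
        name: prior_map(expert(eeg_batch), prior_weights[name])
        for name, expert in experts.items()
    }


experts = {name: LinearExpert(EEG_DIM, EMB_DIM) for name in MODALITIES}
prior_weights = {
    name: rng.normal(scale=EMB_DIM ** -0.5, size=(EMB_DIM, EMB_DIM))
    for name in MODALITIES
}

eeg_batch = rng.normal(size=(4, EEG_DIM))  # 4 synthetic EEG trials
outputs = encode_eeg(eeg_batch, experts, prior_weights)
```

Here `outputs` holds one embedding matrix per assumed modality. In a full pipeline, embeddings of this kind would condition a frozen generative model, consistent with the abstract's note that no fine-tuning of the generative model is required.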

Cite

Text

Zhang et al. "CognitionCapturer: Decoding Visual Stimuli from Human EEG Signal with Multimodal Information." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I13.33587

Markdown

[Zhang et al. "CognitionCapturer: Decoding Visual Stimuli from Human EEG Signal with Multimodal Information." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/zhang2025aaai-cognitioncapturer/) doi:10.1609/AAAI.V39I13.33587

BibTeX

@inproceedings{zhang2025aaai-cognitioncapturer,
  title     = {{CognitionCapturer: Decoding Visual Stimuli from Human EEG Signal with Multimodal Information}},
  author    = {Zhang, Kaifan and He, Lihuo and Jiang, Xin and Lu, Wen and Wang, Di and Gao, Xinbo},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {14486-14493},
  doi       = {10.1609/AAAI.V39I13.33587},
  url       = {https://mlanthology.org/aaai/2025/zhang2025aaai-cognitioncapturer/}
}