SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance

Zhou, Xinchi; Zhou, Dongzhan; Ouyang, Wanli; Zhou, Hang; Hu, Di

WACV 2023 pp. 5168-5177

Abstract

Recent years have witnessed the success of deep learning on the visual sound separation task. However, existing works share a common setting in which the training and testing datasets cover the same musical instrument categories, which to some extent limits the versatility of the task. In this work, we focus on a more general and challenging scenario, the separation of unknown musical instruments, where the categories in the training and testing phases do not overlap. To tackle this new setting, we propose the "Separation-with-Consistency" (SeCo) framework, which accomplishes separation on unknown categories by exploiting consistency constraints. Furthermore, to capture richer characteristics of the novel melodies, we devise an online matching strategy that brings stable improvements without adding extra parameters. Experiments demonstrate that the SeCo framework adapts well to novel musical categories and outperforms the baseline methods by a notable margin.

Cite

Text

Zhou et al. "SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance." Winter Conference on Applications of Computer Vision, 2023.

Markdown

[Zhou et al. "SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance." Winter Conference on Applications of Computer Vision, 2023.](https://mlanthology.org/wacv/2023/zhou2023wacv-seco/)

BibTeX

@inproceedings{zhou2023wacv-seco,
  title     = {{SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance}},
  author    = {Zhou, Xinchi and Zhou, Dongzhan and Ouyang, Wanli and Zhou, Hang and Hu, Di},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2023},
  pages     = {5168--5177},
  url       = {https://mlanthology.org/wacv/2023/zhou2023wacv-seco/}
}