Audio Feature Learning with Triplet-Based Embedding Network

Abstract

We propose a triplet-based embedding network for audio feature learning in version identification. Existing methods rely on hand-crafted features computed over a music piece as a whole, whereas we learn features with a triplet-based neural network at the segment level, focusing on the most similar parts between music versions. Extensive experiments demonstrate the merits of our approach.
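The core idea of triplet-based embedding learning can be illustrated with the standard triplet hinge loss, which pushes an anchor's distance to a positive example (e.g., a segment from another version of the same song) below its distance to a negative example (a segment from a different song) by at least a margin. The sketch below uses toy 2-D embeddings and a margin value chosen purely for illustration; it is not the paper's exact formulation.

```python
from math import sqrt

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once the positive is closer
    than the negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical toy embeddings for illustration only.
a = [0.0, 0.0]   # anchor segment
p = [0.1, 0.0]   # segment from another version of the same song
n = [2.0, 0.0]   # segment from an unrelated song

print(triplet_loss(a, p, n))  # 0.0: margin already satisfied
```

Training a network with this objective drives segment embeddings of different versions of the same piece to cluster together while separating unrelated pieces.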

Cite

Text

Qi et al. "Audio Feature Learning with Triplet-Based Embedding Network." AAAI Conference on Artificial Intelligence, 2017. doi:10.1609/AAAI.V31I1.11071

Markdown

[Qi et al. "Audio Feature Learning with Triplet-Based Embedding Network." AAAI Conference on Artificial Intelligence, 2017.](https://mlanthology.org/aaai/2017/qi2017aaai-audio/) doi:10.1609/AAAI.V31I1.11071

BibTeX

@inproceedings{qi2017aaai-audio,
  title     = {{Audio Feature Learning with Triplet-Based Embedding Network}},
  author    = {Qi, Xiaoyu and Yang, Deshun and Chen, Xiaoou},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {4979--4980},
  doi       = {10.1609/AAAI.V31I1.11071},
  url       = {https://mlanthology.org/aaai/2017/qi2017aaai-audio/}
}