Sequence-to-Sequence Learning via Shared Latent Representation
Abstract
Sequence-to-sequence learning is a popular research area in deep learning, with applications such as video captioning and speech recognition. Existing methods model this learning as a mapping process: the input sequence is first encoded to a fixed-size vector, and the target sequence is then decoded from that vector. Although simple and intuitive, such a mapping model is task-specific and cannot be directly reused for different tasks. In this paper, we propose a star-like framework for general and flexible sequence-to-sequence learning, in which different types of media content (the peripheral nodes) can be encoded to and decoded from a shared latent representation (SLR) (the central node). This is inspired by the fact that the human brain can learn and express an abstract concept in different ways. The media-invariant property of the SLR can be seen as a high-level regularization on the intermediate vector, forcing it to capture not only the latent representation within each individual medium, as auto-encoders do, but also the transitions between media, as mapping models do. Moreover, the SLR model is content-specific, which means it needs to be trained only once for a dataset and can then be used for different tasks. We show how to train an SLR model via dropout and use it for different sequence-to-sequence tasks. Our SLR model is validated on the Youtube2Text and MSR-VTT datasets, achieving superior performance on the video-to-sentence task and the first sentence-to-video results.
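The star-like layout described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the class `StarModel`, the helper `drop_media`, the toy encoders and decoders, the choice to average the latent codes of the media that are present, and the reading of "dropout" as randomly hiding whole media during training are all assumptions made for illustration only.

```python
import random
from typing import Callable, Dict, List, Optional

Vector = List[float]


class StarModel:
    """Toy star-like model (hypothetical): each medium has an encoder into,
    and a decoder out of, one shared latent vector (the central node)."""

    def __init__(self, encoders: Dict[str, Callable], decoders: Dict[str, Callable]):
        self.encoders = encoders
        self.decoders = decoders

    def encode(self, inputs: Dict[str, Optional[list]]) -> Vector:
        # Assumption: the shared latent is the mean of the codes of all
        # media that are present (media set to None are skipped).
        codes = [self.encoders[m](x) for m, x in inputs.items() if x is not None]
        if not codes:
            raise ValueError("at least one medium must be present")
        dim = len(codes[0])
        return [sum(c[i] for c in codes) / len(codes) for i in range(dim)]

    def translate(self, source: str, target: str, x: list) -> list:
        # Any source medium can reach any target medium through the latent.
        return self.decoders[target](self.encode({source: x}))


def drop_media(inputs: Dict[str, list], rng: random.Random, p: float = 0.5):
    """Assumed training trick: hide each medium with probability p,
    always keeping at least one so the latent can still be computed."""
    kept = {m: (None if rng.random() < p else x) for m, x in inputs.items()}
    if all(v is None for v in kept.values()):
        m = rng.choice(sorted(inputs))
        kept[m] = inputs[m]
    return kept


# Placeholder encoders/decoders: each maps to or from a 2-d latent
# [average value, sequence length]. Real models would be learned networks.
def encode_video(frames: List[float]) -> Vector:
    return [sum(frames) / len(frames), float(len(frames))]


def encode_sentence(words: List[str]) -> Vector:
    return [sum(len(w) for w in words) / len(words), float(len(words))]


def decode_video(z: Vector) -> List[float]:
    return [z[0]] * int(round(z[1]))


def decode_sentence(z: Vector) -> List[str]:
    return ["w" * int(round(z[0]))] * int(round(z[1]))


model = StarModel(
    encoders={"video": encode_video, "sentence": encode_sentence},
    decoders={"video": decode_video, "sentence": decode_sentence},
)

video_to_sentence = model.translate("video", "sentence", [1.0, 3.0])
sentence_to_video = model.translate("sentence", "video", ["hello", "hi"])
video_to_video = model.translate("video", "video", [1.0, 3.0])
```

The point of the sketch is only the routing: one set of encoders and decoders, trained once, serves video-to-sentence, sentence-to-video, and auto-encoding, because every path passes through the same central vector.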
Cite
Text
Shen et al. "Sequence-to-Sequence Learning via Shared Latent Representation." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.11837
Markdown
[Shen et al. "Sequence-to-Sequence Learning via Shared Latent Representation." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/shen2018aaai-sequence/) doi:10.1609/AAAI.V32I1.11837
BibTeX
@inproceedings{shen2018aaai-sequence,
title = {{Sequence-to-Sequence Learning via Shared Latent Representation}},
author = {Shen, Xu and Tian, Xinmei and Xing, Jun and Rui, Yong and Tao, Dacheng},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2018},
pages = {2395-2402},
doi = {10.1609/AAAI.V32I1.11837},
url = {https://mlanthology.org/aaai/2018/shen2018aaai-sequence/}
}