Sqn2Vec: Learning Sequence Representation via Sequential Patterns with a Gap Constraint

Abstract

When learning sequence representations, traditional pattern-based methods often suffer from data sparsity and high dimensionality, while recent neural embedding methods often fail on sequential datasets with a small vocabulary. To address these disadvantages, we propose an unsupervised method (named Sqn2Vec) which first leverages sequential patterns (SPs) to increase the vocabulary size and then learns low-dimensional continuous vectors for sequences via a neural embedding model. Moreover, our method enforces a gap constraint among symbols in sequences to obtain meaningful and discriminative SPs. Consequently, Sqn2Vec produces significantly better sequence representations than a comprehensive list of state-of-the-art baselines, particularly on sequential datasets with a relatively small vocabulary. We demonstrate the superior performance of Sqn2Vec in several machine learning tasks including sequence classification, clustering, and visualization.
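To make the idea of a gap constraint concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the gap counts the symbols skipped between two consecutive matched items of a pattern, so that `max_gap = 0` requires the pattern to appear contiguously. The function names `occurs_with_gap`, `support`, and `patterns_as_words` are hypothetical and chosen only for illustration.

```python
from typing import Hashable, List, Sequence


def occurs_with_gap(pattern: Sequence[Hashable],
                    sequence: Sequence[Hashable],
                    max_gap: int) -> bool:
    """Check whether `pattern` occurs in `sequence` as a subsequence with at
    most `max_gap` skipped symbols between consecutive matched items.

    Assumption: this definition of the gap is illustrative only.
    """
    if len(pattern) == 0:
        return True
    # Positions at which the prefix matched so far can end.
    ends = {i for i, s in enumerate(sequence) if s == pattern[0]}
    for item in pattern[1:]:
        ends = {
            j
            for i in ends
            # The next item must sit within max_gap skipped symbols of i.
            for j in range(i + 1, min(len(sequence), i + 2 + max_gap))
            if sequence[j] == item
        }
        if not ends:
            return False
    return bool(ends)


def support(pattern: Sequence[Hashable],
            database: List[Sequence[Hashable]],
            max_gap: int) -> float:
    """Fraction of sequences in `database` that contain `pattern`."""
    hits = sum(occurs_with_gap(pattern, seq, max_gap) for seq in database)
    return hits / len(database)


def patterns_as_words(sequence: Sequence[Hashable],
                      patterns: List[Sequence[Hashable]],
                      max_gap: int) -> List[Sequence[Hashable]]:
    """List the patterns found in `sequence`; these could serve as extra
    'words' that enlarge the vocabulary fed to an embedding model."""
    return [p for p in patterns if occurs_with_gap(p, sequence, max_gap)]


if __name__ == "__main__":
    print(occurs_with_gap("ab", "acb", max_gap=0))  # False: 'c' sits between
    print(occurs_with_gap("ab", "acb", max_gap=1))  # True: one skip allowed
    print(patterns_as_words("acb", ["ab", "cb", "ba"], max_gap=1))
```

Under this assumed definition, a tighter gap admits fewer occurrences of each pattern, which is one way to read the abstract's claim that the constraint yields more discriminative SPs. The patterns found in a sequence can then be treated as additional tokens alongside its original symbols.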

Cite

Text

Nguyen et al. "Sqn2Vec: Learning Sequence Representation via Sequential Patterns with a Gap Constraint." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2018. doi:10.1007/978-3-030-10928-8_34

Markdown

[Nguyen et al. "Sqn2Vec: Learning Sequence Representation via Sequential Patterns with a Gap Constraint." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2018.](https://mlanthology.org/ecmlpkdd/2018/nguyen2018ecmlpkdd-sqn2vec/) doi:10.1007/978-3-030-10928-8_34

BibTeX

@inproceedings{nguyen2018ecmlpkdd-sqn2vec,
  title     = {{Sqn2Vec: Learning Sequence Representation via Sequential Patterns with a Gap Constraint}},
  author    = {Nguyen, Dang and Luo, Wei and Nguyen, Tu Dinh and Venkatesh, Svetha and Phung, Dinh Q.},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2018},
  pages     = {569--584},
  doi       = {10.1007/978-3-030-10928-8_34},
  url       = {https://mlanthology.org/ecmlpkdd/2018/nguyen2018ecmlpkdd-sqn2vec/}
}