Self-Paced Compensatory Deep Boltzmann Machine for Semi-Structured Document Embedding

Abstract

Over the last decade, a large number of documents carrying rich metadata of various types, known as Semi-Structured Documents (SSDs), have appeared in many real-world applications. Modeling this kind of text data in a way that follows how humans use informative metadata to understand text is an interesting research problem. In this paper, we introduce the Self-paced Compensatory Deep Boltzmann Machine (SCDBM), an architecture that uses metadata to learn a deep structure layer by layer for SSD embedding in a self-paced way. Inspired by how humans understand text, the model defines a deep process of document vector extraction that goes beyond the space of words by incorporating metadata, with each layer selecting different types of metadata. We present efficient learning and inference algorithms for the SCDBM model and empirically demonstrate that the representation discovered by this model performs better than state-of-the-art baselines on semi-structured document classification, retrieval, and tag prediction.

Cite

Text

Li et al. "Self-Paced Compensatory Deep Boltzmann Machine for Semi-Structured Document Embedding." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/304

Markdown

[Li et al. "Self-Paced Compensatory Deep Boltzmann Machine for Semi-Structured Document Embedding." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/li2017ijcai-self-a/) doi:10.24963/IJCAI.2017/304

BibTeX

@inproceedings{li2017ijcai-self-a,
  title     = {{Self-Paced Compensatory Deep Boltzmann Machine for Semi-Structured Document Embedding}},
  author    = {Li, Shuangyin and Pan, Rong and Yan, Jun},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {2187--2193},
  doi       = {10.24963/IJCAI.2017/304},
  url       = {https://mlanthology.org/ijcai/2017/li2017ijcai-self-a/}
}