Unsupervised Storyline Extraction from News Articles

Abstract

Storyline extraction from news streams aims to extract events under a certain news topic and reveal how those events evolve over time. It requires algorithms capable of accurately extracting events from news articles published in different time periods and linking these extracted events into coherent stories. The two tasks are often solved separately, which can lead to error propagation. Existing unified approaches often treat events as topics, ignoring their structured representations. In this paper, we propose a non-parametric generative model to extract structured representations and evolution patterns of storylines simultaneously. In the model, each storyline is modelled as a joint distribution over some locations, organizations, persons, keywords and a set of topics. We further combine this model with the Chinese restaurant process so that the number of storylines can be determined automatically without human intervention. Moreover, a per-token Metropolis-Hastings sampler based on light latent Dirichlet allocation is employed to reduce sampling complexity. The proposed model has been evaluated on three news corpora and the experimental results show that it outperforms several baseline approaches.
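
To illustrate how a Chinese restaurant process lets the number of clusters grow with the data instead of being fixed in advance, the following is a minimal sketch of the CRP prior in isolation. It is not the model proposed in the paper: it ignores article content, the structured storyline representation and the Metropolis-Hastings sampler. The function name `crp_assign` and the concentration parameter `alpha` are assumptions introduced for this illustration only.

```python
import random
from typing import List, Optional


def crp_assign(num_items: int, alpha: float, seed: Optional[int] = None) -> List[int]:
    """Draw cluster labels for num_items items from a Chinese restaurant process.

    Hypothetical helper for illustration; alpha > 0 is the concentration
    parameter (larger values make new clusters more likely).
    """
    rng = random.Random(seed)
    counts: List[int] = []       # counts[k] = number of items already in cluster k
    assignments: List[int] = []  # cluster label of each item, in arrival order
    for _ in range(num_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if k == len(counts):
            counts.append(1)     # open a new cluster
        else:
            counts[k] += 1       # join an existing cluster
        assignments.append(k)
    return assignments


if __name__ == "__main__":
    labels = crp_assign(num_items=20, alpha=1.0, seed=0)
    print(labels)
    print("number of clusters:", len(set(labels)))
```

In this sketch the number of distinct labels is an outcome of sampling rather than an input, which is the property the abstract relies on to avoid specifying the number of storylines by hand.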

Cite

Text

Zhou et al. "Unsupervised Storyline Extraction from News Articles." International Joint Conference on Artificial Intelligence, 2016.

Markdown

[Zhou et al. "Unsupervised Storyline Extraction from News Articles." International Joint Conference on Artificial Intelligence, 2016.](https://mlanthology.org/ijcai/2016/zhou2016ijcai-unsupervised/)

BibTeX

@inproceedings{zhou2016ijcai-unsupervised,
  title     = {{Unsupervised Storyline Extraction from News Articles}},
  author    = {Zhou, Deyu and Xu, Haiyang and Dai, Xin-Yu and He, Yulan},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2016},
  pages     = {3014--3021},
  url       = {https://mlanthology.org/ijcai/2016/zhou2016ijcai-unsupervised/}
}