Statistics-Based Summarization - Step One: Sentence Compression

Abstract

When humans produce summaries of documents, they do not simply extract sentences and concatenate them. Rather, they create new sentences that are grammatical, that cohere with one another, and that capture the most salient pieces of information in the original document. Given that large collections of text/abstract pairs are available online, it is now possible to envision algorithms that are trained to mimic this process. In this paper, we focus on sentence compression, a simpler version of this larger challenge. We aim to achieve two goals simultaneously: our compressions should be grammatical, and they should retain the most important pieces of information. These two goals can conflict. We devise both noisy-channel and decision-tree approaches to the problem, and we evaluate results against manual compressions and a simple baseline.

Introduction

Most of the research in automatic summarization has focused on extraction, i.e., on identifying the most important claus...
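
As a point of orientation, the noisy-channel approach named in the abstract can be sketched in the usual noisy-channel form. This is a generic formulation, not a quotation from the paper: the symbols $s$ (a candidate compressed sentence), $t$ (the original long sentence), and $s^{*}$ (the selected compression) are notation assumed here.

```latex
s^{*} = \arg\max_{s} P(s \mid t)
      = \arg\max_{s} \frac{P(s)\, P(t \mid s)}{P(t)}
      = \arg\max_{s} P(s)\, P(t \mid s)
```

Under this reading, the source model $P(s)$ rewards grammatical short sentences, and the channel model $P(t \mid s)$ rewards compressions from which the original could plausibly have been expanded. This roughly mirrors the two goals stated in the abstract: grammaticality and retention of the most important information.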

Cite

Text

Knight and Marcu. "Statistics-Based Summarization - Step One: Sentence Compression." AAAI Conference on Artificial Intelligence, 2000.

Markdown

[Knight and Marcu. "Statistics-Based Summarization - Step One: Sentence Compression." AAAI Conference on Artificial Intelligence, 2000.](https://mlanthology.org/aaai/2000/knight2000aaai-statistics/)

BibTeX

@inproceedings{knight2000aaai-statistics,
  title     = {{Statistics-Based Summarization - Step One: Sentence Compression}},
  author    = {Knight, Kevin and Marcu, Daniel},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2000},
  pages     = {703--710},
  url       = {https://mlanthology.org/aaai/2000/knight2000aaai-statistics/}
}