Automatic Summarization of Long Documents (Student Abstract)
Abstract
A vast amount of textual data is added to the internet daily, making utilization and interpretation of textual data difficult and cumbersome. As a result, automatic text summarization is crucial for extracting relevant information, saving precious time. Although many transformer models excel in summarization, they are constrained by their input size, preventing them from processing texts longer than their context size. This study introduces several novel algorithms that allow any LLM to efficiently overcome its input size limitation, effectively utilizing its full potential without any architectural modifications. We test our algorithms on texts with more than 70,000 words, and our experiments show a significant increase in BERTScore with competitive ROUGE scores.
Cite
Text
Chhibbar and Kalita. "Automatic Summarization of Long Documents (Student Abstract)." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I28.35243
Markdown
[Chhibbar and Kalita. "Automatic Summarization of Long Documents (Student Abstract)." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/chhibbar2025aaai-automatic/) doi:10.1609/AAAI.V39I28.35243
BibTeX
@inproceedings{chhibbar2025aaai-automatic,
title = {{Automatic Summarization of Long Documents (Student Abstract)}},
author = {Chhibbar, Naman and Kalita, Jugal},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {29337-29339},
doi = {10.1609/AAAI.V39I28.35243},
url = {https://mlanthology.org/aaai/2025/chhibbar2025aaai-automatic/}
}