SpARK: An Embarrassingly Simple Sparse Watermarking in LLMs with Enhanced Text Quality
Abstract
With the widespread adoption of Large Language Models (LLMs), concerns about potential misuse have emerged. To this end, watermarking has been adapted to LLMs, enabling a simple and effective way to detect and monitor generated text. However, while existing methods can differentiate between watermarked and unwatermarked text with high accuracy, they often face a trade-off between the quality of the generated text and the effectiveness of the watermarking process. In this work, we present a novel type of LLM watermark, *Sparse Watermark*, which aims to mitigate this trade-off by applying watermarks to a small subset of generated tokens distributed across the text. To demonstrate this type of watermark, we introduce **SpARK**, a **Sp**arse Waterm**ARK** method that achieves sparsity by anchoring watermarked tokens to words that have specific Part-of-Speech (POS) tags. Our experimental results demonstrate that the proposed watermarking scheme, albeit *embarrassingly simple*, is *incredibly effective*, achieving high detectability while generating text that outperforms previous LLM watermarking methods in quality across various tasks.
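The idea of anchoring watermarked tokens to POS-tagged words can be illustrated with a toy sketch. It is not the SpARK algorithm itself: the abstract does not specify the details, so everything below is an assumption made for illustration. In particular, the sketch assumes that the token immediately following an anchor word carries the watermark, that the green list is derived from a hash of the anchor word, that `VERB` is the anchor tag, and that a small hand-written lexicon (`TOY_POS`) stands in for a real POS tagger. The names `is_green`, `pick_green`, and `detect` are hypothetical.

```python
import hashlib
import math

# Hypothetical toy lexicon; a real system would call a POS tagger instead.
TOY_POS = {"run": "VERB", "eat": "VERB", "see": "VERB",
           "they": "PRON", "we": "PRON", "cat": "NOUN"}

ANCHOR_TAG = "VERB"  # assumed anchor tag, chosen only for illustration
GAMMA = 0.5          # assumed fraction of the vocabulary placed in the green list


def is_green(anchor: str, token: str, gamma: float = GAMMA) -> bool:
    """Pseudo-randomly assign `token` to the green list seeded by `anchor`."""
    digest = hashlib.sha256(f"{anchor}|{token}".encode("utf-8")).digest()
    return digest[0] / 256.0 < gamma


def pick_green(anchor: str, candidates: list) -> str:
    """Generation side: prefer the first candidate that is green for this anchor."""
    for cand in candidates:
        if is_green(anchor, cand):
            return cand
    return candidates[0]  # fall back if no candidate is green


def detect(tokens: list, pos_of=TOY_POS.get):
    """Detection side: score only the tokens that follow an anchor word.

    Returns (hits, checked, z), where z is a one-proportion z-score
    against the null hypothesis that green hits occur at rate GAMMA.
    """
    hits = checked = 0
    for prev, tok in zip(tokens, tokens[1:]):
        if pos_of(prev) == ANCHOR_TAG:
            checked += 1
            hits += int(is_green(prev, tok))
    if checked == 0:
        return 0, 0, 0.0
    z = (hits - GAMMA * checked) / math.sqrt(checked * GAMMA * (1 - GAMMA))
    return hits, checked, z
```

Because only positions after an anchor word are constrained, all other tokens are generated freely, which is the source of the sparsity in this sketch. A detector built this way needs enough anchor positions in the text for the z-score to be meaningful.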
Cite
Text
Hoang et al. "SpARK: An Embarrassingly Simple Sparse Watermarking in LLMs with Enhanced Text Quality." ICLR 2025 Workshops: WMARK, 2025.
Markdown
[Hoang et al. "SpARK: An Embarrassingly Simple Sparse Watermarking in LLMs with Enhanced Text Quality." ICLR 2025 Workshops: WMARK, 2025.](https://mlanthology.org/iclrw/2025/hoang2025iclrw-spark/)
BibTeX
@inproceedings{hoang2025iclrw-spark,
title = {{SpARK: An Embarrassingly Simple Sparse Watermarking in LLMs with Enhanced Text Quality}},
author = {Hoang, Duy Cao and Le, Thanh Quoc Hung and Chu, Rui and Li, Ping and Zhao, Weijie and Lao, Yingjie and Doan, Khoa D},
booktitle = {ICLR 2025 Workshops: WMARK},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/hoang2025iclrw-spark/}
}