A Watermark for Order-Agnostic Language Models

Abstract

Statistical watermarking techniques are well-established for sequentially decoded language models (LMs). However, these techniques cannot be directly applied to order-agnostic LMs, as the tokens in order-agnostic LMs are not generated sequentially. In this work, we introduce PATTERN-MARK, a pattern-based watermarking framework specifically designed for order-agnostic LMs. We develop a Markov-chain-based watermark generator that produces watermark key sequences with high-frequency key patterns. Correspondingly, we propose a statistical pattern-based detection algorithm that recovers the key sequence during detection and conducts statistical tests based on the count of high-frequency patterns. Our extensive evaluations on order-agnostic LMs, such as ProteinMPNN and CMLM, demonstrate PATTERN-MARK’s enhanced detection efficiency, generation quality, and robustness, positioning it as a superior watermarking technique for order-agnostic LMs.
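As a rough illustration of the idea only, and not the paper's actual algorithm, the sketch below samples a binary key sequence from a two-state Markov chain that prefers to switch state, so the alternating patterns `01` and `10` become high-frequency. It then tests a recovered key sequence by counting those patterns. The binary key alphabet, the `stay_prob` parameter, the function names, and the z-score test are all assumptions made for this example. The step that maps generated tokens back to keys is omitted.

```python
import math
import random


def generate_keys(n, stay_prob, seed):
    """Sample n binary keys from a two-state Markov chain (hypothetical helper).

    With a small stay_prob the chain usually flips state, so the
    length-2 patterns (0, 1) and (1, 0) occur far more often than chance.
    With stay_prob = 0.5 the keys are i.i.d. uniform (no watermark).
    """
    rng = random.Random(seed)
    keys = [rng.randint(0, 1)]
    for _ in range(n - 1):
        prev = keys[-1]
        keys.append(prev if rng.random() < stay_prob else 1 - prev)
    return keys


def alternation_z_score(keys):
    """Z-score of the count of alternating adjacent pairs (hypothetical test).

    Null hypothesis: keys are i.i.d. uniform bits. Then each adjacent pair
    differs with probability 1/2, and these indicators are independent,
    so the count is Binomial(len(keys) - 1, 1/2).
    """
    windows = len(keys) - 1
    count = sum(a != b for a, b in zip(keys, keys[1:]))
    mean = windows / 2
    std = math.sqrt(windows / 4)
    return (count - mean) / std


if __name__ == "__main__":
    watermarked = generate_keys(200, stay_prob=0.1, seed=0)
    unmarked = generate_keys(200, stay_prob=0.5, seed=1)
    print("watermarked z:", alternation_z_score(watermarked))
    print("unmarked z:   ", alternation_z_score(unmarked))
```

A large positive z-score suggests the key sequence came from the pattern-biased chain; a score near zero is consistent with unwatermarked output. The threshold is a design choice that trades false positives against missed detections.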

Cite

Text

Chen et al. "A Watermark for Order-Agnostic Language Models." International Conference on Learning Representations, 2025.

Markdown

[Chen et al. "A Watermark for Order-Agnostic Language Models." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/chen2025iclr-watermark/)

BibTeX

@inproceedings{chen2025iclr-watermark,
  title     = {{A Watermark for Order-Agnostic Language Models}},
  author    = {Chen, Ruibo and Wu, Yihan and Chen, Yanshuo and Liu, Chenxi and Guo, Junfeng and Huang, Heng},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/chen2025iclr-watermark/}
}