LLM Fingerprinting via Semantically Conditioned Watermarks
Abstract
Most LLM fingerprinting methods teach the model to respond to a few fixed queries with predefined atypical responses (keys). This memorization often does not survive common deployment steps such as finetuning or quantization, and such keys can be easily detected and filtered from LLM responses, ultimately breaking the fingerprint. To overcome these limitations, we introduce *LLM fingerprinting via semantically conditioned watermarks*, replacing fixed query sets with a broad semantic domain, and replacing brittle atypical keys with a statistical watermarking signal diffused throughout each response. After teaching the model to watermark its responses only to prompts from a predetermined domain (e.g., the French language), the model owner can use queries from that domain to reliably detect the fingerprint and verify ownership. As we confirm in our thorough experimental evaluation, our fingerprint is both stealthy and robust to all common deployment scenarios.
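The verification step described above, querying the model with in-domain prompts and testing the responses for a statistical signal, can be illustrated with a minimal sketch. The abstract does not say which watermarking scheme is used, so this example assumes a generic green-list token watermark with a one-sided z-test; `GAMMA`, `is_green`, `watermark_z_score`, and the `"owner-secret"` key are hypothetical names chosen for illustration, not the authors' implementation.

```python
import hashlib
import math

# Assumption: half of the vocabulary is "green" for any given context.
GAMMA = 0.5


def is_green(prev_token: int, token: int, key: str = "owner-secret") -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by the
    previous token and a secret key (hypothetical scheme)."""
    digest = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA


def watermark_z_score(token_ids: list[int], key: str = "owner-secret") -> float:
    """One-sided z-score of the green-token count against the null
    hypothesis that the text carries no watermark."""
    n = len(token_ids) - 1  # number of (previous token, token) pairs scored
    if n < 1:
        raise ValueError("need at least two tokens")
    green = sum(is_green(p, t, key) for p, t in zip(token_ids, token_ids[1:]))
    return (green - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))
```

Under these assumptions, an owner would tokenize responses to in-domain queries (for instance, French prompts), compute the z-score, and treat a large value as evidence of the fingerprint. Responses to out-of-domain prompts should score near zero.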
Cite
Text
Gloaguen et al. "LLM Fingerprinting via Semantically Conditioned Watermarks." International Conference on Learning Representations, 2026.
Markdown
[Gloaguen et al. "LLM Fingerprinting via Semantically Conditioned Watermarks." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/gloaguen2026iclr-llm/)
BibTeX
@inproceedings{gloaguen2026iclr-llm,
title = {{LLM Fingerprinting via Semantically Conditioned Watermarks}},
author = {Gloaguen, Thibaud and Staab, Robin and Jovanović, Nikola and Vechev, Martin},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/gloaguen2026iclr-llm/}
}