Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence
Abstract
Language models lack the notion of interchangeable tokens: symbols that are semantically equivalent yet distinct, such as bound variables in formal logic. This limitation prevents generalization to larger vocabularies and hinders the model’s ability to recognize alpha-equivalence, where renaming bound variables preserves meaning. We formalize this machine learning problem and introduce alpha-covariance, a metric for evaluating robustness to such transformations. To tackle this task, we propose a dual-part token embedding strategy: a shared component ensures semantic consistency, while a randomized component maintains token distinguishability. Compared to a baseline that relies on alpha-renaming for data augmentation, our approach demonstrates improved generalization to unseen tokens in linear temporal logic solving, propositional logic assignment prediction, and copying with an extendable vocabulary, while introducing a favorable inductive bias for alpha-equivalence. Our findings establish a foundation for designing language models that can learn interchangeable token representations, a crucial step toward more flexible and systematic reasoning in formal domains. Our code and project page are available at https://necrashter.github.io/interchangeable-token-embeddings
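As a rough illustration of the dual-part idea described above, the following minimal sketch builds an embedding table in which every interchangeable token shares one component and receives its own randomized component. It is not the authors' implementation: the function names (`build_embeddings`, `alpha_rename`), the dimensions, the Gaussian sampling, and the concatenation layout are all assumptions made for illustration, and in a real model the shared part would presumably be learned rather than sampled.

```python
import random


def build_embeddings(regular_tokens, interchangeable_tokens,
                     shared_dim=4, random_dim=4, seed=0):
    """Hypothetical sketch: return a dict mapping token -> embedding vector.

    Regular tokens get an ordinary full-width vector. Interchangeable tokens
    get [shared part | random part]: the shared part is identical for all of
    them, the random part differs per token so they stay distinguishable.
    """
    rng = random.Random(seed)
    dim = shared_dim + random_dim
    table = {}

    # Ordinary tokens (operators, constants, ...): one independent vector each.
    for tok in regular_tokens:
        table[tok] = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    # One shared component for the whole class of interchangeable tokens
    # (assumption: sampled here, learned in an actual model).
    shared = [rng.gauss(0.0, 1.0) for _ in range(shared_dim)]

    # A fresh random component per interchangeable token.
    for tok in interchangeable_tokens:
        rand_part = [rng.gauss(0.0, 1.0) for _ in range(random_dim)]
        table[tok] = shared + rand_part

    return table


def alpha_rename(tokens, mapping):
    """Rename interchangeable tokens in a sequence; other tokens are kept."""
    return [mapping.get(t, t) for t in tokens]


table = build_embeddings(["and", "or", "not"], ["a", "b", "c"])
renamed = alpha_rename(["a", "and", "not", "b"], {"a": "b", "b": "a"})
```

Because the random component does not depend on a fixed vocabulary size, the same construction can in principle be applied to more interchangeable tokens at test time than were seen in training, which is the extendable-vocabulary setting the abstract refers to. The `alpha_rename` helper shows the kind of renaming under which an alpha-equivalent input should yield a correspondingly renamed output.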
Cite
Text
Işık et al. "Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Işık et al. "Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/isk2025icml-interchangeable/)
BibTeX
@inproceedings{isk2025icml-interchangeable,
title = {{Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence}},
author = {Işık, İlker and Cinbis, Ramazan Gokberk and Gol, Ebru Aydin},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {26523--26541},
volume = {267},
url = {https://mlanthology.org/icml/2025/isk2025icml-interchangeable/}
}