Transformers Are Provably Optimal In-Context Estimators for Wireless Communications
Abstract
Pre-trained transformers can adapt to new tasks through in-context learning (ICL), efficiently exploiting a limited set of prompts without explicit model optimization. The canonical communication problem of estimating transmitted symbols from received observations can be modeled as an in-context learning problem: received observations are a noisy function of transmitted symbols, and this function can be represented by an unknown parameter whose statistics depend on an unknown latent context. This problem, which we term in-context estimation (ICE), is significantly more complex than the extensively studied linear regression problem, and its optimal solution is a non-linear function of the underlying context. In this paper, we prove that, for a subclass of such problems, a single-layer softmax attention transformer (SAT) computes the optimal solution of the above estimation problem in the limit of large prompt length. We also prove that the optimal configuration of such a transformer is indeed the minimizer of the corresponding training loss. Further, we empirically demonstrate the proficiency of multi-layer transformers in efficiently solving broader in-context estimation problems. Through extensive simulations, we show that solving ICE problems with transformers significantly outperforms standard approaches; moreover, with just a few context examples, the transformer achieves the same performance as an estimator with perfect knowledge of the latent context.
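As a concrete illustration of the ICE setup, here is a minimal sketch of one episode under assumed simplifications that are not taken from the paper: a scalar complex Gaussian channel whose prior variance plays the role of the latent context, BPSK symbols, and a genie-aided LMMSE baseline that knows that context (the kind of perfect-knowledge estimator the abstract compares against). All function names and parameters below are illustrative.

```python
# Sketch of an in-context estimation (ICE) episode: observations are a noisy
# function (here, a scalar channel h) of transmitted symbols, and the statistics
# of h depend on a latent context (here, its prior variance).
import numpy as np

rng = np.random.default_rng(0)

def sample_ice_prompt(context_var, prompt_len, noise_var=0.1):
    """Draw one ICE episode: a channel h from the latent context, then
    (transmitted symbol, received observation) pairs y_t = h * x_t + n_t."""
    h = np.sqrt(context_var / 2) * (rng.standard_normal() + 1j * rng.standard_normal())
    x = rng.choice([-1.0, 1.0], size=prompt_len)  # BPSK symbols
    n = np.sqrt(noise_var / 2) * (rng.standard_normal(prompt_len)
                                  + 1j * rng.standard_normal(prompt_len))
    y = h * x + n
    return h, x, y

def lmmse_channel_estimate(x, y, context_var, noise_var=0.1):
    """LMMSE estimate of h from the prompt, assuming the latent context
    (the prior variance of h) is known -- the genie baseline."""
    return context_var * np.vdot(x, y) / (context_var * np.sum(np.abs(x) ** 2) + noise_var)

# One episode: estimate the channel from the prompt, then detect a new symbol.
h, x, y = sample_ice_prompt(context_var=1.0, prompt_len=16)
h_hat = lmmse_channel_estimate(x, y, context_var=1.0)
x_new = rng.choice([-1.0, 1.0])
y_new = h * x_new + np.sqrt(0.05) * (rng.standard_normal() + 1j * rng.standard_normal())
x_hat = np.sign(np.real(np.conj(h_hat) * y_new))  # matched-filter BPSK detection
print(f"true symbol {x_new:+.0f}, detected {x_hat:+.0f}, |h - h_hat| = {np.abs(h - h_hat):.3f}")
```

In the paper's framing, the transformer is trained to implement such a context-aware estimator implicitly from the prompt alone, without ever being told the latent context (here, the prior variance of h).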
Cite
Text
Kunde et al. "Transformers Are Provably Optimal In-Context Estimators for Wireless Communications." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.
Markdown
[Kunde et al. "Transformers Are Provably Optimal In-Context Estimators for Wireless Communications." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.](https://mlanthology.org/aistats/2025/kunde2025aistats-transformers/)
BibTeX
@inproceedings{kunde2025aistats-transformers,
title = {{Transformers Are Provably Optimal In-Context Estimators for Wireless Communications}},
author = {Kunde, Vishnu Teja and Rajagopalan, Vicram and Valmeekam, Chandra Shekhara Kaushik and Narayanan, Krishna and Chamberland, Jean-Francois and Kalathil, Dileep and Shakkottai, Srinivas},
booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
year = {2025},
pages = {1531--1539},
volume = {258},
url = {https://mlanthology.org/aistats/2025/kunde2025aistats-transformers/}
}