Transformers Are Provably Optimal In-Context Estimators for Wireless Communications

Abstract

Pre-trained transformers exhibit the capability of adapting to new tasks through in-context learning (ICL), where they efficiently utilize a limited set of prompts without explicit model optimization. The canonical communication problem of estimating transmitted symbols from received observations can be modeled as an in-context learning problem: received observations are a noisy function of transmitted symbols, and this function can be represented by an unknown parameter whose statistics depend on an unknown latent context. This problem, which we term in-context estimation (ICE), is significantly more complex than the extensively studied linear regression problem, and its optimal solution is a non-linear function of the underlying context. In this paper, we prove that, for a subclass of such problems, a single-layer softmax attention transformer (SAT) computes the optimal solution of the above estimation problem in the limit of large prompt length. We also prove that the optimal configuration of such a transformer is indeed the minimizer of the corresponding training loss. Further, we empirically demonstrate the proficiency of multi-layer transformers in efficiently solving broader in-context estimation problems. Through extensive simulations, we show that transformers solving ICE problems significantly outperform standard approaches; moreover, with just a few context examples, they achieve the same performance as an estimator with perfect knowledge of the latent context.
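
To make the setup concrete, the sketch below restates the ICE formulation from the abstract in LaTeX. The notation ($c$ for the latent context, $h$ for the unknown parameter, $x_k$ and $y_k$ for transmitted and received symbols, $n$ for the prompt length) and the linear fading-channel instance are illustrative assumptions, not necessarily the paper's exact model.

% Illustrative ICE formulation; all notation is assumed for this sketch.
% A latent context c (e.g., channel statistics) fixes the law of an
% unknown parameter h:  h ~ P_{H|C}( . | c)
\begin{align}
  % Each received observation is a noisy function of the transmitted symbol;
  % a linear fading channel with complex Gaussian noise is one concrete instance:
  y_k &= h\,x_k + n_k, \qquad n_k \sim \mathcal{CN}(0, \sigma^2), \quad k = 1, \dots, n+1, \\
  % The optimal (MMSE) in-context estimate of the new symbol is the posterior mean,
  % which marginalizes over both h and c and is non-linear in the prompt:
  \hat{x}_{n+1} &= \mathbb{E}\!\left[ x_{n+1} \,\middle|\, y_{n+1},\, (x_1, y_1), \dots, (x_n, y_n) \right].
\end{align}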

Cite

Text

Kunde et al. "Transformers Are Provably Optimal In-Context Estimators for Wireless Communications." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.

Markdown

[Kunde et al. "Transformers Are Provably Optimal In-Context Estimators for Wireless Communications." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.](https://mlanthology.org/aistats/2025/kunde2025aistats-transformers/)

BibTeX

@inproceedings{kunde2025aistats-transformers,
  title     = {{Transformers Are Provably Optimal In-Context Estimators for Wireless Communications}},
  author    = {Kunde, Vishnu Teja and Rajagopalan, Vicram and Valmeekam, Chandra Shekhara Kaushik and Narayanan, Krishna and Chamberland, Jean-Francois and Kalathil, Dileep and Shakkottai, Srinivas},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  year      = {2025},
  pages     = {1531--1539},
  volume    = {258},
  url       = {https://mlanthology.org/aistats/2025/kunde2025aistats-transformers/}
}