AgentTaxo: Dissecting and Benchmarking Token Distribution of LLM Multi-Agent Systems

Abstract

LLM-based multi-agent (LLM-MA) systems have demonstrated potential in complex tasks such as reasoning and code generation. However, compared to single-agent systems, LLM-MA systems incur significantly higher inference latency and token costs due to repeated LLM calls. In this work, we identify duplicated tokens as a major contributor to these inefficiencies, acting as a "communication tax" that hinders scalability. To systematically analyze token duplication patterns, we propose AgentTaxo, a taxonomy that categorizes agent roles into Planner, Reasoner, and Verifier across various applications. AgentTaxo dissects inter-agent communication and identifies redundant reasoning results frequently reused for validation. We benchmark and analyze token costs in popular LLM-MA systems, quantifying the impact of this communication tax through experimental evaluation. Our findings provide insights into optimizing efficiency and scalability in LLM-MA architectures.
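
To make the "communication tax" concrete, the following is a minimal sketch of one way duplicated tokens across agent messages could be counted. It is not the paper's measurement method: the function name `duplicated_token_fraction`, the whitespace tokenization, and the n-gram overlap criterion are all assumptions made for illustration. A real measurement would likely use the model's own tokenizer.

```python
from typing import List


def duplicated_token_fraction(messages: List[str], n: int = 3) -> float:
    """Fraction of tokens that repeat an n-gram from an earlier message.

    Hypothetical metric for illustration only: tokens are split on
    whitespace, and a token counts as duplicated when it lies inside an
    n-gram that already appeared in a previous message.
    """
    seen = set()      # n-grams observed in earlier messages
    total = 0         # tokens across all messages
    duplicated = 0    # tokens covered by a previously seen n-gram

    for message in messages:
        tokens = message.split()
        total += len(tokens)
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

        # Mark every token position covered by an already-seen n-gram.
        flagged = [False] * len(tokens)
        for i, gram in enumerate(ngrams):
            if gram in seen:
                for j in range(i, i + n):
                    flagged[j] = True

        duplicated += sum(flagged)
        # Only register this message's n-grams after scoring it, so a
        # message is never compared against itself.
        seen.update(ngrams)

    return duplicated / total if total else 0.0


# Illustrative messages: a reasoner's output and a verifier prompt that re-sends it.
messages = [
    "the answer is 42",
    "verify that the answer is 42 please",
]
fraction = duplicated_token_fraction(messages)
```

In this toy exchange, the four tokens of the first message reappear inside the second, so 4 of the 11 tokens in total are counted as duplicated.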

Cite

Text

Wang et al. "AgentTaxo: Dissecting and Benchmarking Token Distribution of LLM Multi-Agent Systems." ICLR 2025 Workshops: FM-Wild, 2025.

Markdown

[Wang et al. "AgentTaxo: Dissecting and Benchmarking Token Distribution of LLM Multi-Agent Systems." ICLR 2025 Workshops: FM-Wild, 2025.](https://mlanthology.org/iclrw/2025/wang2025iclrw-agenttaxo/)

BibTeX

@inproceedings{wang2025iclrw-agenttaxo,
  title     = {{AgentTaxo: Dissecting and Benchmarking Token Distribution of LLM Multi-Agent Systems}},
  author    = {Wang, Qian and Tang, Zhenheng and Jiang, Zichen and Chen, Nuo and Wang, Tianyu and He, Bingsheng},
  booktitle = {ICLR 2025 Workshops: FM-Wild},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/wang2025iclrw-agenttaxo/}
}