Not All Tokens Matter All the Time: Dynamic Token Aggregation Towards Efficient Detection Transformers
Abstract
The substantial computational demands of detection transformers (DETRs) hinder their deployment in resource-constrained scenarios, with the encoder consistently emerging as a critical bottleneck. A promising solution lies in reducing token redundancy within the encoder. However, existing methods perform static sparsification and ignore the varying importance that tokens at different feature levels and encoder blocks carry for object detection, leading to suboptimal sparsification and performance degradation. In this paper, we propose Dynamic DETR (Dynamic token aggregation for DEtection TRansformers), a novel strategy that leverages the inherent importance distribution of tokens to control token density and performs multi-level token sparsification. Within each stage, we apply a proximal aggregation paradigm to low-level tokens to maintain spatial integrity, and a holistic strategy to high-level tokens to capture broader contextual information. Furthermore, we propose center-distance regularization to align the token distribution throughout the sparsification process, thereby promoting representation consistency and effectively preserving critical object-specific patterns. Extensive experiments on canonical DETR models demonstrate that Dynamic DETR is broadly applicable across various models and consistently outperforms existing token sparsification methods.
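To make the idea concrete, the sketch below is a minimal, hypothetical PyTorch illustration of level-wise token sparsification with two aggregation modes: tokens are ranked by an importance score, a level-specific fraction is kept, and each pruned token is folded either into its spatially nearest kept token (a stand-in for the proximal paradigm on low-level maps) or into the most feature-similar kept token (a stand-in for the holistic strategy on high-level maps). The function name `aggregate_level`, the cosine-similarity matching, the averaging merge, and the keep ratios are assumptions for illustration only, not the paper's implementation; the center-distance regularization is omitted.

```python
# Minimal sketch (assumptions, not the paper's code): level-wise token
# sparsification with proximal (spatial) vs. holistic (feature-similarity)
# aggregation of the pruned tokens.
import torch
import torch.nn.functional as F


def aggregate_level(tokens, coords, scores, keep_ratio, proximal):
    """Sparsify one feature level.

    tokens: (N, C) token features of this level
    coords: (N, 2) normalized (x, y) centers of the tokens
    scores: (N,)   importance scores, e.g. from a lightweight predictor (assumed)
    keep_ratio: fraction of tokens to keep at this level (assumed per-level)
    proximal: True  -> fold pruned tokens into the nearest kept token in space
              False -> fold pruned tokens into the most similar kept token in feature space
    """
    n = tokens.shape[0]
    k = max(1, int(n * keep_ratio))
    keep_idx = scores.topk(k).indices
    drop_mask = torch.ones(n, dtype=torch.bool)
    drop_mask[keep_idx] = False
    drop_idx = drop_mask.nonzero(as_tuple=True)[0]

    kept = tokens[keep_idx]
    if drop_idx.numel() == 0:
        return kept, coords[keep_idx]

    if proximal:
        # Proximal aggregation: nearest kept token by spatial distance,
        # preserving local structure on high-resolution (low-level) maps.
        assign = torch.cdist(coords[drop_idx], coords[keep_idx]).argmin(dim=1)
    else:
        # Holistic aggregation: most similar kept token by cosine similarity,
        # pooling broader context on low-resolution (high-level) maps.
        sim = F.normalize(tokens[drop_idx], dim=-1) @ F.normalize(kept, dim=-1).T
        assign = sim.argmax(dim=1)

    # Merge each pruned token into its assigned kept token by averaging.
    merged = kept.clone()
    counts = torch.ones(k, 1)
    merged.index_add_(0, assign, tokens[drop_idx])
    counts.index_add_(0, assign, torch.ones(drop_idx.numel(), 1))
    return merged / counts, coords[keep_idx]


# Toy usage with made-up sizes, scores, and keep ratios.
torch.manual_seed(0)
low, low_xy, low_s = torch.randn(400, 256), torch.rand(400, 2), torch.rand(400)
high, high_xy, high_s = torch.randn(100, 256), torch.rand(100, 2), torch.rand(100)
low_out, _ = aggregate_level(low, low_xy, low_s, keep_ratio=0.5, proximal=True)
high_out, _ = aggregate_level(high, high_xy, high_s, keep_ratio=0.7, proximal=False)
print(low_out.shape, high_out.shape)  # torch.Size([200, 256]) torch.Size([70, 256])
```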
Cite
Text
Cheng et al. "Not All Tokens Matter All the Time: Dynamic Token Aggregation Towards Efficient Detection Transformers." Proceedings of the 42nd International Conference on Machine Learning, 2025. https://mlanthology.org/icml/2025/cheng2025icml-all/
BibTeX
@inproceedings{cheng2025icml-all,
title = {{Not All Tokens Matter All the Time: Dynamic Token Aggregation Towards Efficient Detection Transformers}},
author = {Cheng, Jiacheng and Yao, Xiwen and Yuan, Xiang and Han, Junwei},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {10144--10158},
volume = {267},
url = {https://mlanthology.org/icml/2025/cheng2025icml-all/}
}