TrackAny3D: Transferring Pretrained 3D Models for Category-Unified 3D Point Cloud Tracking

Abstract

3D LiDAR-based single object tracking (SOT) relies on sparse and irregular point clouds, and must cope with geometric variations in scale, motion patterns, and structural complexity across object categories. Current category-specific approaches achieve good accuracy but are impractical for real-world use, since they require a separate model for each category and generalize poorly. To tackle these issues, we propose TrackAny3D, the first framework to transfer large-scale pretrained 3D models for category-agnostic 3D SOT. We first integrate parameter-efficient adapters to bridge the gap between pretraining and tracking tasks while preserving geometric priors. Then, we introduce a Mixture-of-Geometry-Experts (MoGE) architecture that adaptively activates specialized subnetworks based on distinct geometric characteristics. Additionally, we design a temporal context optimization strategy that incorporates learnable temporal tokens and a dynamic mask weighting module to propagate historical information and mitigate temporal drift. Experiments on three commonly used benchmarks show that TrackAny3D establishes new state-of-the-art performance on category-agnostic 3D SOT, demonstrating strong generalization and competitiveness. We hope this work will draw the community's attention to the importance of unified models and further expand the use of large-scale pretrained models in this field.
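
The idea of adaptively activating specialized subnetworks can be illustrated with a generic mixture-of-experts routing step. The sketch below is not the MoGE design from the paper, whose details the abstract does not give. It assumes a linear gate, linear experts, and top-k routing with softmax-normalized weights, and every name in it (`moe_forward`, `gate_w`, `experts`, `top_k`) is hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def make_matrix(rows, cols, rng):
    """Random weight matrix, standing in for learned parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def moe_forward(x, gate_w, experts, top_k=2):
    """Route one feature vector through the top_k highest-scoring experts.

    Assumption: top-k routing with a linear gate. The paper may use a
    different gating rule or expert structure.
    """
    gate_scores = linear(gate_w, x)  # one score per expert
    top = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)[:top_k]
    weights = softmax([gate_scores[i] for i in top])  # renormalize over chosen experts
    out = [0.0] * len(experts[0])
    for w, i in zip(weights, top):
        y = linear(experts[i], x)  # output of expert i
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, dict(zip(top, weights))


rng = random.Random(0)
dim, n_experts = 4, 3
gate_w = make_matrix(n_experts, dim, rng)
experts = [make_matrix(dim, dim, rng) for _ in range(n_experts)]
feature = [0.5, -1.0, 0.25, 2.0]  # toy per-point geometric feature
output, routing = moe_forward(feature, gate_w, experts, top_k=2)
```

Here `routing` maps each selected expert index to its mixing weight, so only two of the three experts contribute to `output` for this input. Different input features would generally select different experts.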

Cite

Text

Wang et al. "TrackAny3D: Transferring Pretrained 3D Models for Category-Unified 3D Point Cloud Tracking." International Conference on Computer Vision, 2025.

Markdown

[Wang et al. "TrackAny3D: Transferring Pretrained 3D Models for Category-Unified 3D Point Cloud Tracking." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/wang2025iccv-trackany3d/)

BibTeX

@inproceedings{wang2025iccv-trackany3d,
  title     = {{TrackAny3D: Transferring Pretrained 3D Models for Category-Unified 3D Point Cloud Tracking}},
  author    = {Wang, Mengmeng and Wang, Haonan and Li, Yulong and Kong, Xiangjie and Du, Jiaxin and Shen, Guojiang and Xia, Feng},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {28249--28259},
  url       = {https://mlanthology.org/iccv/2025/wang2025iccv-trackany3d/}
}