TTT3R: 3D Reconstruction as Test-Time Training

Abstract

Modern Recurrent Neural Networks have become a competitive architecture for 3D reconstruction due to their linear-time complexity. However, their performance degrades significantly when applied beyond the training context length, revealing limited length generalization. In this work, we revisit 3D reconstruction foundation models from a Test-Time Training perspective, framing their designs as an online learning problem. Building on this perspective, we leverage the alignment confidence between the memory state and incoming observations to derive a closed-form learning rate for memory updates, which balances retaining historical information against adapting to new observations. This training-free intervention, termed TTT3R, substantially improves length generalization, achieving a $2\times$ improvement in global pose estimation over baselines, while operating at 20 FPS with just 6 GB of GPU memory to process thousands of images. Code is available at https://rover-xingyu.github.io/TTT3R.
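
The abstract does not state the update formula, so the following is only an illustrative sketch of the general idea of a confidence-gated memory update, not the authors' implementation. It assumes the memory state and the observation are sets of token vectors, that a candidate new state is already available, and that a per-token learning rate can be taken as the sigmoid of the mean scaled dot-product alignment between each state token and the observation tokens. The function name `gated_state_update`, the choice of sigmoid, and all shapes are hypothetical. Under these assumptions, the update is one gradient step on $\tfrac{1}{2}\lVert S - C\rVert^2$ with a per-token step size.

```python
import numpy as np


def gated_state_update(state, obs, candidate):
    """Blend the old memory state with a candidate state, token by token.

    state:     (n, d) memory tokens before the update
    obs:       (m, d) tokens of the incoming observation
    candidate: (n, d) proposed new memory tokens (assumed given)
    Returns the updated state and the per-token learning rate of shape (n, 1).
    """
    d = state.shape[-1]
    # Scaled dot-product alignment between every state token and every observation token.
    logits = state @ obs.T / np.sqrt(d)  # (n, m)
    # Hypothetical confidence: sigmoid of the mean alignment per state token, in (0, 1).
    lr = 1.0 / (1.0 + np.exp(-logits.mean(axis=1, keepdims=True)))  # (n, 1)
    # Gradient step on 0.5 * ||state - candidate||^2 with step size lr.
    new_state = state + lr * (candidate - state)
    return new_state, lr


# Toy usage with random tokens (values are arbitrary, for shape checking only).
rng = np.random.default_rng(0)
state = rng.normal(size=(4, 8))
obs = rng.normal(size=(6, 8))
candidate = rng.normal(size=(4, 8))
new_state, lr = gated_state_update(state, obs, candidate)
```

A learning rate near 0 keeps a memory token almost unchanged (retaining history), while a rate near 1 overwrites it with the candidate (adapting to the new observation).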

Cite

Text

Chen et al. "TTT3R: 3D Reconstruction as Test-Time Training." International Conference on Learning Representations, 2026.

Markdown

[Chen et al. "TTT3R: 3D Reconstruction as Test-Time Training." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/chen2026iclr-ttt3r/)

BibTeX

@inproceedings{chen2026iclr-ttt3r,
  title     = {{TTT3R: 3D Reconstruction as Test-Time Training}},
  author    = {Chen, Xingyu and Chen, Yue and Xiu, Yuliang and Geiger, Andreas and Chen, Anpei},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/chen2026iclr-ttt3r/}
}