Splat Feature Solver

Abstract

Feature lifting has emerged as a crucial component of 3D scene understanding, enabling rich image feature descriptors (e.g., DINO, CLIP) to be attached to splat-based 3D representations. The core challenge lies in optimally assigning these general-purpose attributes to 3D primitives while resolving inconsistencies across multi-view images. We present a unified, kernel- and feature-agnostic formulation of feature lifting as a sparse linear inverse problem that can be solved efficiently in closed form. Under convex losses, our approach admits a provable upper bound on the global optimal error, yielding high-quality lifted features. To address inconsistencies and noise in multi-view observations, we introduce two complementary regularization strategies that stabilize the solution and enhance semantic fidelity: Tikhonov Guidance enforces numerical stability through soft diagonal dominance, while Post-Lifting Aggregation filters noisy inputs via feature clustering. Extensive experiments demonstrate that our approach achieves state-of-the-art performance on open-vocabulary 3D segmentation benchmarks, outperforming training-based, grouping-based, and heuristic-forward baselines while producing the lifted features in minutes. A demo video, code, and a demo website are included in the supplementary material.
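To make the "linear inverse problem with Tikhonov regularization" idea concrete, the following is a minimal sketch of a generic ridge-regularized least-squares lift on a tiny dense example. It is not the paper's solver: the function name `lift_features`, the weight matrix `W` (per-pixel blending weight of each splat), the regularization strength `lam`, and the toy data are all assumptions chosen for illustration, and a practical system would use sparse matrices instead of dense NumPy arrays.

```python
import numpy as np

def lift_features(weights, pixel_features, lam=1e-3):
    """Generic ridge (Tikhonov) lift: argmin_X ||W X - F||^2 + lam * ||X||^2.

    weights:        (num_pixels, num_splats) blending weight of each splat per pixel
    pixel_features: (num_pixels, feat_dim) 2D feature descriptors
    Returns an array of shape (num_splats, feat_dim) with per-splat features.
    """
    W = np.asarray(weights, dtype=float)
    F = np.asarray(pixel_features, dtype=float)
    # Normal equations; adding lam to the diagonal keeps the system well conditioned.
    gram = W.T @ W + lam * np.eye(W.shape[1])
    return np.linalg.solve(gram, W.T @ F)

# Toy example (hypothetical data): 4 pixels observe 2 splats, 3-dimensional features.
W = np.array([[1.0, 0.0],
              [0.7, 0.3],
              [0.2, 0.8],
              [0.0, 1.0]])
true_splat_features = np.array([[1.0, 0.0, 2.0],
                                [0.0, 3.0, 1.0]])
F = W @ true_splat_features  # noise-free pixel observations
lifted = lift_features(W, F, lam=1e-8)
```

With noise-free observations and a very small `lam`, the recovered `lifted` array closely matches `true_splat_features`; a larger `lam` trades some of that accuracy for stability when the observations are noisy or inconsistent across views.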

Cite

Text

Xiong et al. "Splat Feature Solver." International Conference on Learning Representations, 2026.

Markdown

[Xiong et al. "Splat Feature Solver." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/xiong2026iclr-splat/)

BibTeX

@inproceedings{xiong2026iclr-splat,
  title     = {{Splat Feature Solver}},
  author    = {Xiong, Butian and Liu, Rong and Xu, Kenneth and Chen, Meida and Feng, Andrew},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/xiong2026iclr-splat/}
}