Splat Feature Solver
Abstract
Feature lifting has emerged as a crucial component in 3D scene understanding, enabling the attachment of rich image feature descriptors (e.g., DINO, CLIP) onto splat-based 3D representations. The core challenge lies in optimally assigning rich general attributes to 3D primitives while handling inconsistencies across multi-view images. We present a unified, kernel- and feature-agnostic formulation of the feature lifting problem as a sparse linear inverse problem, which can be solved efficiently in closed form. Under convex losses, our approach admits a provable upper bound on the global optimal error, yielding high-quality lifted features. To address inconsistencies and noise in multi-view observations, we introduce two complementary regularization strategies that stabilize the solution and enhance semantic fidelity: Tikhonov Guidance enforces numerical stability through soft diagonal dominance, while Post-Lifting Aggregation filters noisy inputs via feature clustering. Extensive experiments demonstrate that our approach achieves state-of-the-art performance on open-vocabulary 3D segmentation benchmarks, outperforming training-based, grouping-based, and heuristic-forward baselines while producing the lifted features in minutes. The demo video, code, and demo website are included in the supplementary material.
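To make the "sparse linear inverse problem solved in closed form" idea concrete, the following is a minimal sketch of generic Tikhonov-regularized least squares, not the authors' implementation. It assumes a hypothetical weight matrix `weights` (pixels by splats, e.g. blending weights) and 2D features `pixel_features` (pixels by feature dimension), and solves for per-splat features `X` minimizing `||A X - F||^2 + lam * ||X||^2`. The names, shapes, and the dense NumPy solve are all illustrative assumptions; adding `lam` to the diagonal is one simple way to push the system toward diagonal dominance.

```python
import numpy as np

def lift_features(weights, pixel_features, lam=1e-3):
    """Ridge (Tikhonov) solution X = (A^T A + lam * I)^{-1} A^T F.

    weights:        (n_pixels, n_splats) hypothetical blending weights A
    pixel_features: (n_pixels, feat_dim) 2D features F
    Returns:        (n_splats, feat_dim) per-splat features X
    """
    n_splats = weights.shape[1]
    gram = weights.T @ weights                # A^T A, shape (n_splats, n_splats)
    gram[np.diag_indices(n_splats)] += lam    # Tikhonov term strengthens the diagonal
    rhs = weights.T @ pixel_features          # A^T F
    return np.linalg.solve(gram, rhs)

# Synthetic example (assumed data): sparse non-negative weights, noiseless features.
rng = np.random.default_rng(0)
n_pixels, n_splats, feat_dim = 200, 20, 8
mask = rng.random((n_pixels, n_splats)) < 0.3          # each pixel sees few splats
weights = rng.random((n_pixels, n_splats)) * mask
true_features = rng.normal(size=(n_splats, feat_dim))
pixel_features = weights @ true_features

lifted = lift_features(weights, pixel_features, lam=1e-8)
print(np.abs(lifted - true_features).max())            # recovery error on clean data
```

On clean synthetic data a very small `lam` recovers the generating features closely; a larger `lam` trades fidelity for stability by shrinking the solution. For scene-scale problems a sparse representation and an iterative or factorized solver would presumably be needed instead of a dense solve.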
Cite
Text
Xiong et al. "Splat Feature Solver." International Conference on Learning Representations, 2026.
Markdown
[Xiong et al. "Splat Feature Solver." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/xiong2026iclr-splat/)
BibTeX
@inproceedings{xiong2026iclr-splat,
title = {{Splat Feature Solver}},
author = {Xiong, Butian and Liu, Rong and Xu, Kenneth and Chen, Meida and Feng, Andrew},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/xiong2026iclr-splat/}
}