Stylos: Multi-View 3D Stylization with Single-Forward Gaussian Splatting
Abstract
We present Stylos, a single-forward 3D Gaussian framework for 3D style transfer that operates on unposed content, from a single image to a multi-view collection, conditioned on a separate reference style image. Stylos synthesizes a stylized 3D Gaussian scene without per-scene optimization or precomputed poses, achieving geometry-aware, view-consistent stylization that generalizes to unseen categories, scenes, and styles. At its core, Stylos adopts a Transformer backbone with two pathways: geometry predictions retain self-attention to preserve geometric fidelity, while style is injected via cross-attention to enforce visual consistency across views. With the addition of a voxel-based 3D style loss that aligns aggregated scene features to style statistics, Stylos enforces view-consistent stylization while maintaining geometric coherence. Experiments across multiple datasets demonstrate that Stylos delivers high-quality zero-shot stylization, highlighting the effectiveness of the proposed style-content fusion block, the voxel-level style loss, and the scalability of our framework from single-view to large-scale multi-view settings. Our code is available at https://github.com/HanzhouLiu/Stylos.
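The voxel-based style loss mentioned above can be pictured with a small sketch. This is not the authors' implementation: it assumes that "aggregated scene features" means per-point features averaged within each voxel, and that "style statistics" means per-channel mean and standard deviation, as in common statistics-matching style losses. The names `voxel_aggregate`, `stat_style_loss`, and `voxel_size` are hypothetical and used only for illustration.

```python
import numpy as np

def voxel_aggregate(points, feats, voxel_size):
    """Average the features of all points that fall into the same voxel.

    points: (N, 3) 3D positions; feats: (N, C) per-point features.
    Returns an (M, C) array with one averaged feature row per occupied voxel.
    """
    # Integer voxel coordinates for every point.
    idx = np.floor(points / voxel_size).astype(np.int64)
    # Map each point to the index of its (unique) voxel.
    _, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_vox = int(inverse.max()) + 1
    # Sum features per voxel, then divide by the point count per voxel.
    sums = np.zeros((n_vox, feats.shape[1]), dtype=np.float64)
    np.add.at(sums, inverse, feats)
    counts = np.bincount(inverse, minlength=n_vox)[:, None]
    return sums / counts

def stat_style_loss(scene_feats, style_feats, eps=1e-5):
    """Assumed statistics-matching loss: squared gap between per-channel
    means and standard deviations of scene and style features."""
    mu_scene, mu_style = scene_feats.mean(axis=0), style_feats.mean(axis=0)
    sd_scene = np.sqrt(scene_feats.var(axis=0) + eps)
    sd_style = np.sqrt(style_feats.var(axis=0) + eps)
    return float(np.mean((mu_scene - mu_style) ** 2)
                 + np.mean((sd_scene - sd_style) ** 2))

# Toy usage: two points share a voxel, a third sits in its own voxel.
points = np.array([[0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [1.5, 0.1, 0.1]])
feats = np.array([[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]])
voxel_feats = voxel_aggregate(points, feats, voxel_size=1.0)
loss_same = stat_style_loss(voxel_feats, voxel_feats)
loss_diff = stat_style_loss(voxel_feats, voxel_feats + 1.0)
```

Because the features are pooled in 3D before any statistics are taken, every view of the scene contributes to one shared set of statistics, which is one plausible reading of why such a loss would encourage view-consistent stylization.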
Cite
Text
Liu et al. "Stylos: Multi-View 3D Stylization with Single-Forward Gaussian Splatting." International Conference on Learning Representations, 2026.
Markdown
[Liu et al. "Stylos: Multi-View 3D Stylization with Single-Forward Gaussian Splatting." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/liu2026iclr-stylos/)
BibTeX
@inproceedings{liu2026iclr-stylos,
title = {{Stylos: Multi-View 3D Stylization with Single-Forward Gaussian Splatting}},
author = {Liu, Hanzhou and Huang, Jia and Lu, Mi and Saripalli, Srikanth and Jiang, Peng},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/liu2026iclr-stylos/}
}