From Prediction to Perfection: Introducing Refinement to Autoregressive Image Generation

Cheng, Cheng; Song, Lin; An, Di; Xiao, Yicheng; Zhang, Xuchong; Sun, Hongbin; Shan, Ying

ICLR 2026

Abstract

Autoregressive (AR) models have emerged as a powerful framework for image generation, yet they remain bound by a fundamental limitation: once a prediction is made, it cannot be revised. Each step marches forward in a strict left-to-right sequence, causing small errors to accumulate and compromise the final image. In this work, we reimagine this process with TensorAR, a decoder-only AR model that shifts from predicting discrete tokens to predicting overlapping tensor windows. This simple change transforms image synthesis into a process of next-tensor prediction, enabling the model to refine earlier outputs while preserving the causal structure that defines autoregression. To guard against information leakage during training, we introduce a discrete tensor noising mechanism inspired by discrete diffusion theory, which injects categorical noise into input tensors. TensorAR is designed to be plug-and-play: unlike masked AR methods, it requires no architectural modifications, and unlike autoregressive diffusion, it preserves the familiar AR training paradigm. We evaluate TensorAR across both class-to-image and text-to-image tasks, showing consistent gains in generation quality and instruction-following ability, while achieving a superior balance between quality and latency. In doing so, TensorAR offers a new path forward for autoregressive generation: one where predictions are not just produced, but continually refined.
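
As a rough illustration of the two ideas named in the abstract (overlapping windows and categorical noise on the inputs), the toy sketch below may help. It is not the authors' implementation: the helper names `overlapping_windows` and `add_categorical_noise`, the window size `k`, the uniform replacement noise, and the probability `p` are all assumptions made for illustration. With stride-1 windows, an input window shares k-1 tokens with its target window, which is presumably the leakage the noising is meant to counter.

```python
import random

def overlapping_windows(tokens, k):
    """Return the k-token window starting at every position (stride 1)."""
    return [tokens[i:i + k] for i in range(len(tokens) - k + 1)]

def add_categorical_noise(window, vocab_size, p, rng):
    """Independently replace each token by a uniform random token id with probability p.

    Uniform replacement is an assumed noise model, chosen only for illustration.
    """
    return [rng.randrange(vocab_size) if rng.random() < p else t for t in window]

rng = random.Random(0)
tokens = [5, 1, 7, 3, 2, 6]          # toy sequence of discrete image token ids
windows = overlapping_windows(tokens, k=3)

# Next-tensor prediction in this sketch: window i (noised) is the input,
# the clean window i+1 is the target. Consecutive windows overlap in k-1 tokens.
inputs = [add_categorical_noise(w, vocab_size=8, p=0.25, rng=rng) for w in windows[:-1]]
targets = windows[1:]
```

Because each position appears in several consecutive target windows, a later step can emit a different value for a token predicted earlier, which is one way to read the refinement the abstract describes.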

Cite

Text

Cheng et al. "From Prediction to Perfection: Introducing Refinement to Autoregressive Image Generation." International Conference on Learning Representations, 2026.

Markdown

[Cheng et al. "From Prediction to Perfection: Introducing Refinement to Autoregressive Image Generation." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/cheng2026iclr-prediction/)

BibTeX

@inproceedings{cheng2026iclr-prediction,
  title     = {{From Prediction to Perfection: Introducing Refinement to Autoregressive Image Generation}},
  author    = {Cheng, Cheng and Song, Lin and An, Di and Xiao, Yicheng and Zhang, Xuchong and Sun, Hongbin and Shan, Ying},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/cheng2026iclr-prediction/}
}