Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning

Abstract

Effective visual representation learning is crucial for reinforcement learning (RL) agents to extract task-relevant information from raw sensory inputs and generalize across diverse environments. However, existing RL benchmarks lack the ability to systematically evaluate representation learning capabilities in isolation from other learning challenges. To address this gap, we introduce the Sliding Puzzles Gym (SPGym), a novel benchmark that transforms the classic 8-tile puzzle into a visual RL task with images drawn from arbitrarily large datasets. SPGym’s key innovation lies in its ability to precisely control representation learning complexity through adjustable grid sizes and image pools, while maintaining fixed environment dynamics, observation, and action spaces. This design enables researchers to isolate and scale the visual representation challenge independently of other learning components. Through extensive experiments with model-free and model-based RL algorithms, we uncover fundamental limitations in current methods’ ability to handle visual diversity. As we increase the pool of possible images, all algorithms exhibit in- and out-of-distribution performance degradation, with sophisticated representation learning techniques often underperforming simpler approaches like data augmentation. These findings highlight critical gaps in visual representation learning for RL and establish SPGym as a valuable tool for driving progress in robust, generalizable decision-making systems.
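The separation the abstract describes, between puzzle dynamics and visual content, can be pictured with a toy sketch. This is not the SPGym implementation or its API: the class `ToySlidingPuzzle`, its parameters, and the use of lists of labels in place of image patches are all assumptions made for illustration. The point is only that enlarging the image pool changes what the agent sees, while the moves and the solved check stay the same.

```python
import random

# Hypothetical names throughout; this is an illustrative toy, not the SPGym API.
# Actions move the blank cell: 0 = up, 1 = down, 2 = left, 3 = right.
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


class ToySlidingPuzzle:
    def __init__(self, grid_size=3, image_pool=None, seed=0):
        self.n = grid_size
        self.rng = random.Random(seed)
        # Each "image" is a list of n*n patch labels standing in for image tiles.
        self.image_pool = image_pool or [list(range(grid_size * grid_size))]
        self.reset()

    def reset(self, shuffle_moves=50):
        # A new image is sampled from the pool on every episode.
        self.image = self.rng.choice(self.image_pool)
        # Start solved, then apply random moves so the state stays solvable.
        self.tiles = list(range(self.n * self.n))  # value n*n - 1 marks the blank
        for _ in range(shuffle_moves):
            self.step(self.rng.choice(list(MOVES)))
        return self.observe()

    def step(self, action):
        blank = self.tiles.index(self.n * self.n - 1)
        row, col = divmod(blank, self.n)
        d_row, d_col = MOVES[action]
        new_row, new_col = row + d_row, col + d_col
        # Moves that would leave the grid are ignored (no-op).
        if 0 <= new_row < self.n and 0 <= new_col < self.n:
            target = new_row * self.n + new_col
            self.tiles[blank], self.tiles[target] = self.tiles[target], self.tiles[blank]
        return self.observe(), self.is_solved()

    def observe(self):
        # The agent sees patches of the sampled image, never the tile indices.
        blank_id = self.n * self.n - 1
        return [None if t == blank_id else self.image[t] for t in self.tiles]

    def is_solved(self):
        return self.tiles == list(range(self.n * self.n))
```

In this sketch, passing a larger `image_pool` increases the visual variety of observations without touching `step` or `is_solved`, which mirrors the idea of scaling the representation challenge independently of the dynamics.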

Cite

Text

De Oliveira et al. "Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[De Oliveira et al. "Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/deoliveira2025icml-sliding/)

BibTeX

@inproceedings{deoliveira2025icml-sliding,
  title     = {{Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning}},
  author    = {De Oliveira, Bryan Lincoln Marques and Martins, Luana Guedes Barros and Brandão, Bruno and Da Luz, Murilo Lopes and De Lima Soares, Telma Woerle and Carvalho Melo, Luckeciano},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {12689--12717},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/deoliveira2025icml-sliding/}
}