From End-to-End to Step-by-Step: Learning to Abstract via Abductive Reinforcement Learning

Abstract

Abstraction is a critical technique in general problem-solving, allowing complex tasks to be decomposed into smaller, manageable sub-tasks. Traditional symbolic planning constructs structured abstractions from predefined primitive symbols, but its dependence on formal representations limits its applicability to real-world tasks. Reinforcement learning, by contrast, excels at learning end-to-end policies directly from sensory inputs in unstructured environments, but struggles with compositional generalization in complex tasks with delayed rewards. In this paper, we propose Abductive Abstract Reinforcement Learning (A2RL), a novel neuro-symbolic RL framework based on Abductive Learning (ABL) that bridges the two paradigms, enabling RL agents to learn abstractions directly from raw sensory inputs without predefined symbols. A2RL induces a finite state machine to represent high-level, step-by-step procedures, where each abstract state corresponds to a sub-algebra of the original Markov Decision Process (MDP). This approach not only bridges the gap between symbolic abstraction and sub-symbolic learning but also provides a natural mechanism for the emergence of new symbols. Experiments show that A2RL can mitigate the delayed reward problem and improve generalization compared to traditional end-to-end RL methods.
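
To make the step-by-step idea concrete, the following is a minimal sketch of a finite state machine whose abstract states each own a low-level policy. It is not the authors' implementation: in A2RL the machine and its symbols are induced from raw sensory inputs, whereas here they are hand-written. The names `AbstractState`, `AbstractFSM`, `build_key_door_fsm`, and `run_episode`, as well as the toy key-and-door corridor, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Observation = Tuple[int, bool]  # (position, has_key) in a toy corridor (assumption)
Action = str                    # "left" or "right"


@dataclass
class AbstractState:
    """One step of the high-level procedure (hypothetical structure)."""
    name: str
    policy: Callable[[Observation], Action]  # low-level policy used in this step
    # Each transition is (predicate on the observation, name of next state).
    transitions: List[Tuple[Callable[[Observation], bool], str]] = field(
        default_factory=list
    )


class AbstractFSM:
    """Selects the active low-level policy from the current abstract state."""

    def __init__(self, states: List[AbstractState], start: str) -> None:
        self.states = {s.name: s for s in states}
        self.current = start

    def step(self, obs: Observation) -> Action:
        state = self.states[self.current]
        # Move to the next abstract state if any transition predicate fires.
        for predicate, target in state.transitions:
            if predicate(obs):
                self.current = target
                state = self.states[target]
                break
        return state.policy(obs)


def build_key_door_fsm() -> AbstractFSM:
    """Hand-written two-step procedure: fetch the key, then return to the door."""
    fetch_key = AbstractState(
        name="fetch_key",
        policy=lambda obs: "right",
        transitions=[(lambda obs: obs[1], "open_door")],  # fires once key is held
    )
    open_door = AbstractState(name="open_door", policy=lambda obs: "left")
    return AbstractFSM([fetch_key, open_door], start="fetch_key")


def run_episode(fsm: AbstractFSM, max_steps: int = 20) -> List[Tuple[str, Action]]:
    """Toy corridor: the key is at position 3 and the door is at position 0."""
    pos, has_key = 0, False
    trace = []
    for _ in range(max_steps):
        action = fsm.step((pos, has_key))
        trace.append((fsm.current, action))
        pos += 1 if action == "right" else -1
        if pos == 3:
            has_key = True
        if has_key and pos == 0:  # reached the door while holding the key
            break
    return trace


if __name__ == "__main__":
    for abstract_state, action in run_episode(build_key_door_fsm()):
        print(abstract_state, action)
```

In this sketch the only reward-relevant event (reaching the door with the key) occurs at the end of the episode, while the abstract states split the task into two shorter sub-tasks, which loosely mirrors how a step-by-step abstraction can ease the delayed reward problem.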

Cite

Text

Wang et al. "From End-to-End to Step-by-Step: Learning to Abstract via Abductive Reinforcement Learning." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/725

Markdown

[Wang et al. "From End-to-End to Step-by-Step: Learning to Abstract via Abductive Reinforcement Learning." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/wang2025ijcai-end/) doi:10.24963/IJCAI.2025/725

BibTeX

@inproceedings{wang2025ijcai-end,
  title     = {{From End-to-End to Step-by-Step: Learning to Abstract via Abductive Reinforcement Learning}},
  author    = {Wang, Zilong and Wang, Jiongda and Chen, Xiaoyong and Wang, Meng and Ma, Ming and Wang, ZhiPeng and Zhou, Zhenyu and Yang, Tianming and Dai, Wang-Zhou},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {6515--6523},
  doi       = {10.24963/IJCAI.2025/725},
  url       = {https://mlanthology.org/ijcai/2025/wang2025ijcai-end/}
}