Generative Proto-Sequence: Sequence-Level Decision Making for Long-Horizon Reinforcement Learning
Abstract
Deep reinforcement learning (DRL) methods often face challenges in environments characterized by large state spaces, long action horizons, and sparse rewards, where effective exploration and credit assignment are critical. We introduce Generative Proto-Sequence (GPS), a novel generative DRL approach that produces variable-length discrete action sequences. By generating an entire action sequence in a single decision rather than selecting individual actions at each timestep, GPS reduces the temporal decision bottleneck that impedes learning in long-horizon tasks. This sequence-level abstraction provides three key advantages: (1) it facilitates more effective credit assignment by directly connecting state observations with the outcomes of complete behavioral patterns; (2) by committing to coherent multi-step strategies, it enables more systematic exploration of the state space; and (3) it promotes better generalization by learning macro-behaviors that transfer across similar situations rather than memorizing state-specific responses. Evaluations across diverse maze navigation tasks of varying sizes and complexities demonstrate that GPS outperforms leading action-repetition and temporal-abstraction methods in the large majority of tested configurations, converging faster and achieving higher success rates.
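The core idea in the abstract — committing to a whole variable-length action sequence at each decision point instead of choosing one action per timestep — can be illustrated with a toy sketch. This is a hypothetical illustration, not the authors' GPS implementation: the environment (a 1-D corridor), the `seq_policy` interface, and the greedy stand-in policy are all assumptions made for demonstration.

```python
# Hypothetical sketch (NOT the authors' code): sequence-level decision
# making in a toy 1-D corridor. The agent starts at position 0 and must
# reach position `goal`; each action is a step of +1 or -1.

def run_sequence_level(goal, seq_policy, max_decisions=10):
    """At each decision point, obtain a whole action sequence from the
    policy and execute it open-loop, committing to every action in it."""
    pos, decisions = 0, 0
    while pos != goal and decisions < max_decisions:
        seq = seq_policy(pos, goal)   # one decision -> many actions
        for action in seq:            # commit to the full sequence
            pos += action
        decisions += 1
    return pos, decisions

def greedy_seq_policy(pos, goal):
    """Toy stand-in for a generative sequence policy: emit a
    variable-length run of unit steps toward the goal."""
    dist = goal - pos
    step = 1 if dist > 0 else -1
    return [step] * abs(dist)

pos, decisions = run_sequence_level(goal=7, seq_policy=greedy_seq_policy)
print(pos, decisions)  # -> 7 1: goal reached in a single decision
```

A per-step agent would need seven separate decisions to cover the same distance; collapsing them into one sequence-level decision is the "temporal decision bottleneck" reduction the abstract describes, though the real method learns the sequence generator rather than hand-coding it.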
Cite
Text
Fried et al. "Generative Proto-Sequence: Sequence-Level Decision Making for Long-Horizon Reinforcement Learning." Transactions on Machine Learning Research, 2025.
Markdown
[Fried et al. "Generative Proto-Sequence: Sequence-Level Decision Making for Long-Horizon Reinforcement Learning." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/fried2025tmlr-generative/)
BibTeX
@article{fried2025tmlr-generative,
  title = {{Generative Proto-Sequence: Sequence-Level Decision Making for Long-Horizon Reinforcement Learning}},
  author = {Fried, Netanel and Giladi, Liad and Katz, Gilad},
  journal = {Transactions on Machine Learning Research},
  year = {2025},
  url = {https://mlanthology.org/tmlr/2025/fried2025tmlr-generative/}
}