Scaling Combinatorial Optimization Neural Improvement Heuristics with Online Search and Adaptation

Abstract

We introduce Limited Rollout Beam Search (LRBS), a beam search strategy for deep reinforcement learning (DRL) based combinatorial optimization improvement heuristics. Using models pre-trained on the Euclidean Traveling Salesperson Problem, LRBS significantly improves both in-distribution performance and generalization to larger problem instances, achieving optimality gaps that outperform existing improvement heuristics and narrow the gap with state-of-the-art constructive methods. We further validate our results on two pickup-and-delivery TSP variants. Finally, we employ our search strategy for offline and online adaptation of the pre-trained improvement policy, which improves search performance and surpasses recent adaptive methods for constructive heuristics.

Cite

Text

Verdù et al. "Scaling Combinatorial Optimization Neural Improvement Heuristics with Online Search and Adaptation." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I25.34921

Markdown

[Verdù et al. "Scaling Combinatorial Optimization Neural Improvement Heuristics with Online Search and Adaptation." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/verdu2025aaai-scaling/) doi:10.1609/AAAI.V39I25.34921

BibTeX

@inproceedings{verdu2025aaai-scaling,
  title     = {{Scaling Combinatorial Optimization Neural Improvement Heuristics with Online Search and Adaptation}},
  author    = {Verdù, Federico Julian Camerota and Castelli, Lorenzo and Bortolussi, Luca},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {27135--27143},
  doi       = {10.1609/AAAI.V39I25.34921},
  url       = {https://mlanthology.org/aaai/2025/verdu2025aaai-scaling/}
}