Refining Hybrid Genetic Search for CVRP via Reinforcement Learning-Finetuned LLM

Abstract

While large language models (LLMs) are increasingly used as automated heuristic designers for vehicle routing problems (VRPs), current state-of-the-art methods predominantly rely on prompting massive, general-purpose models like GPT-4. This work challenges that paradigm by demonstrating that a smaller, specialized LLM, when meticulously fine-tuned, can generate components that surpass expert-crafted heuristics within advanced solvers. We propose RFTHGS, a novel Reinforcement learning (RL) framework for Fine-Tuning a compact LLM to generate high-performance crossover operators for the Hybrid Genetic Search (HGS) solver, applied to the Capacitated VRP (CVRP). Our method employs a multi-tiered, curriculum-based reward function that progressively guides the LLM to master generating first compilable, then executable, and finally, superior-performing operators that exceed human expert designs. This is coupled with an operator caching mechanism that discourages plagiarism and promotes diversity during training. Comprehensive experiments show that our fine-tuned LLM produces crossover operators which significantly outperform the expert-designed ones in HGS. The performance advantage remains consistent, generalizing from small-scale instances to large-scale problems with up to 1000 nodes. Furthermore, RFTHGS exceeds the performance of leading neuro-combinatorial baselines, prompt-based methods, and commercial LLMs such as GPT-4o and GPT-4o-mini.
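The multi-tiered reward and the operator cache described above can be pictured with a small sketch. It is not the authors' implementation: the abstract gives no reward values, cache keys, or interfaces, so every name here (`EvalResult`, `OperatorCache`, `tiered_reward`), the numeric tiers, and the whitespace-insensitive hashing are assumptions chosen only to illustrate the ordering "compilable, then executable, then better than the expert operator".

```python
import hashlib
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical outcome of evaluating one generated crossover operator."""
    compiled: bool      # did the generated source compile inside the solver?
    executed: bool      # did the solver run to completion with it?
    improvement: float  # assumed metric: > 0 means better than the expert operator


class OperatorCache:
    """Remembers operators already rewarded, so copies earn no further credit."""

    def __init__(self):
        self._seen = set()

    @staticmethod
    def _key(source):
        # Assumption: a crude whitespace-insensitive hash is enough to spot copies.
        normalized = "".join(source.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def seen_before(self, source):
        """Return True if this operator was already recorded; otherwise record it."""
        key = self._key(source)
        if key in self._seen:
            return True
        self._seen.add(key)
        return False


def tiered_reward(source, result, cache):
    """Illustrative tiers only; the numeric values are placeholders."""
    if not result.compiled:
        return -1.0   # tier 1: the code must compile
    if not result.executed:
        return -0.5   # tier 2: the code must run without failure
    if cache.seen_before(source):
        return 0.0    # duplicates of cached operators earn nothing
    if result.improvement <= 0.0:
        return 0.2    # runs, but does not beat the expert design
    return 1.0 + result.improvement  # tier 3: beats the expert design


cache = OperatorCache()
cache.seen_before("expert operator source")  # seed the cache so verbatim copies score 0
print(tiered_reward("new operator source", EvalResult(True, True, 0.5), cache))
```

Seeding the cache with the expert operator is one way to make verbatim copying unprofitable; whether the paper's cache works this way is not stated in the abstract.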

Cite

Text

Zhu et al. "Refining Hybrid Genetic Search for CVRP via Reinforcement Learning-Finetuned LLM." International Conference on Learning Representations, 2026.

Markdown

[Zhu et al. "Refining Hybrid Genetic Search for CVRP via Reinforcement Learning-Finetuned LLM." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zhu2026iclr-refining/)

BibTeX

@inproceedings{zhu2026iclr-refining,
  title     = {{Refining Hybrid Genetic Search for CVRP via Reinforcement Learning-Finetuned LLM}},
  author    = {Zhu, Rongjie and Zhang, Cong and Cao, Zhiguang},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/zhu2026iclr-refining/}
}