Preference-Based Deep Reinforcement Learning for Historical Route Estimation

Abstract

Recent Deep Reinforcement Learning (DRL) techniques have advanced solutions to Vehicle Routing Problems (VRPs). However, many of these methods focus exclusively on optimizing distance-oriented objectives (i.e., minimizing route length), often overlooking drivers' implicit route preferences. These preferences, which are crucial in practice, are challenging to model with traditional DRL approaches. To address this gap, we propose a preference-based DRL method, characterized by its reward design and optimization objective, that is specialized to learn historical route preferences. Our experiments demonstrate that the method aligns generated solutions more closely with human preferences. Moreover, it exhibits strong generalization across a variety of instances, offering a robust solution for diverse VRP scenarios.
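
The paper's exact reward design and objective are not reproduced here; as a rough illustration of the general idea of learning from historical route preferences, the sketch below trains a scoring model with a Bradley-Terry-style pairwise preference loss, a common formulation in preference-based RL. All names (RouteScorer, preference_loss, feature dimensions) are hypothetical, not from the paper.

# Minimal sketch, assuming pooled per-route feature vectors and pairwise
# preference data (historical driver-chosen route vs. a sampled alternative).
# Not the authors' method; an illustrative Bradley-Terry preference objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RouteScorer(nn.Module):
    """Scores a route encoding; a higher score means more preferred (hypothetical)."""
    def __init__(self, feat_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, route_feats: torch.Tensor) -> torch.Tensor:
        # route_feats: (batch, feat_dim) pooled features of each route
        return self.net(route_feats).squeeze(-1)

def preference_loss(scorer: RouteScorer,
                    preferred: torch.Tensor,
                    rejected: torch.Tensor) -> torch.Tensor:
    """Negative log-likelihood that the preferred route outscores the rejected one."""
    return -F.logsigmoid(scorer(preferred) - scorer(rejected)).mean()

# Usage: historical (driver-chosen) routes serve as preferred samples,
# sampled alternative routes as rejected ones; features here are placeholders.
scorer = RouteScorer(feat_dim=16)
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)
preferred = torch.randn(32, 16)
rejected = torch.randn(32, 16)
loss = preference_loss(scorer, preferred, rejected)
opt.zero_grad()
loss.backward()
opt.step()

A learned score of this kind can then stand in for (or augment) a hand-crafted distance reward when training a route-constructing DRL policy, which is one way to steer generated routes toward human preferences.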

Cite

Text

Pan et al. "Preference-Based Deep Reinforcement Learning for Historical Route Estimation." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/955

Markdown

[Pan et al. "Preference-Based Deep Reinforcement Learning for Historical Route Estimation." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/pan2025ijcai-preference/) doi:10.24963/IJCAI.2025/955

BibTeX

@inproceedings{pan2025ijcai-preference,
  title     = {{Preference-Based Deep Reinforcement Learning for Historical Route Estimation}},
  author    = {Pan, Boshen and Wu, Yaoxin and Cao, Zhiguang and Hou, Yaqing and Zou, Guangyu and Zhang, Qiang},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {8591--8599},
  doi       = {10.24963/IJCAI.2025/955},
  url       = {https://mlanthology.org/ijcai/2025/pan2025ijcai-preference/}
}