Branch Ranking for Efficient Mixed-Integer Programming via Offline Ranking-Based Policy Learning

Abstract

Deriving a good variable selection strategy in branch-and-bound is essential for the efficiency of modern mixed-integer programming (MIP) solvers. With MIP branching data collected during the previous solution process, learning-to-branch methods have recently become superior to heuristics. As branch-and-bound is naturally a sequential decision-making task, one should learn to optimize the utility of the whole MIP solving process rather than act myopically at each step. In this work, we formulate learning to branch as an offline reinforcement learning (RL) problem and propose a long-sighted hybrid search scheme to construct the offline MIP dataset, which values the long-term utilities of branching decisions. During the policy training phase, we deploy a ranking-based reward assignment scheme to identify promising samples from either a long-term or a short-term view, and train the branching model, named Branch Ranking, via offline policy learning. Experiments on synthetic MIP benchmarks and real-world tasks demonstrate that Branch Ranking is more efficient and robust, and generalizes better to large-scale MIP instances, than widely used heuristics and state-of-the-art learning-based branching models.
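
The ranking-based reward assignment mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring functions, thresholds, or training loss, so everything below is an assumption made for illustration: the field names `long_term` and `short_term`, the `top_frac` cutoff, the binary reward values, and the reward-weighted negative log-likelihood are hypothetical and are not taken from the paper.

```python
import math


def assign_rank_rewards(samples, top_frac=0.1, high=1.0, low=0.0):
    """Mark a sample as promising if it ranks in the top fraction by
    long-term score OR by short-term score (hypothetical scheme)."""
    n = len(samples)
    k = max(1, math.ceil(top_frac * n))  # number of top-ranked samples kept per view
    top_long = sorted(range(n), key=lambda i: samples[i]["long_term"], reverse=True)[:k]
    top_short = sorted(range(n), key=lambda i: samples[i]["short_term"], reverse=True)[:k]
    promising = set(top_long) | set(top_short)
    return [high if i in promising else low for i in range(n)]


def weighted_nll(log_probs, rewards):
    """Reward-weighted negative log-likelihood of the logged branching actions.
    Samples with zero reward do not contribute to the loss."""
    total = sum(rewards)
    if total == 0:
        return 0.0
    return -sum(r * lp for r, lp in zip(rewards, log_probs)) / total


# Toy offline dataset: each entry is one logged branching decision.
samples = [
    {"long_term": 5.0, "short_term": 1.0},
    {"long_term": 1.0, "short_term": 9.0},
    {"long_term": 2.0, "short_term": 2.0},
    {"long_term": 0.0, "short_term": 0.0},
]
rewards = assign_rank_rewards(samples, top_frac=0.25)

# Log-probabilities a policy assigns to the logged actions (made-up values).
log_probs = [math.log(0.5), math.log(0.25), math.log(0.1), math.log(0.1)]
loss = weighted_nll(log_probs, rewards)
```

In this toy setup the first sample is kept for its long-term score and the second for its short-term score, so only those two decisions influence the policy update. A real implementation would replace the made-up log-probabilities with the output of a branching model.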

Cite

Text

Huang et al. "Branch Ranking for Efficient Mixed-Integer Programming via Offline Ranking-Based Policy Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022. doi:10.1007/978-3-031-26419-1_23

Markdown

[Huang et al. "Branch Ranking for Efficient Mixed-Integer Programming via Offline Ranking-Based Policy Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022.](https://mlanthology.org/ecmlpkdd/2022/huang2022ecmlpkdd-branch/) doi:10.1007/978-3-031-26419-1_23

BibTeX

@inproceedings{huang2022ecmlpkdd-branch,
  title     = {{Branch Ranking for Efficient Mixed-Integer Programming via Offline Ranking-Based Policy Learning}},
  author    = {Huang, Zeren and Chen, Wenhao and Zhang, Weinan and Shi, Chuhan and Liu, Furui and Zhen, Hui-Ling and Yuan, Mingxuan and Hao, Jianye and Yu, Yong and Wang, Jun},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2022},
  pages     = {377--392},
  doi       = {10.1007/978-3-031-26419-1_23},
  url       = {https://mlanthology.org/ecmlpkdd/2022/huang2022ecmlpkdd-branch/}
}