A Markov Decision Process for Variable Selection in Branch & Bound

Abstract

Mixed-Integer Linear Programming (MILP) is a powerful framework used to address a wide range of NP-hard combinatorial optimization problems, often solved by Branch and bound (B&B). A key factor influencing the performance of B&B solvers is the variable selection heuristic governing branching decisions. Recent contributions have sought to adapt reinforcement learning (RL) algorithms to the B&B setting to learn optimal branching policies, through Markov Decision Processes (MDP) inspired formulations, and ad hoc convergence theorems and algorithms. In this work, we introduce BBMDP, a principled vanilla MDP formulation for variable selection in B&B, allowing to leverage a broad range of RL algorithms for the purpose of learning optimal B&B heuristics. Computational experiments validate our model empirically, as our branching agent outperforms prior state-of-the-art RL agents on four standard MILP benchmarks.

Cite

Text

Strang et al. "A Markov Decision Process for Variable Selection in Branch & Bound." Advances in Neural Information Processing Systems, 2025.

Markdown

[Strang et al. "A Markov Decision Process for Variable Selection in Branch & Bound." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/strang2025neurips-markov/)

BibTeX

@inproceedings{strang2025neurips-markov,
  title     = {{A Markov Decision Process for Variable Selection in Branch & Bound}},
  author    = {Strang, Paul and Ales, Zacharie and Bissuel, Côme and Juan, Olivier and Kedad-Sidhoum, Safia and Rachelson, Emmanuel},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/strang2025neurips-markov/}
}