Towards One-Shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case

Abstract

One-shot, non-autoregressive neural networks, unlike RL-based solvers, have been actively adopted for solving combinatorial optimization (CO) problems and can be trained on the objective score in a self-supervised manner. Such methods offer efficiency advantages (e.g., through parallelization) and hold promise for predictive CO problems arising in decision-making under uncertainty. However, discrete constraints remain a bottleneck for gradient-based neural solvers and are currently handled in three typical ways: 1) adding a soft penalty to the objective, which cannot guarantee a bounded constraint violation, a critical shortcoming in many constraint-sensitive scenarios; 2) perturbing the input to generate approximate gradients in a black-box manner, which obeys the constraints exactly but whose approximate gradients can hurt performance on the objective score; 3) a compromise that develops soft algorithms whose network outputs obey only a relaxed constraint, so an arbitrary degree of constraint violation can still occur. Towards the ultimate goal of a general framework for neural CO solvers with controllable, arbitrarily small constraint violation, this paper focuses on a more achievable and common setting: cardinality constraints, which can in fact be readily encoded by a differentiable optimal transport (OT) layer. Based on this observation, we propose OT-based cardinality constraint encoding for end-to-end CO learning, with two variants, Sinkhorn and Gumbel-Sinkhorn, whose constraint violations can be exactly characterized and bounded by our theoretical results. On synthetic and real-world CO problem instances, our methods surpass a state-of-the-art CO network and are comparable to (if not better than) the commercial solver Gurobi. We further present a case study applying our approach to predictive portfolio optimization on real-world asset price data, improving the Sharpe ratio of a strong LSTM+Gurobi baseline under the classic predict-then-optimize paradigm from 1.1 to 2.0.
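
To make the core idea concrete, here is a minimal sketch (not the authors' actual implementation) of the kind of differentiable OT layer the abstract describes: selecting k of n items is relaxed as entropic optimal transport between the n items and two bins of capacity k ("selected") and n - k ("unselected"), solved by Sinkhorn iterations. All names and hyperparameters here (`sinkhorn_topk`, `tau`, `n_iter`) are illustrative assumptions.

```python
import torch

def sinkhorn_topk(scores, k, tau=0.05, n_iter=100):
    """Soft top-k via entropic OT (illustrative sketch, not the paper's code).

    Each of the n items carries unit mass; the "selected" bin has capacity k,
    the "unselected" bin capacity n - k. Sinkhorn iterations alternately
    normalize the rows and columns of the transport plan in log space.
    Returns soft selection probabilities in [0, 1] that sum to k and are
    differentiable with respect to `scores`.
    """
    n = scores.shape[-1]
    # Assignment cost: a higher score makes the "selected" bin cheaper.
    cost = torch.stack([-scores, scores], dim=-1)              # shape (n, 2)
    log_p = -cost / tau                                        # unnormalized log plan
    log_c = torch.log(torch.tensor([float(k), float(n - k)]))  # column marginals
    for _ in range(n_iter):
        # Row step: each item's total mass is 1 (log 1 = 0, so nothing added).
        log_p = log_p - torch.logsumexp(log_p, dim=-1, keepdim=True)
        # Column step: the two bins receive masses k and n - k.
        log_p = log_p - torch.logsumexp(log_p, dim=0, keepdim=True) + log_c
    return log_p[:, 0].exp()  # soft membership in the "selected" bin

scores = torch.randn(10, requires_grad=True)
probs = sinkhorn_topk(scores, k=3)
print(probs.sum())  # ~3.0; gradients flow back to `scores`
```

Under the usual construction, the Gumbel-Sinkhorn variant mentioned in the abstract would add Gumbel noise to `scores` before the Sinkhorn iterations to produce stochastic selections. As the temperature `tau` shrinks (or `n_iter` grows), the soft output approaches a hard top-k indicator; the paper's theoretical results characterize and bound how far such relaxed outputs can violate the cardinality constraint.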

Cite

Text

Wang et al. "Towards One-Shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case." International Conference on Learning Representations, 2023.

Markdown

[Wang et al. "Towards One-Shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/wang2023iclr-oneshot/)

BibTeX

@inproceedings{wang2023iclr-oneshot,
  title     = {{Towards One-Shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case}},
  author    = {Wang, Runzhong and Shen, Li and Chen, Yiting and Yang, Xiaokang and Tao, Dacheng and Yan, Junchi},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/wang2023iclr-oneshot/}
}