Multinoulli Extension: A Lossless yet Effective Probabilistic Framework for Subset Selection over Partition Constraints

Abstract

Identifying the most representative subset for a close-to-submodular objective while satisfying the predefined partition constraint is a fundamental task with numerous applications in machine learning. However, the existing distorted local-search methods are often hindered by their prohibitive query complexities and the rigid requirement for prior knowledge of difficult-to-obtain structural parameters. To overcome these limitations, we introduce a novel algorithm titled Multinoulli-SCG, which not only is parameter-free, but also can achieve the same approximation guarantees as the distorted local-search methods with significantly fewer function evaluations. The core of our Multinoulli-SCG algorithm is an innovative continuous-relaxation framework named Multinoulli Extension(ME), which can effectively convert the discrete subset selection problem subject to partition constraints into a solvable continuous maximization focused on learning the optimal multinoulli priors across the considered partition. In sharp contrast with the well-established multi-linear extension for submodular subset selection, a notable advantage of our proposed ME is its intrinsic capacity to provide a lossless rounding scheme for any set function. Finally, we validate the practical efficacy of our proposed algorithms by applying them to video summarization, bayesian A-optimal design and coverage maximization.

Cite

Text

Zhang et al. "Multinoulli Extension: A Lossless yet Effective Probabilistic Framework for Subset Selection over Partition Constraints." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Zhang et al. "Multinoulli Extension: A Lossless yet Effective Probabilistic Framework for Subset Selection over Partition Constraints." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/zhang2025icml-multinoulli/)

BibTeX

@inproceedings{zhang2025icml-multinoulli,
  title     = {{Multinoulli Extension: A Lossless yet Effective Probabilistic Framework for Subset Selection over Partition Constraints}},
  author    = {Zhang, Qixin and Huang, Wei and Jin, Can and Zhao, Puning and Shu, Yao and Shen, Li and Tao, Dacheng},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {75030-75066},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/zhang2025icml-multinoulli/}
}