Multinoulli Extension: A Lossless yet Effective Probabilistic Framework for Subset Selection over Partition Constraints
Abstract
Identifying the most representative subset for a close-to-submodular objective while satisfying the predefined partition constraint is a fundamental task with numerous applications in machine learning. However, the existing distorted local-search methods are often hindered by their prohibitive query complexities and the rigid requirement for prior knowledge of difficult-to-obtain structural parameters. To overcome these limitations, we introduce a novel algorithm titled Multinoulli-SCG, which not only is parameter-free, but also can achieve the same approximation guarantees as the distorted local-search methods with significantly fewer function evaluations. The core of our Multinoulli-SCG algorithm is an innovative continuous-relaxation framework named Multinoulli Extension(ME), which can effectively convert the discrete subset selection problem subject to partition constraints into a solvable continuous maximization focused on learning the optimal multinoulli priors across the considered partition. In sharp contrast with the well-established multi-linear extension for submodular subset selection, a notable advantage of our proposed ME is its intrinsic capacity to provide a lossless rounding scheme for any set function. Finally, we validate the practical efficacy of our proposed algorithms by applying them to video summarization, bayesian A-optimal design and coverage maximization.
Cite
Text
Zhang et al. "Multinoulli Extension: A Lossless yet Effective Probabilistic Framework for Subset Selection over Partition Constraints." Proceedings of the 42nd International Conference on Machine Learning, 2025.Markdown
[Zhang et al. "Multinoulli Extension: A Lossless yet Effective Probabilistic Framework for Subset Selection over Partition Constraints." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/zhang2025icml-multinoulli/)BibTeX
@inproceedings{zhang2025icml-multinoulli,
title = {{Multinoulli Extension: A Lossless yet Effective Probabilistic Framework for Subset Selection over Partition Constraints}},
author = {Zhang, Qixin and Huang, Wei and Jin, Can and Zhao, Puning and Shu, Yao and Shen, Li and Tao, Dacheng},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {75030-75066},
volume = {267},
url = {https://mlanthology.org/icml/2025/zhang2025icml-multinoulli/}
}