Mol-MoE: Training Preference-Guided Routers for Molecule Generation

Abstract

Recent advances in language models have enabled framing molecule generation as sequence modeling. However, existing approaches often rely on single-objective reinforcement learning, limiting their applicability to real-world drug design, where multiple competing properties must be optimized. Traditional multi-objective reinforcement learning (MORL) methods require costly retraining for each new objective combination, making rapid exploration of trade-offs impractical. To overcome these limitations, we introduce Mol-MoE, a mixture-of-experts (MoE) architecture that enables efficient test-time steering of molecule generation without retraining. Central to our approach is a preference-based router training objective that incentivizes the router to combine experts in a way that aligns with user-specified trade-offs. This provides improved flexibility for exploring the chemical property space at test time. Benchmarking against state-of-the-art methods, we show that Mol-MoE achieves superior sample quality and steerability.
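
To illustrate the idea, the sketch below shows one plausible shape for a preference-guided router: a small network maps a user-specified preference vector over K properties to mixture weights over K property-specialized expert models, whose next-token logits are then combined. This is a minimal sketch under assumed details (one expert per property, logit-level mixing); the names PreferenceRouter and mix_next_token_logits are hypothetical, not the paper's API.

import torch
import torch.nn as nn

class PreferenceRouter(nn.Module):
    """Maps a preference vector over K properties to mixture weights
    over K property-specialized experts (hypothetical sketch)."""

    def __init__(self, num_experts: int, hidden_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_experts, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_experts),
        )

    def forward(self, preferences: torch.Tensor) -> torch.Tensor:
        # preferences: (batch, K) user-specified trade-off weights
        return torch.softmax(self.net(preferences), dim=-1)

def mix_next_token_logits(expert_logits: torch.Tensor,
                          routing_weights: torch.Tensor) -> torch.Tensor:
    # expert_logits: (K, batch, vocab) per-expert next-token logits
    # routing_weights: (batch, K) mixture weights from the router
    return torch.einsum("kbv,bk->bv", expert_logits, routing_weights)

# Example: steer generation toward a 70/30 trade-off between two properties.
router = PreferenceRouter(num_experts=2)
w = torch.tensor([[0.7, 0.3]])              # user preference vector
alphas = router(w)                          # (1, 2) expert mixture weights
expert_logits = torch.randn(2, 1, 100)      # stand-in for real expert outputs
combined = mix_next_token_logits(expert_logits, alphas)

Under this sketch, changing only w at test time would shift generation toward a different property trade-off with no retraining, which is the steerability the abstract emphasizes.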

Cite

Text

Calanzone et al. "Mol-MoE: Training Preference-Guided Routers for Molecule Generation." ICLR 2025 Workshops: GEM, 2025.

Markdown

[Calanzone et al. "Mol-MoE: Training Preference-Guided Routers for Molecule Generation." ICLR 2025 Workshops: GEM, 2025.](https://mlanthology.org/iclrw/2025/calanzone2025iclrw-molmoe/)

BibTeX

@inproceedings{calanzone2025iclrw-molmoe,
  title     = {{Mol-MoE: Training Preference-Guided Routers for Molecule Generation}},
  author    = {Calanzone, Diego and D'Oro, Pierluca and Bacon, Pierre-Luc},
  booktitle = {ICLR 2025 Workshops: GEM},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/calanzone2025iclrw-molmoe/}
}