Enhancing Molecular Conformer Generation via Fragment-Augmented Diffusion Pretraining
Abstract
Recent advances in diffusion-based methods have shown promising results for molecular conformer generation, yet their performance remains constrained by training data scarcity, particularly for structurally complex molecules. In this work, we present Fragment-Augmented Diffusion (FragDiff), a data-centric augmentation strategy that incorporates chemical fragmentation techniques into the pre-training phase of modern diffusion-based generative models. Our key innovation lies in decomposing molecules into chemically meaningful fragments that serve as building blocks for systematic data augmentation, enabling the diffusion model to learn enhanced local geometry while maintaining global molecular topology. Unlike existing approaches that focus on complex architectural modifications, FragDiff adopts a data-centric paradigm orthogonal to model design. Comprehensive benchmarks show FragDiff's superior performance, especially in data-scarce scenarios. Notably, pretraining on fragments yields a 12.2% to 13.4% performance improvement on molecules 3$\times$ beyond the training scale. Overall, we establish a new paradigm integrating chemical fragmentation with diffusion models, advancing computational chemistry workflows. The code is available at https://github.com/ShawnKS/fragdiff.
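The abstract does not say which fragmentation rules FragDiff uses, so the following is only a minimal sketch of the general idea of splitting a molecular graph into fragments. It assumes the bonds to cut are already chosen by some chemical rule; the function name `fragment_molecule`, the `cut_bonds` argument, and the ethylbenzene toy example are all hypothetical and are not taken from the paper or its repository.

```python
from collections import defaultdict


def fragment_molecule(num_atoms, bonds, cut_bonds):
    """Return the atom-index fragments left after deleting `cut_bonds`.

    Hypothetical helper: how cut bonds are selected (the chemically
    meaningful part) is assumed to be decided elsewhere.
    """
    cut = {frozenset(bond) for bond in cut_bonds}

    # Build an adjacency list that skips every bond marked for cutting.
    adjacency = defaultdict(list)
    for a, b in bonds:
        if frozenset((a, b)) not in cut:
            adjacency[a].append(b)
            adjacency[b].append(a)

    # Each connected component of the remaining graph is one fragment.
    seen = set()
    fragments = []
    for start in range(num_atoms):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            atom = stack.pop()
            component.append(atom)
            for neighbor in adjacency[atom]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        fragments.append(sorted(component))
    return fragments


# Toy example (assumption): heavy atoms of ethylbenzene.
# Atoms 0-5 form the ring, atom 6 is the CH2 carbon, atom 7 is the CH3 carbon.
BONDS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6), (6, 7)]

# Cutting the ring-to-chain bond gives a ring fragment and an ethyl fragment.
fragments = fragment_molecule(8, BONDS, cut_bonds=[(0, 6)])
print(fragments)
```

In a pretraining setup of the kind the abstract describes, each index list could be used to slice the parent conformer's 3D coordinates, so that every fragment becomes an additional training sample; that pairing is likewise an assumption here.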
Cite
Text
Song et al. "Enhancing Molecular Conformer Generation via Fragment-Augmented Diffusion Pretraining." Transactions on Machine Learning Research, 2025.
Markdown
[Song et al. "Enhancing Molecular Conformer Generation via Fragment-Augmented Diffusion Pretraining." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/song2025tmlr-enhancing/)
BibTeX
@article{song2025tmlr-enhancing,
title = {{Enhancing Molecular Conformer Generation via Fragment-Augmented Diffusion Pretraining}},
author = {Song, Xiaozhuang and Tu, Yuzhao and Yu, Tianshu},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/song2025tmlr-enhancing/}
}