Physical Consistency Bridges Heterogeneous Data in Molecular Multi-Task Learning

Abstract

In recent years, machine learning has demonstrated impressive capability in handling molecular science tasks. To support various molecular properties at scale, machine learning models are trained in the multi-task learning paradigm. Nevertheless, data for different molecular properties are often not aligned: some quantities, e.g. equilibrium structure, are more costly to compute than others, e.g. energy, so their data are often generated by cheaper computational methods at the cost of lower accuracy, a limitation that multi-task learning alone cannot directly overcome. Moreover, it is not straightforward to leverage abundant data from other tasks to benefit a particular task. To handle such data heterogeneity challenges, we exploit a distinctive feature of molecular tasks, namely that physical laws connect them, and design consistency training approaches that allow different tasks to exchange information directly so as to improve one another. In particular, we demonstrate that more accurate energy data can improve the accuracy of structure prediction. We also find that consistency training can directly leverage force and off-equilibrium structure data to improve structure prediction, demonstrating a broad capability for integrating heterogeneous data.
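
The physical link between energy and structure mentioned above can be illustrated with a toy sketch. It is not the authors' method or losses: it assumes a one-dimensional "molecule" described by a single bond length, a harmonic stand-in for an energy model, and hypothetical names (`energy`, `force`, `consistency_loss`, `refine_structure`, `K`, `R0`). It relies only on two standard facts: force is the negative gradient of energy, and an equilibrium structure is a stationary point of the energy.

```python
# Toy sketch only: a 1-D "molecule" whose single coordinate is a bond length r.
# Every name and constant here is hypothetical and chosen for illustration.

K = 2.0    # assumed spring constant of the toy energy model
R0 = 1.5   # assumed equilibrium bond length under the toy energy model


def energy(r: float) -> float:
    """Toy stand-in for an energy-prediction model: a harmonic potential."""
    return 0.5 * K * (r - R0) ** 2


def force(r: float, h: float = 1e-5) -> float:
    """Force as the negative derivative of energy (central finite difference)."""
    return -(energy(r + h) - energy(r - h)) / (2.0 * h)


def consistency_loss(r_pred: float) -> float:
    """Squared force at a predicted structure.

    The penalty vanishes only where the energy model has zero force,
    i.e. where the predicted structure is a stationary point of the energy.
    """
    return force(r_pred) ** 2


def refine_structure(r_pred: float, lr: float = 0.1, steps: int = 200) -> float:
    """Move the predicted structure along the force (gradient descent on energy)."""
    r = r_pred
    for _ in range(steps):
        r += lr * force(r)
    return r


if __name__ == "__main__":
    r_cheap = 2.0  # a deliberately inaccurate "structure label"
    r_refined = refine_structure(r_cheap)
    print(consistency_loss(r_cheap), consistency_loss(r_refined), r_refined)
```

In this toy setting, an inaccurate structure has a nonzero consistency loss under the energy model, and following the force drives it toward the energy minimum. This is the sense in which a more accurate energy model can carry information about structure.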

Cite

Text

Ren et al. "Physical Consistency Bridges Heterogeneous Data in Molecular Multi-Task Learning." Neural Information Processing Systems, 2024. doi:10.52202/079017-2330

Markdown

[Ren et al. "Physical Consistency Bridges Heterogeneous Data in Molecular Multi-Task Learning." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/ren2024neurips-physical/) doi:10.52202/079017-2330

BibTeX

@inproceedings{ren2024neurips-physical,
  title     = {{Physical Consistency Bridges Heterogeneous Data in Molecular Multi-Task Learning}},
  author    = {Ren, Yuxuan and Zheng, Dihan and Liu, Chang and Jin, Peiran and Shi, Yu and Huang, Lin and He, Jiyan and Luo, Shengjie and Qin, Tao and Liu, Tie-Yan},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-2330},
  url       = {https://mlanthology.org/neurips/2024/ren2024neurips-physical/}
}