MoleCLUEs: Molecular Conformers Maximally In-Distribution for Predictive Models
Abstract
Structure-based molecular ML (SBML) models can be highly sensitive to input geometries and give predictions with large variance. We present an approach to mitigate the challenge of selecting conformations for such models by generating conformers that explicitly minimize predictive uncertainty. To achieve this, we compute estimates of aleatoric and epistemic uncertainties that are differentiable w.r.t. latent posteriors. We then iteratively sample new latents in the direction of lower uncertainty by gradient descent. As we train our predictive models jointly with a conformer decoder, the new latent embeddings can be mapped to their corresponding inputs, which we call MoleCLUEs, or (molecular) counterfactual latent uncertainty explanations (Antorán et al., 2020). We assess our algorithm for the task of predicting drug properties from 3D structure with maximum confidence. We additionally analyze the structure trajectories obtained from conformer optimizations, which provide insight into the sources of uncertainty in SBML.
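The core loop described in the abstract — gradient descent on latent embeddings toward lower predictive uncertainty, followed by decoding back to a conformer — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the quadratic `uncertainty` function, its analytic gradient, and the `refine_latent` helper are all hypothetical stand-ins for the authors' differentiable aleatoric/epistemic estimators and jointly trained conformer decoder.

```python
import numpy as np

def uncertainty(z):
    # Toy stand-in for a differentiable uncertainty estimate over a
    # latent embedding z. The paper combines aleatoric and epistemic
    # terms computed from the model's latent posterior.
    return float(np.sum(z ** 2))

def uncertainty_grad(z):
    # Analytic gradient of the toy uncertainty above; in practice this
    # would come from automatic differentiation through the model.
    return 2.0 * z

def refine_latent(z0, lr=0.1, steps=100):
    """Iteratively move a latent toward lower uncertainty.

    Returns the refined latent and the full trajectory. In the paper,
    each latent along the trajectory can be decoded to a conformer
    (a "MoleCLUE"); here we only track the latents themselves.
    """
    z = np.asarray(z0, dtype=float).copy()
    trajectory = [z.copy()]
    for _ in range(steps):
        z = z - lr * uncertainty_grad(z)
        trajectory.append(z.copy())
    return z, trajectory

z_final, traj = refine_latent(np.array([1.0, -2.0]))
print(uncertainty(traj[0]), uncertainty(z_final))
```

Keeping the trajectory mirrors the abstract's analysis of structure trajectories: inspecting how the decoded conformers change along the descent path is what gives insight into where the uncertainty comes from.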
Cite
Text
Maser et al. "MoleCLUEs: Molecular Conformers Maximally In-Distribution for Predictive Models." NeurIPS 2023 Workshops: AI4Science, 2023.
Markdown
[Maser et al. "MoleCLUEs: Molecular Conformers Maximally In-Distribution for Predictive Models." NeurIPS 2023 Workshops: AI4Science, 2023.](https://mlanthology.org/neuripsw/2023/maser2023neuripsw-moleclues/)
BibTeX
@inproceedings{maser2023neuripsw-moleclues,
title = {{MoleCLUEs: Molecular Conformers Maximally In-Distribution for Predictive Models}},
author = {Maser, Michael and Tagasovska, Natasa and Lee, Jae Hyeon and Watkins, Andrew Martin},
booktitle = {NeurIPS 2023 Workshops: AI4Science},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/maser2023neuripsw-moleclues/}
}