PALM: Probabilistic Area Loss Minimization for Protein Sequence Alignment
Abstract
Protein sequence alignment is a fundamental problem in computational structural biology and is widely used for protein 3D structure prediction and protein homology detection. Most existing programs for detecting protein sequence alignments rely on the likelihood information of amino acids and are sensitive to alignment noise. We present PALM, a novel method for modeling pairwise protein structure alignments that uses the area distance to reduce the effect of biological measurement noise. PALM generatively learns the alignment of two protein sequences with a probabilistic area distance objective, which can denoise the measurement errors contained in the ground-truth alignments. During learning, we show that the optimization is computationally efficient when gradients are estimated by dynamically sampling alignments. Empirically, we show that PALM generates sequence alignments with higher precision and recall, as well as smaller area distance, than competing methods, especially for long protein sequences and remote homologies. This study suggests that PALM is a promising option for learning over large-scale protein sequence alignment problems.
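The abstract does not define the area distance, so the following is only an illustrative sketch under a stated assumption: that the distance between two alignments of the same sequence pair is the area enclosed between their paths in the alignment grid. The operation encoding (`M` for a match, `D` for a gap in the second sequence, `I` for a gap in the first) and the function names are hypothetical and are not taken from the paper.

```python
def column_profile(ops):
    """Return the path height over each unit step along sequence 1.

    ops is a string of alignment operations (hypothetical encoding):
      'M' advances both sequences, 'D' advances only sequence 1,
      'I' advances only sequence 2.
    Each profile entry is (y_start, y_end) for one unit interval of x.
    """
    profile = []
    y = 0
    for op in ops:
        if op == "M":
            profile.append((y, y + 1))
            y += 1
        elif op == "D":
            profile.append((y, y))
        elif op == "I":
            y += 1  # vertical move: contributes no width, only shifts the height
        else:
            raise ValueError(f"unknown alignment operation: {op!r}")
    return profile, y


def area_distance(ops_a, ops_b):
    """Area enclosed between two alignment paths (assumed definition)."""
    prof_a, len2_a = column_profile(ops_a)
    prof_b, len2_b = column_profile(ops_b)
    if len(prof_a) != len(prof_b) or len2_a != len2_b:
        raise ValueError("alignments must cover the same pair of sequences")
    # Within one unit interval the height gap is linear and cannot change
    # sign, so its integral is the mean of the absolute gaps at both ends.
    return sum(
        (abs(a0 - b0) + abs(a1 - b1)) / 2
        for (a0, a1), (b0, b1) in zip(prof_a, prof_b)
    )


if __name__ == "__main__":
    print(area_distance("MMM", "MMM"))
    print(area_distance("MMM", "IMMD"))
```

Under this assumption, two alignments that differ by a small shift of a gap lie close together, whereas a per-position match criterion would count every shifted column as an error, which is one plausible reading of why an area-based objective tolerates noisy ground-truth alignments.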
Cite
Text
Ding et al. "PALM: Probabilistic Area Loss Minimization for Protein Sequence Alignment." Uncertainty in Artificial Intelligence, 2021.
Markdown
[Ding et al. "PALM: Probabilistic Area Loss Minimization for Protein Sequence Alignment." Uncertainty in Artificial Intelligence, 2021.](https://mlanthology.org/uai/2021/ding2021uai-palm/)
BibTeX
@inproceedings{ding2021uai-palm,
title = {{PALM: Probabilistic Area Loss Minimization for Protein Sequence Alignment}},
author = {Ding, Fan and Jiang, Nan and Ma, Jianzhu and Peng, Jian and Xu, Jinbo and Xue, Yexiang},
booktitle = {Uncertainty in Artificial Intelligence},
year = {2021},
pages = {1100-1109},
volume = {161},
url = {https://mlanthology.org/uai/2021/ding2021uai-palm/}
}