TAP: The Attention Patch for Cross-Modal Knowledge Transfer from Unlabeled Modality
Abstract
This paper considers a cross-modal learning framework in which the objective is to enhance the performance of supervised learning in the primary modality using an unlabeled, unpaired secondary modality. Taking a probabilistic approach to missing-information estimation, we show that the extra information contained in the secondary modality can be estimated via Nadaraya-Watson (NW) kernel regression, which can further be expressed as a kernelized cross-attention module (under a linear transformation). This expression lays the foundation for introducing The Attention Patch (TAP), a simple neural network add-on that can be trained to allow data-level knowledge transfer from the unlabeled modality. We provide extensive numerical simulations on real-world datasets to show that TAP yields statistically significant improvements in generalization across different domains and different neural network architectures, making use of seemingly unusable unlabeled cross-modal data.
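The abstract's central identity, that NW kernel regression over the secondary modality can be written as a kernelized cross-attention module, can be illustrated in a few lines. The sketch below is an assumption-laden illustration, not the authors' implementation: the Gaussian kernel choice, the bandwidth parameter, and the linear maps `W_q`, `W_k`, `W_v` are hypothetical stand-ins for whatever parameterization the paper uses.

```python
import torch

def nw_cross_attention(x, z, W_q, W_k, W_v, bandwidth=1.0):
    """Nadaraya-Watson kernel regression written as kernelized cross-attention.

    x: (n, d_x) primary-modality inputs (queries)
    z: (m, d_z) unlabeled secondary-modality samples (keys/values)
    W_q, W_k, W_v: hypothetical linear maps into a shared space
    """
    q = x @ W_q  # (n, d) queries
    k = z @ W_k  # (m, d) keys
    v = z @ W_v  # (m, d_v) values
    # Gaussian-kernel NW weights: softmax of the negative scaled squared
    # distances equals K(q_i, k_j) / sum_j K(q_i, k_j), i.e. the NW normalization.
    d2 = torch.cdist(q, k).pow(2)                       # (n, m) squared distances
    w = torch.softmax(-d2 / (2 * bandwidth**2), dim=-1)
    return w @ v  # (n, d_v) estimated extra information per primary sample

# Toy usage with random, unpaired modalities of different dimensions.
x = torch.randn(8, 16)
z = torch.randn(32, 10)
out = nw_cross_attention(x, z,
                         torch.randn(16, 4), torch.randn(10, 4), torch.randn(10, 8))
print(out.shape)  # torch.Size([8, 8])
```

Note that with a Gaussian kernel the softmax over negative squared distances reproduces the NW estimator exactly, which is what lets such a module be dropped into a network as a trainable attention-style patch.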
Cite
Text
Wang and Shahrampour. "TAP: The Attention Patch for Cross-Modal Knowledge Transfer from Unlabeled Modality." Transactions on Machine Learning Research, 2024.
Markdown
[Wang and Shahrampour. "TAP: The Attention Patch for Cross-Modal Knowledge Transfer from Unlabeled Modality." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/wang2024tmlr-tap/)
BibTeX
@article{wang2024tmlr-tap,
title = {{TAP: The Attention Patch for Cross-Modal Knowledge Transfer from Unlabeled Modality}},
author = {Wang, Yinsong and Shahrampour, Shahin},
journal = {Transactions on Machine Learning Research},
year = {2024},
url = {https://mlanthology.org/tmlr/2024/wang2024tmlr-tap/}
}