TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection

Abstract

We propose TandA, an effective technique for fine-tuning pre-trained Transformer models for natural language tasks. Specifically, we first transfer a pre-trained model into a model for a general task by fine-tuning it with a large and high-quality dataset. We then perform a second fine-tuning step to adapt the transferred model to the target domain. We demonstrate the benefits of our approach for answer sentence selection, which is a well-known inference task in Question Answering. We built a large-scale dataset to enable the transfer step, exploiting the Natural Questions dataset. Our approach establishes the state of the art on two well-known benchmarks, WikiQA and TREC-QA, achieving MAP scores of 92% and 94.3%, respectively, which largely outperform the highest scores of 83.4% and 87.5% from previous work. We empirically show that TandA generates more stable and robust models, reducing the effort required for selecting optimal hyper-parameters. Additionally, we show that the transfer step of TandA makes the adaptation step more robust to noise. This enables a more effective use of noisy datasets for fine-tuning. Finally, we also confirm the positive impact of TandA in an industrial setting, using domain-specific datasets subject to different types of noise.
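The two-step procedure described in the abstract, and the MAP metric used to report results, can be illustrated with a minimal sketch. This is not the authors' implementation: a toy logistic scorer over hand-made feature vectors stands in for a pre-trained Transformer, and every name here (`score`, `fine_tune`, `tanda`, `average_precision`, `mean_average_precision`), the learning rate, and the epoch count are assumptions chosen for illustration. Skipping questions with no correct candidate when averaging is also an assumption of this sketch.

```python
import math

def score(weights, features):
    """Toy stand-in for a Transformer: logistic score of a (question, sentence) pair."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune(weights, examples, lr=0.5, epochs=50):
    """Plain SGD on log loss; returns new weights and leaves the input list untouched."""
    w = list(weights)
    for _ in range(epochs):
        for features, label in examples:
            error = score(w, features) - label
            w = [wi - lr * error * xi for wi, xi in zip(w, features)]
    return w

def tanda(pretrained, general_data, target_data):
    """Sequential fine-tuning: transfer on general data, then adapt on target data."""
    transferred = fine_tune(pretrained, general_data)  # step 1: transfer
    adapted = fine_tune(transferred, target_data)      # step 2: adapt, starting from step 1
    return transferred, adapted

def average_precision(ranked_labels):
    """AP for one question; labels are 1/0 in ranked order. None if nothing is correct."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else None

def mean_average_precision(rankings):
    """MAP over questions, skipping those without a correct candidate (assumption)."""
    aps = [ap for ap in map(average_precision, rankings) if ap is not None]
    return sum(aps) / len(aps)

# Hypothetical toy data: (feature vector, label) pairs.
pretrained = [0.0, 0.0]
general_data = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
target_data = [([1.0, 1.0], 1)]
transferred, adapted = tanda(pretrained, general_data, target_data)
```

The point of the sketch is the ordering: the adaptation step starts from the transferred weights rather than from the pre-trained ones. In a realistic setting each `fine_tune` call would be a full fine-tuning run of a Transformer on the respective dataset.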

Cite

Text

Garg et al. "TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I05.6282

Markdown

[Garg et al. "TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/garg2020aaai-tanda/) doi:10.1609/AAAI.V34I05.6282

BibTeX

@inproceedings{garg2020aaai-tanda,
  title     = {{TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection}},
  author    = {Garg, Siddhant and Vu, Thuy and Moschitti, Alessandro},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2020},
  pages     = {7780-7788},
  doi       = {10.1609/AAAI.V34I05.6282},
  url       = {https://mlanthology.org/aaai/2020/garg2020aaai-tanda/}
}