KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer

Abstract

We introduce KAT5, knowledge-aware transfer learning with a text-to-text transfer transformer, which builds on the text-to-text transfer transformer (T5) in the Wikipedia domain. In standard transfer learning such as T5, a model is first pre-trained on an unsupervised task with a language-model objective and then fine-tuned on a downstream task. T5 explores several learning objectives, including masked language modeling (MLM), random span, and deshuffling, but these objectives give the model only limited opportunity to integrate knowledge during pre-training. Here, we push the limits of this model by grafting in knowledge, such as entity and co-reference information, obtained by mapping Wikipedia to Wikidata during pre-training. We build large-scale alignments between Wikipedia abstracts and Wikidata triples to support pre-training of the KAT5 model. Our approach can match or outperform task-specific models while using the same architecture and hyper-parameters, in particular on entity and relation extraction (the CoNLL04, ADE, and NYT datasets) and on language generation tasks, including abstractive summarization (XSum, CNNDM) and machine translation. Our code is publicly released on GitHub (https://github.com/aistairc/kat5) under the Apache 2.0 License.
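
To make the idea of knowledge-aware masking concrete, the following is a minimal sketch of how entity mentions could be turned into a T5-style denoising example, where each masked span is replaced by a sentinel token of the form `<extra_id_N>`. It is an illustration under stated assumptions, not the authors' implementation: the function name `mask_entity_spans`, whitespace tokenization, and the assumption that entity spans arrive as token offsets (for example, from a Wikipedia-to-Wikidata alignment) are all hypothetical.

```python
from typing import List, Tuple


def mask_entity_spans(tokens: List[str], spans: List[Tuple[int, int]]) -> Tuple[str, str]:
    """Build a T5-style (source, target) pair by masking the given spans.

    Hypothetical helper for illustration only. `spans` holds (start, end)
    token offsets, end exclusive, and the spans must not overlap.
    """
    source: List[str] = []
    target: List[str] = []
    cursor = 0
    for sentinel_id, (start, end) in enumerate(sorted(spans)):
        if start < cursor or start >= end or end > len(tokens):
            raise ValueError(f"invalid or overlapping span: {(start, end)}")
        sentinel = f"<extra_id_{sentinel_id}>"
        # Copy the unmasked text before the span, then drop in the sentinel.
        source.extend(tokens[cursor:start])
        source.append(sentinel)
        # The target lists each sentinel followed by the text it replaced.
        target.append(sentinel)
        target.extend(tokens[start:end])
        cursor = end
    source.extend(tokens[cursor:])
    # A final sentinel closes the target sequence, as in T5 span corruption.
    target.append(f"<extra_id_{len(spans)}>")
    return " ".join(source), " ".join(target)


# Assumed input: a sentence whose entity mentions are already located.
example_tokens = "Tokyo is the capital of Japan".split()
example_source, example_target = mask_entity_spans(example_tokens, [(0, 1), (5, 6)])
print(example_source)  # <extra_id_0> is the capital of <extra_id_1>
print(example_target)  # <extra_id_0> Tokyo <extra_id_1> Japan <extra_id_2>
```

The only difference from random span corruption in this sketch is where the spans come from: they are chosen from entity mentions rather than sampled at random, so the model has to reconstruct knowledge-bearing tokens.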

Cite

Text

Sohrab and Miwa. "KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2024. doi:10.1007/978-3-031-70378-2_10

Markdown

[Sohrab and Miwa. "KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2024.](https://mlanthology.org/ecmlpkdd/2024/sohrab2024ecmlpkdd-kat5/) doi:10.1007/978-3-031-70378-2_10

BibTeX

@inproceedings{sohrab2024ecmlpkdd-kat5,
  title     = {{KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer}},
  author    = {Sohrab, Mohammad Golam and Miwa, Makoto},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2024},
  pages     = {157--173},
  doi       = {10.1007/978-3-031-70378-2_10},
  url       = {https://mlanthology.org/ecmlpkdd/2024/sohrab2024ecmlpkdd-kat5/}
}