KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer
Abstract
We introduce knowledge-aware transfer learning with a text-to-text transfer transformer (KAT5), which builds on the text-to-text transfer transformer (T5) in the Wikipedia domain. In standard transfer learning, as in T5, a model is first pre-trained on unlabeled data with a language-modeling objective and then fine-tuned on a downstream task. T5 explores several pre-training objectives, including masked language modeling (MLM), random span, and deshuffling, but gives the model little opportunity to integrate knowledge during pre-training. Here, we push the limits of this model by grafting in knowledge, such as entity and co-reference information, obtained by mapping Wikipedia and Wikidata during pre-training. We construct large-scale alignments between Wikipedia abstracts and Wikidata triples to support pre-training of the KAT5 model. Our approach can match or outperform task-specific models while using the same architecture and hyper-parameters, in particular on entity and relation extraction (CoNLL04, ADE, and NYT datasets) and on language generation tasks, including abstractive summarization (XSum, CNNDM) and machine translation. Our code is publicly released on GitHub (https://github.com/aistairc/kat5) under the Apache 2.0 License.
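To make the idea of knowledge-aware span masking concrete, the following is a minimal sketch of how entity mentions could be turned into a T5-style text-to-text pre-training pair. It is an illustration under stated assumptions, not the authors' implementation: the function name `mask_entity_spans`, the token-level span format, and the example annotations are hypothetical, and the actual KAT5 objective may differ. The sentinel naming (`<extra_id_0>`, `<extra_id_1>`, ...) follows the convention of the Hugging Face T5 tokenizer.

```python
from typing import List, Tuple


def mask_entity_spans(
    tokens: List[str], spans: List[Tuple[int, int]]
) -> Tuple[str, str]:
    """Build a T5-style (input, target) pair by masking given spans.

    `spans` holds half-open [start, end) token ranges. Here they are
    assumed (hypothetically) to mark entity mentions obtained from a
    Wikipedia/Wikidata alignment; they must not overlap.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for k, (start, end) in enumerate(sorted(spans)):
        if start < cursor or start >= end or end > len(tokens):
            raise ValueError(f"invalid or overlapping span: {(start, end)}")
        sentinel = f"<extra_id_{k}>"
        # Copy the unmasked text, then replace the span with one sentinel.
        inputs.extend(tokens[cursor:start])
        inputs.append(sentinel)
        # The target lists each sentinel followed by the tokens it hides.
        targets.append(sentinel)
        targets.extend(tokens[start:end])
        cursor = end
    inputs.extend(tokens[cursor:])
    # T5 targets end with one extra sentinel that closes the sequence.
    targets.append(f"<extra_id_{len(spans)}>")
    return " ".join(inputs), " ".join(targets)


example_tokens = "Tokyo is the capital of Japan".split()
example_spans = [(0, 1), (5, 6)]  # hypothetical entity annotations
source, target = mask_entity_spans(example_tokens, example_spans)
print(source)
print(target)
```

The only difference from random span masking in this sketch is where the spans come from: they are supplied by external annotations rather than sampled at random, which is one plausible way to expose entity information to the model during pre-training.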
Cite
Text
Sohrab and Miwa. "KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2024. doi:10.1007/978-3-031-70378-2_10
Markdown
[Sohrab and Miwa. "KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2024.](https://mlanthology.org/ecmlpkdd/2024/sohrab2024ecmlpkdd-kat5/) doi:10.1007/978-3-031-70378-2_10
BibTeX
@inproceedings{sohrab2024ecmlpkdd-kat5,
title = {{KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer}},
author = {Sohrab, Mohammad Golam and Miwa, Makoto},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2024},
pages = {157-173},
doi = {10.1007/978-3-031-70378-2_10},
url = {https://mlanthology.org/ecmlpkdd/2024/sohrab2024ecmlpkdd-kat5/}
}