Hierarchical Text Classification as Sub-Hierarchy Sequence Generation
Abstract
Hierarchical text classification (HTC) is essential for many real-world applications. However, HTC models are challenging to develop because they often must process a large volume of documents and labels organized in a hierarchical taxonomy. Recent deep-learning HTC models have attempted to incorporate hierarchy information into the model structure. As a result, the number of model parameters grows with the hierarchy size, which makes these models difficult to apply to large-scale hierarchies. To solve this problem, we formulate HTC as sub-hierarchy sequence generation, incorporating hierarchy information into the target label sequence instead of the model structure. We then propose the Hierarchy DECoder (HiDEC), which decodes a text sequence into a sub-hierarchy sequence using recursive hierarchy decoding, classifying all parents at the same level into children at once. In addition, HiDEC is trained to use hierarchical path information from the root to each leaf in a sub-hierarchy composed of the labels of a target document via an attention mechanism and hierarchy-aware masking. HiDEC achieved state-of-the-art performance with significantly fewer model parameters than existing models on benchmark datasets such as RCV1-v2, NYT, and EURLEX57K.
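To make the idea of a sub-hierarchy concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy taxonomy stored as a child-to-parent map and shows how the labels of one document, together with their ancestors, form a sub-hierarchy that can be listed level by level as (parent, children) pairs. The taxonomy, the function names, and the use of an empty child list to mark where expansion stops are all illustrative assumptions; the abstract does not specify the actual target sequence format or special tokens.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: child -> parent ("root" is the top of the hierarchy).
PARENT = {
    "sports": "root",
    "politics": "root",
    "football": "sports",
    "tennis": "sports",
    "elections": "politics",
}


def sub_hierarchy(labels, parent=PARENT):
    """Return {parent: sorted children}, restricted to the labels and their ancestors."""
    children = defaultdict(set)
    for label in labels:
        node = label
        # Walk from the label up to the root, recording each parent-child edge.
        while node != "root":
            children[parent[node]].add(node)
            node = parent[node]
    return {p: sorted(c) for p, c in children.items()}


def level_wise_targets(labels, parent=PARENT):
    """List, level by level, the (parent, children) pairs to be predicted.

    An empty child list marks a node where expansion stops (an assumption
    made for this sketch).
    """
    tree = sub_hierarchy(labels, parent)
    levels, frontier = [], ["root"]
    while frontier:
        # All parents at the current level are expanded into children together.
        step = [(p, tree.get(p, [])) for p in frontier]
        levels.append(step)
        frontier = [child for _, kids in step for child in kids]
    return levels


# A document labeled with two leaves from different branches.
targets = level_wise_targets({"football", "elections"})
for depth, step in enumerate(targets):
    print(depth, step)
```

In this sketch the size of the target depends only on the labels of the document at hand, not on the size of the full taxonomy, which mirrors the motivation stated in the abstract.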
Cite
Text
Im et al. "Hierarchical Text Classification as Sub-Hierarchy Sequence Generation." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I11.26520
Markdown
[Im et al. "Hierarchical Text Classification as Sub-Hierarchy Sequence Generation." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/im2023aaai-hierarchical/) doi:10.1609/AAAI.V37I11.26520
BibTeX
@inproceedings{im2023aaai-hierarchical,
title = {{Hierarchical Text Classification as Sub-Hierarchy Sequence Generation}},
author = {Im, Sanghun and Kim, Gibaeg and Oh, Heung-Seon and Jo, Seongung and Kim, Donghwan},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2023},
pages = {12933-12941},
doi = {10.1609/AAAI.V37I11.26520},
url = {https://mlanthology.org/aaai/2023/im2023aaai-hierarchical/}
}