A Single Vector Is Not Enough: Taxonomy Expansion via Box Embeddings (Extended Abstract)
Abstract
The rapid evolution of large language models (LLMs) has transformed many fields, including the identification and discovery of human values in text. Traditional NLP models such as BERT have been used for this task, but emerging LLMs such as the GPT family represent textual data considerably better. However, the performance of online LLMs often degrades on the long contexts that value identification requires, and those long contexts also incur substantial computational cost. To address these challenges, we propose EAVIT, an efficient and accurate framework for human value identification that combines the strengths of locally fine-tunable LLMs and online black-box LLMs. The framework uses a value detector, a small local language model, to generate initial value estimates. These estimates are then used to construct concise input prompts for online LLMs, which produce the final value identification. To train the value detector, we introduce explanation-based training and data generation techniques tailored to value identification, along with sampling strategies that keep the LLM input prompts short. Our approach reduces the number of input tokens by up to 1/6 compared to directly querying online LLMs, while consistently outperforming traditional NLP methods and other LLM-based strategies.
Cite
Text
Jiang et al. "A Single Vector Is Not Enough: Taxonomy Expansion via Box Embeddings (Extended Abstract)." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/934
Markdown
[Jiang et al. "A Single Vector Is Not Enough: Taxonomy Expansion via Box Embeddings (Extended Abstract)." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/jiang2024ijcai-single/) doi:10.24963/ijcai.2024/934
BibTeX
@inproceedings{jiang2024ijcai-single,
title = {{A Single Vector Is Not Enough: Taxonomy Expansion via Box Embeddings (Extended Abstract)}},
author = {Jiang, Song and Yao, Qiyue and Wang, Qifan and Sun, Yizhou},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2024},
pages = {8421-8426},
doi = {10.24963/ijcai.2024/934},
url = {https://mlanthology.org/ijcai/2024/jiang2024ijcai-single/}
}