Memorizing Documents with Guidance in Large Language Models

Abstract

Deep multi-view clustering has attracted increasing attention for mining patterns in data. However, most existing methods apply self-learning mechanisms in a single space, ignoring the rich structural information hidden in feature spaces at different levels. They also rely on a reconstruction constraint to learn generalized sample representations, and so fail to exploit the discriminative ability of complementary and consistent information. To address these challenges, a multi-granularity invariant structure clustering scheme (MASTER) is proposed. It defines a bottom-up process that extracts multi-level information at the sample, neighborhood, and category granularities from the low-level, high-level, and semantic feature spaces, respectively. Specifically, it combines self-learning reconstruction with information-theoretic overclustering to capture invariant sample structure in the low-level feature space. It then models data diffusion of the clustering process within a reliable neighborhood to capture invariant local structure in the high-level feature space. Meanwhile, it defines dual divergences induced by the space geometry to capture invariant global structure in the semantic space. Extensive experiments on 8 real-world datasets show that MASTER achieves state-of-the-art performance compared to 11 baselines.

Cite

Text

Park and Choi. "Memorizing Documents with Guidance in Large Language Models." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/714

Markdown

[Park and Choi. "Memorizing Documents with Guidance in Large Language Models." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/park2024ijcai-memorizing/) doi:10.24963/ijcai.2024/714

BibTeX

@inproceedings{park2024ijcai-memorizing,
  title     = {{Memorizing Documents with Guidance in Large Language Models}},
  author    = {Park, Bumjin and Choi, Jaesik},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {6460--6468},
  doi       = {10.24963/ijcai.2024/714},
  url       = {https://mlanthology.org/ijcai/2024/park2024ijcai-memorizing/}
}