When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product

Abstract

State-of-the-art embeddings often capture distinct yet complementary discriminative features: For instance, one image embedding model may excel at distinguishing fine-grained textures, while another focuses on object-level structure. Motivated by this observation, we propose a principled approach to fuse such complementary representations through *kernel multiplication*. Multiplying the kernel similarity functions of two embeddings allows their discriminative structures to interact, producing a fused representation whose kernel encodes the union of the clusters identified by each parent embedding. This formulation also provides a natural way to construct *joint kernels* for paired multi-modal data (e.g., image–text tuples), where the product of modality-specific kernels inherits structure from both domains. We highlight that this kernel product is mathematically realized via the *Kronecker product* of the embedding feature maps, yielding our proposed *KrossFuse* framework for embedding fusion. To address the computational cost of the resulting high-dimensional Kronecker space, we further develop *RP-KrossFuse*, a scalable variant that leverages random projections for efficient approximation. As a key application, we use this framework to bridge the performance gap between cross-modal embeddings (e.g., CLIP, BLIP) and unimodal experts (e.g., DINOv2, E5). Experiments show that RP-KrossFuse effectively integrates these models, enhancing modality-specific performance while preserving cross-modal alignment. The project code is available at https://github.com/yokiwuuu/KrossFuse.
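The identity behind the abstract's central claim can be checked numerically: for linear (inner-product) kernels, the elementwise product of two kernel matrices equals the kernel matrix of the row-wise Kronecker product of the two embeddings. The NumPy sketch below is illustrative only and is not taken from the KrossFuse repository. The embedding matrices `A` and `B` are random stand-ins for two models' outputs, and the names `unit_rows`, `fused`, and `R` are hypothetical. The final step uses a dense Gaussian random projection of the explicit Kronecker features as a simple stand-in for the idea of approximating the fused kernel in fewer dimensions; the construction actually used in RP-KrossFuse may differ.

```python
import numpy as np

rng = np.random.default_rng(0)


def unit_rows(a):
    """Scale each row to unit L2 norm so inner products are cosine similarities."""
    return a / np.linalg.norm(a, axis=1, keepdims=True)


# Random stand-ins for two embedding models applied to the same n samples.
n, d1, d2 = 5, 64, 48
A = unit_rows(rng.standard_normal((n, d1)))
B = unit_rows(rng.standard_normal((n, d2)))

# Product of the two linear kernels (elementwise product of Gram matrices).
K_product = (A @ A.T) * (B @ B.T)

# Row-wise Kronecker product: sample i is mapped to kron(A[i], B[i]).
fused = np.einsum("ni,nj->nij", A, B).reshape(n, d1 * d2)
K_fused = fused @ fused.T

# The two Gram matrices agree up to floating-point error.
kernels_match = np.allclose(K_product, K_fused)

# Assumed stand-in for a random-projection approximation: project the
# (d1 * d2)-dimensional fused features down to m dimensions with a Gaussian map.
m = 512
R = rng.standard_normal((d1 * d2, m)) / np.sqrt(m)
fused_rp = fused @ R
K_rp = fused_rp @ fused_rp.T

# Largest absolute deviation between the projected and exact fused kernels.
max_rp_error = np.abs(K_rp - K_fused).max()

print("product kernel equals Kronecker kernel:", kernels_match)
print("max abs error after random projection:", max_rp_error)
```

Forming `fused` explicitly costs `d1 * d2` numbers per sample, which is the dimensionality cost that motivates a projection-based variant. The projected kernel only approximates the exact one, with an error that shrinks as `m` grows.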

Cite

Text

Wu et al. "When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product." Advances in Neural Information Processing Systems, 2025.

Markdown

[Wu et al. "When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/wu2025neurips-kernels/)

BibTeX

@inproceedings{wu2025neurips-kernels,
  title     = {{When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product}},
  author    = {Wu, Youqi and Zhang, Jingwei and Farnia, Farzan},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/wu2025neurips-kernels/}
}