Cycle Conditioning for Robust Representation Learning from Categorical Data
Abstract
This paper introduces a novel diffusion-based method for learning representations from categorical data. Conditional diffusion models have demonstrated their potential to extract meaningful representations from input samples, but they often struggle to yield versatile, general-purpose information, which limits their adaptability to unforeseen tasks. To address this, we propose a cycle conditioning approach for diffusion models, designed to capture expressive information from conditioning samples. Cycle conditioning alone can be insufficient, however: diffusion models may ignore conditioning samples that vary across training iterations, an issue that arises within cycle conditioning. To counter this limitation, we introduce additional "spelling" information to guide the conditioning process, ensuring that the conditioning sample remains influential during denoising. While this supervision enhances the generalizability of the extracted representations, it is constrained by the sparse nature of spelling information in categorical data, which leads to sparse latent conditions. This sparsity makes the extracted representations less robust, whether they are used for downstream tasks or as guidance in the diffusion process. To overcome this challenge, we propose a linear navigation strategy within the latent space of conditioning samples, allowing dense representations to be extracted even with sparse supervision. Our experiments demonstrate that our method achieves at least a 1.42% improvement in AUROC and a 4.12% improvement in AUCPR over the best results from existing state-of-the-art methods.
Cite
Text
Tabejamaat et al. "Cycle Conditioning for Robust Representation Learning from Categorical Data." Transactions on Machine Learning Research, 2025.
Markdown
[Tabejamaat et al. "Cycle Conditioning for Robust Representation Learning from Categorical Data." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/tabejamaat2025tmlr-cycle/)
BibTeX
@article{tabejamaat2025tmlr-cycle,
title = {{Cycle Conditioning for Robust Representation Learning from Categorical Data}},
author = {Tabejamaat, Mohsen and Etminani, Farzaneh and Ohlsson, Mattias},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/tabejamaat2025tmlr-cycle/}
}