Offline Reinforcement Learning with Behavioral Supervisor Tuning

Abstract

In Partial Multi-Label Learning (PML), each instance is associated with a candidate label set that contains multiple relevant labels along with false positive labels. Currently, most PML methods extract instance correlation directly from instance features while ignoring the candidate labels, which may contain more discriminative instance-related information. This paper argues that, with a well-designed model, more accurate instance correlation can be mined from the candidate labels to facilitate label disambiguation. To this end, we propose a novel PML method based on pseudo-label reconstruction (PML-PLR). Specifically, we first propose a novel orthogonal candidate label reconstruction method, which is jointly optimized with instance features to extract more consistent instance correlation. Then, we use the instance correlation as reconstruction coefficients to reconstruct pseudo-labels. Subsequently, through local manifold learning, the reconstructed pseudo-labels are leveraged to propagate the consistency relationship between labels and instances, thereby improving the accuracy of the pseudo-labels. Extensive experiments and analyses demonstrate that the proposed PML-PLR outperforms state-of-the-art methods.

Cite

Text

Srinivasan and Knottenbelt. "Offline Reinforcement Learning with Behavioral Supervisor Tuning." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/545

Markdown

[Srinivasan and Knottenbelt. "Offline Reinforcement Learning with Behavioral Supervisor Tuning." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/srinivasan2024ijcai-offline/) doi:10.24963/ijcai.2024/545

BibTeX

@inproceedings{srinivasan2024ijcai-offline,
  title     = {{Offline Reinforcement Learning with Behavioral Supervisor Tuning}},
  author    = {Srinivasan, Padmanaba and Knottenbelt, William J.},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {4929--4937},
  doi       = {10.24963/ijcai.2024/545},
  url       = {https://mlanthology.org/ijcai/2024/srinivasan2024ijcai-offline/}
}