TrojanTO: Action-Level Backdoor Attacks Against Trajectory Optimization Models

Dai, Yang; Ma, Oubo; Liang, Xingxing; Zhang, Longfei; Cao, Xiaochun; Ji, Shouling; Zhang, Jiaheng; Huang, Jincai; Shen, Li

ICLR 2026

Abstract

Trajectory Optimization (TO) models have achieved remarkable success in offline reinforcement learning (offline RL). However, their vulnerability to backdoor attacks remains largely unexplored. We find that existing backdoor attacks in RL, which typically rely on reward manipulation throughout training, are largely ineffective against TO models because of their inherent sequence-modeling nature and large network size. Moreover, high-dimensional continuous action spaces further compound the challenge of injecting effective backdoors. To address these gaps, we propose TrojanTO, the first action-level backdoor attack against TO models. TrojanTO is a post-training attack that employs alternating training to forge a strong connection between triggers and target actions, ensuring high attack effectiveness. To maintain attack stealthiness, it uses trajectory filtering to preserve benign performance and batch poisoning to keep triggers consistent. Extensive evaluations demonstrate that TrojanTO effectively implants backdoors across diverse tasks and attack objectives with a low attack budget (0.3% of trajectories). Furthermore, TrojanTO applies to DT, GDT, and DC, underscoring its scalability across diverse TO model architectures.
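To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify the filtering criterion, the trigger format, or the training loop, so everything here is an assumption made for illustration: the function names `poison_budget`, `filter_trajectories`, and `poison_step` are hypothetical, filtering is assumed to keep the highest-return trajectories, and a trigger is assumed to be a fixed pattern written onto chosen state dimensions.

```python
def poison_budget(n_trajectories, rate=0.003):
    """Number of trajectories covered by a poisoning rate (0.3% by default).

    Always returns at least 1 so that a tiny dataset still yields a sample.
    """
    return max(1, round(n_trajectories * rate))


def filter_trajectories(returns, budget):
    """ASSUMPTION: pick the `budget` highest-return trajectories.

    The abstract only says trajectory filtering preserves benign
    performance; the real selection criterion may differ.
    """
    ranked = sorted(range(len(returns)), key=lambda i: returns[i], reverse=True)
    return sorted(ranked[:budget])


def poison_step(state, action, trigger, target_action):
    """ASSUMPTION: a trigger is a dict {state_dimension: value}.

    Returns a copy of the state with the trigger written in, paired with
    the attacker's target action in place of the original action.
    """
    new_state = list(state)
    for dim, value in trigger.items():
        new_state[dim] = value
    return new_state, list(target_action)


if __name__ == "__main__":
    # A dataset of 1000 trajectories at a 0.3% budget.
    print(poison_budget(1000))
    # Toy returns for four trajectories, selecting two of them.
    print(filter_trajectories([10.0, 50.0, 30.0, 70.0], 2))
    # One (state, action) pair with a trigger on state dimension 0.
    print(poison_step([0.1, 0.2, 0.3], [0.5, -0.5], {0: 1.0}, [1.0, 1.0]))
```

Under these assumptions, a 0.3% budget on a 1000-trajectory dataset touches only three trajectories, which illustrates why the abstract calls the budget low. The alternating training stage is deliberately omitted, since the abstract gives no detail on it.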

Cite

Text

Dai et al. "TrojanTO: Action-Level Backdoor Attacks Against Trajectory Optimization Models." International Conference on Learning Representations, 2026.

Markdown

[Dai et al. "TrojanTO: Action-Level Backdoor Attacks Against Trajectory Optimization Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/dai2026iclr-trojanto/)

BibTeX

@inproceedings{dai2026iclr-trojanto,
  title     = {{TrojanTO: Action-Level Backdoor Attacks Against Trajectory Optimization Models}},
  author    = {Dai, Yang and Ma, Oubo and Liang, Xingxing and Zhang, Longfei and Cao, Xiaochun and Ji, Shouling and Zhang, Jiaheng and Huang, Jincai and Shen, Li},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/dai2026iclr-trojanto/}
}