Enhancing Visual Domain Robustness in Behaviour Cloning via Saliency-Guided Augmentation

Abstract

In vision-based behaviour cloning (BC), traditional image-level augmentation methods such as pixel shifting enhance in-domain performance but often struggle with visual domain shifts, including distractors, occlusion, and changes in lighting and backgrounds. Conversely, superimposition-based augmentation, proven effective in computer vision, improves model generalisability by blending training images and out-of-domain images. Despite its potential, the applicability of this approach to vision-based BC remains unclear due to the unique challenges posed by BC demonstrations; specifically, preserving task-critical scene semantics, spatial-temporal relationships, and agent-target interactions is crucial. To address this, we introduce RoboSaGA, a context-aware approach that dynamically adjusts augmentation intensity per pixel based on input saliency derived from the policy. This method ensures aggressive augmentation within task-trivial areas without compromising task-critical information. Furthermore, RoboSaGA integrates into existing network architectures without the need for structural changes or additional learning objectives. Our empirical evaluations across both simulated and real-world settings demonstrate that RoboSaGA not only maintains in-domain performance but also significantly improves resilience to distractors and background variations.
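The core idea in the abstract, blending an out-of-domain image into a training image with a per-pixel strength that drops where saliency is high, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `saliency_guided_blend`, the `max_alpha` cap, the min-max normalisation, and the linear weighting rule are all assumptions made for illustration, and the saliency map is taken as a precomputed input rather than derived from a policy.

```python
import numpy as np


def saliency_guided_blend(image, distractor, saliency, max_alpha=0.5):
    """Blend `distractor` into `image`, blending less where saliency is high.

    Hypothetical helper for illustration only (not the RoboSaGA code).
    image, distractor: float arrays of shape (H, W, C), values in [0, 1].
    saliency: non-negative float array of shape (H, W), assumed precomputed.
    max_alpha: assumed cap on the blend weight in zero-saliency regions.
    """
    s_min, s_max = saliency.min(), saliency.max()
    # Normalise saliency to [0, 1]; a constant map is treated as zero saliency.
    if s_max > s_min:
        s = (saliency - s_min) / (s_max - s_min)
    else:
        s = np.zeros_like(saliency)
    # Per-pixel blend weight: max_alpha where saliency is 0, 0 where it is 1.
    weight = (max_alpha * (1.0 - s))[..., None]
    return (1.0 - weight) * image + weight * distractor


# Usage: a black image, a white distractor, one fully salient pixel.
image = np.zeros((4, 4, 3))
distractor = np.ones((4, 4, 3))
saliency = np.zeros((4, 4))
saliency[0, 0] = 1.0
augmented = saliency_guided_blend(image, distractor, saliency)
```

In this toy case the salient pixel at `[0, 0]` is left untouched, while every other pixel is mixed with the distractor at the full `max_alpha` weight.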

Cite

Text

Zhuang et al. "Enhancing Visual Domain Robustness in Behaviour Cloning via Saliency-Guided Augmentation." Proceedings of The 8th Conference on Robot Learning, 2024.

Markdown

[Zhuang et al. "Enhancing Visual Domain Robustness in Behaviour Cloning via Saliency-Guided Augmentation." Proceedings of The 8th Conference on Robot Learning, 2024.](https://mlanthology.org/corl/2024/zhuang2024corl-enhancing/)

BibTeX

@inproceedings{zhuang2024corl-enhancing,
  title     = {{Enhancing Visual Domain Robustness in Behaviour Cloning via Saliency-Guided Augmentation}},
  author    = {Zhuang, Zheyu and Wang, Ruiyu and Ingelhag, Nils and Kyrki, Ville and Kragic, Danica},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  year      = {2024},
  pages     = {4314--4331},
  volume    = {270},
  url       = {https://mlanthology.org/corl/2024/zhuang2024corl-enhancing/}
}