Enhancing Visual Domain Robustness in Behaviour Cloning via Saliency-Guided Augmentation
Abstract
In vision-based behaviour cloning (BC), traditional image-level augmentation methods such as pixel shifting enhance in-domain performance but often struggle with visual domain shifts, including distractors, occlusion, and changes in lighting and backgrounds. Conversely, superimposition-based augmentation, proven effective in computer vision, improves model generalisability by blending training images with out-of-domain images. Despite this potential, the applicability of such methods to vision-based BC remains unclear due to the unique challenges posed by BC demonstrations; specifically, preserving task-critical scene semantics, spatial-temporal relationships, and agent-target interactions is crucial. To address this, we introduce RoboSaGA, a context-aware approach that dynamically adjusts augmentation intensity per pixel based on input saliency derived from the policy. This method ensures aggressive augmentation within task-trivial areas without compromising task-critical information. Furthermore, RoboSaGA integrates into existing network architectures without the need for structural changes or additional learning objectives. Our empirical evaluations across both simulated and real-world settings demonstrate that RoboSaGA not only maintains in-domain performance but also significantly improves resilience to distractors and background variations.
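The idea of per-pixel augmentation intensity can be illustrated with a minimal sketch. The abstract does not give the saliency computation or the blending rule, so everything below is an assumption for illustration only: the saliency map is taken as already computed (for example, from gradients of the policy output with respect to the input image), the blend is a simple linear superimposition, and the names `normalise_saliency`, `saliency_guided_blend`, and `max_strength` are hypothetical rather than taken from RoboSaGA.

```python
import numpy as np


def normalise_saliency(saliency: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Rescale a saliency map to the range [0, 1] (assumed preprocessing step)."""
    s = np.abs(saliency).astype(np.float64)
    s = s - s.min()
    return s / (s.max() + eps)


def saliency_guided_blend(
    image: np.ndarray,
    distractor: np.ndarray,
    saliency: np.ndarray,
    max_strength: float = 0.8,  # hypothetical cap on the blend weight
) -> np.ndarray:
    """Superimpose `distractor` on `image`, more strongly where saliency is low.

    image, distractor: float arrays of shape (H, W, C) with values in [0, 1].
    saliency: array of shape (H, W); high values mark task-critical pixels.
    """
    s = normalise_saliency(saliency)[..., None]  # (H, W, 1), broadcast over channels
    weight = max_strength * (1.0 - s)            # near 0 on salient pixels
    return (1.0 - weight) * image + weight * distractor


# Toy example: a 4x4 image whose central 2x2 patch is marked as task-critical.
rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))
distractor = rng.random((4, 4, 3))  # stands in for an out-of-domain image
saliency = np.zeros((4, 4))
saliency[1:3, 1:3] = 1.0

augmented = saliency_guided_blend(image, distractor, saliency)
```

In this toy case the central patch is left essentially unchanged, while the remaining pixels are blended with the out-of-domain image at the maximum strength. In an actual training loop the saliency map would presumably be recomputed from the current policy, so that the protected regions track what the policy attends to.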
Cite
Text
Zhuang et al. "Enhancing Visual Domain Robustness in Behaviour Cloning via Saliency-Guided Augmentation." Proceedings of The 8th Conference on Robot Learning, 2024.
Markdown
[Zhuang et al. "Enhancing Visual Domain Robustness in Behaviour Cloning via Saliency-Guided Augmentation." Proceedings of The 8th Conference on Robot Learning, 2024.](https://mlanthology.org/corl/2024/zhuang2024corl-enhancing/)
BibTeX
@inproceedings{zhuang2024corl-enhancing,
title = {{Enhancing Visual Domain Robustness in Behaviour Cloning via Saliency-Guided Augmentation}},
author = {Zhuang, Zheyu and Wang, Ruiyu and Ingelhag, Nils and Kyrki, Ville and Kragic, Danica},
booktitle = {Proceedings of The 8th Conference on Robot Learning},
year = {2024},
pages = {4314-4331},
volume = {270},
url = {https://mlanthology.org/corl/2024/zhuang2024corl-enhancing/}
}