CAENet: Efficient Multi-Task Learning for Joint Semantic Segmentation and Depth Estimation

Abstract

We propose the Context-aware Attentive Enrichment Network (CAENet), an efficient multi-task method for real-time joint semantic segmentation and depth estimation. Building on a lightweight encoder backbone, we devise an efficient decoder that fully leverages the information available in multi-scale encoder features. In particular, a new Inception Residual Pooling (IRP) module efficiently extracts contextual information from the high-level features using diverse receptive fields to improve semantic understanding. The context-aware features are then adaptively enriched with spatial details from low-level features via a Light-weight Attentive Fusion (LAF) module that uses a pseudo-stereoscopic attention mechanism. These two modules are applied recursively to progressively generate high-resolution shared features, which task-specific heads then process to produce the final outputs. This network design effectively captures information beneficial to both semantic segmentation and depth estimation while substantially reducing the computational budget. Extensive experiments across multi-task benchmarks validate that CAENet achieves state-of-the-art performance at an inference speed comparable to that of competing real-time methods. Code is available at https://github.com/wlx-zju/CAENet .
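
The decoder data flow described in the abstract can be traced at the level of tensor shapes. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the internals of IRP or LAF, so both are reduced to shape-only stand-ins, and the names `FeatureMap`, `irp`, `laf`, `decode`, `heads`, the channel width of 64, the four encoder stages, and the 40-class setting are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FeatureMap:
    """Shape-only stand-in for a feature tensor: (channels, height, width)."""
    channels: int
    height: int
    width: int


def irp(high: FeatureMap) -> FeatureMap:
    # Stand-in for Inception Residual Pooling. Assumption: context extraction
    # keeps the channel count and spatial size unchanged.
    return high


def laf(context: FeatureMap, low: FeatureMap, out_channels: int) -> FeatureMap:
    # Stand-in for Light-weight Attentive Fusion. Assumption: the context
    # features are brought to the low-level resolution before fusion, so the
    # result has the spatial size of `low`.
    return FeatureMap(out_channels, low.height, low.width)


def decode(encoder_feats: List[FeatureMap], out_channels: int = 64) -> FeatureMap:
    """Apply IRP then LAF repeatedly, from the deepest feature to the shallowest."""
    shared = encoder_feats[-1]
    for low in reversed(encoder_feats[:-1]):
        shared = laf(irp(shared), low, out_channels)
    return shared


def heads(shared: FeatureMap, num_classes: int) -> Dict[str, FeatureMap]:
    # Task-specific heads: per-pixel class scores and a single depth channel.
    return {
        "segmentation": FeatureMap(num_classes, shared.height, shared.width),
        "depth": FeatureMap(1, shared.height, shared.width),
    }


# Hypothetical encoder outputs at strides 4, 8, 16, 32 for a 480x640 input.
encoder_feats = [
    FeatureMap(24, 120, 160),
    FeatureMap(32, 60, 80),
    FeatureMap(96, 30, 40),
    FeatureMap(320, 15, 20),
]
shared = decode(encoder_feats)
outputs = heads(shared, num_classes=40)
print(shared)
print(outputs["depth"])
```

In this trace the shared features end at the resolution of the shallowest encoder stage, and both heads read from that single shared map, which is the sense in which the computation is shared between the two tasks.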

Cite

Text

Wang and Li. "CAENet: Efficient Multi-Task Learning for Joint Semantic Segmentation and Depth Estimation." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023. doi:10.1007/978-3-031-43424-2_25

Markdown

[Wang and Li. "CAENet: Efficient Multi-Task Learning for Joint Semantic Segmentation and Depth Estimation." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023.](https://mlanthology.org/ecmlpkdd/2023/wang2023ecmlpkdd-caenet/) doi:10.1007/978-3-031-43424-2_25

BibTeX

@inproceedings{wang2023ecmlpkdd-caenet,
  title     = {{CAENet: Efficient Multi-Task Learning for Joint Semantic Segmentation and Depth Estimation}},
  author    = {Wang, Luxi and Li, Yingming},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2023},
  pages     = {408--425},
  doi       = {10.1007/978-3-031-43424-2_25},
  url       = {https://mlanthology.org/ecmlpkdd/2023/wang2023ecmlpkdd-caenet/}
}