Multi-Task Learning via Scale Aware Feature Pyramid Networks and Effective Joint Head

Abstract

As a concise and classic framework for object detection and instance segmentation, Mask R-CNN achieves promising performance on both tasks. However, Mask R-CNN-style frameworks leave room for improvement in feature representation in two respects. On the one hand, multi-task prediction needs more reliable feature extraction and multi-scale feature integration to handle objects of varied scales. In this paper, we address this problem with a novel neck module called SA-FPN (Scale Aware Feature Pyramid Networks). With the enhanced feature representations, our model can accurately detect and segment objects at multiple scales. On the other hand, in the Mask R-CNN framework, the parallel detection branch and instance segmentation branch are isolated from each other, causing a gap between the training and testing processes. To narrow this gap, we propose a unified head module named EJ-Head (Effective Joint Head) that combines the two branches into one head, not only realizing interaction between the two tasks but also enhancing the effectiveness of multi-task learning. Comprehensive experiments show that our proposed methods bring noticeable gains for object detection and instance segmentation. In particular, our model outperforms the original Mask R-CNN by 1 to 2 percent AP on both the object detection and instance segmentation tasks on the MS-COCO benchmark. Code will be available soon.

Cite

Text

Ni and Yao. "Multi-Task Learning via Scale Aware Feature Pyramid Networks and Effective Joint Head." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00525

Markdown

[Ni and Yao. "Multi-Task Learning via Scale Aware Feature Pyramid Networks and Effective Joint Head." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/ni2019iccvw-multitask/) doi:10.1109/ICCVW.2019.00525

BibTeX

@inproceedings{ni2019iccvw-multitask,
  title     = {{Multi-Task Learning via Scale Aware Feature Pyramid Networks and Effective Joint Head}},
  author    = {Ni, Feng and Yao, Yuehan},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2019},
  pages     = {4265--4272},
  doi       = {10.1109/ICCVW.2019.00525},
  url       = {https://mlanthology.org/iccvw/2019/ni2019iccvw-multitask/}
}