Deep Free-Form Deformation Network for Object-Mask Registration

Abstract

This paper addresses the problem of object-mask registration, which aligns a shape mask to a target object instance. Prior work typically formulates the problem as an object segmentation task with a mask prior, which is challenging to solve. In this work, we take a transformation-based approach that predicts a 2D non-rigid spatial transform and warps the shape mask onto the target object. In particular, we propose a deep spatial transformer network that learns free-form deformations (FFDs) to non-rigidly warp the shape mask based on a multi-level dual-mask feature pooling strategy. The FFD transforms are based on B-splines and parameterized by the offsets of predefined control points, so the warp is differentiable with respect to those offsets. Therefore, we are able to train the entire network end-to-end with an L2 matching loss. We evaluate our FFD network on a challenging object-mask alignment task, which aims to refine a set of object segment proposals, and our approach achieves state-of-the-art performance on the Cityscapes, PASCAL VOC, and MSCOCO datasets.
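
To make the B-spline parameterization concrete, here is a minimal NumPy sketch of a 2D cubic B-spline FFD that turns control-point offsets into a dense displacement field and uses it to warp a mask. It is an illustration under stated assumptions, not the authors' implementation: the function names, the control-grid layout (`ny + 3` by `nx + 3` points for `ny` by `nx` cells), backward warping with bilinear sampling, and the use of mean squared error as the L2 matching loss are all choices made for this example.

```python
import numpy as np


def bspline_basis(u):
    """Four uniform cubic B-spline weights for fractional positions u in [0, 1)."""
    return np.stack(
        [
            (1 - u) ** 3 / 6.0,
            (3 * u**3 - 6 * u**2 + 4) / 6.0,
            (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0,
            u**3 / 6.0,
        ],
        axis=-1,
    )


def ffd_displacement(offsets, height, width):
    """Dense (height, width, 2) displacement field from control-point offsets.

    offsets has shape (ny + 3, nx + 3, 2) and stores a (dy, dx) offset per
    control point (assumed layout for this sketch).
    """
    ny, nx = offsets.shape[0] - 3, offsets.shape[1] - 3
    ys = np.arange(height) * ny / height  # pixel rows in cell units
    xs = np.arange(width) * nx / width    # pixel columns in cell units
    iy, ix = np.floor(ys).astype(int), np.floor(xs).astype(int)
    by = bspline_basis(ys - iy)  # (height, 4)
    bx = bspline_basis(xs - ix)  # (width, 4)

    disp = np.zeros((height, width, 2))
    # Each pixel is influenced by its 4 x 4 neighbourhood of control points.
    for l in range(4):
        for m in range(4):
            weight = by[:, l][:, None] * bx[:, m][None, :]  # (height, width)
            phi = offsets[iy + l][:, ix + m]                # (height, width, 2)
            disp += weight[..., None] * phi
    return disp


def warp_mask(mask, disp):
    """Backward-warp a 2D mask: output[y, x] = mask[y + dy, x + dx] (bilinear)."""
    h, w = mask.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sy = np.clip(yy + disp[..., 0], 0, h - 1)
    sx = np.clip(xx + disp[..., 1], 0, w - 1)
    y0, x0 = np.floor(sy).astype(int), np.floor(sx).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    fy, fx = sy - y0, sx - x0
    top = mask[y0, x0] * (1 - fx) + mask[y0, x1] * fx
    bottom = mask[y1, x0] * (1 - fx) + mask[y1, x1] * fx
    return top * (1 - fy) + bottom * fy


def l2_matching_loss(warped, target):
    """Mean squared difference between the warped mask and the target mask."""
    return float(np.mean((warped - target) ** 2))
```

Every operation here is a weighted sum of the offsets or of mask values, which is why the same computation, written in an automatic-differentiation framework, can pass gradients from the loss back to the network that predicts the offsets.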

Cite

Text

Zhang and He. "Deep Free-Form Deformation Network for Object-Mask Registration." International Conference on Computer Vision, 2017. doi:10.1109/ICCV.2017.456

Markdown

[Zhang and He. "Deep Free-Form Deformation Network for Object-Mask Registration." International Conference on Computer Vision, 2017.](https://mlanthology.org/iccv/2017/zhang2017iccv-deep/) doi:10.1109/ICCV.2017.456

BibTeX

@inproceedings{zhang2017iccv-deep,
  title     = {{Deep Free-Form Deformation Network for Object-Mask Registration}},
  author    = {Zhang, Haoyang and He, Xuming},
  booktitle = {International Conference on Computer Vision},
  year      = {2017},
  doi       = {10.1109/ICCV.2017.456},
  url       = {https://mlanthology.org/iccv/2017/zhang2017iccv-deep/}
}