Cornerformer: Purifying Instances for Corner-Based Detectors

Abstract

Corner-based object detectors can in principle detect instances of arbitrary size, yet their performance is limited mainly by inaccurate instance construction. Specifically, three factors are responsible: 1) corner keypoints are prone to false positives; 2) incorrect matches arise from the pull-push embeddings of corner keypoints; and 3) the heuristic NMS cannot adjust the corner pull-push mechanism. Accordingly, this paper presents an elegant framework named Cornerformer that is composed of two components. First, we build a Corner Transformer Encoder (CTE, a self-attention module) in 2D form to enhance the information aggregated by corner keypoints, offering stronger features for the pull-push loss to distinguish instances from each other. Second, we design an Attenuation-Auto-Adjusted NMS (A3-NMS) to maximally leverage the semantic outputs and prevent true objects from being removed. Experiments on object detection and human pose estimation show the superior performance of Cornerformer in terms of accuracy and inference speed.
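
For readers unfamiliar with the pull-push mechanism mentioned in the abstract, the following is a minimal sketch of the associative-embedding loss commonly used by corner-based detectors such as CornerNet. It is not taken from the Cornerformer paper: the function name `pull_push_loss`, the scalar embeddings, and the margin of 1.0 are illustrative assumptions. The pull term draws the two corner embeddings of one object toward their mean, and the push term separates the means of different objects by at least the margin.

```python
def pull_push_loss(top_left, bottom_right, margin=1.0):
    """Hypothetical helper: pull-push loss over N objects.

    top_left[k] and bottom_right[k] are the scalar embeddings of the two
    corners of object k. margin=1.0 is an assumed default.
    Returns the pair (pull, push).
    """
    n = len(top_left)
    if n == 0:
        return 0.0, 0.0

    # Mean embedding of each object's two corners.
    centers = [(tl + br) / 2.0 for tl, br in zip(top_left, bottom_right)]

    # Pull: corners of the same object should share one embedding.
    pull = sum(
        (tl - c) ** 2 + (br - c) ** 2
        for tl, br, c in zip(top_left, bottom_right, centers)
    ) / n

    # Push: mean embeddings of different objects should differ by >= margin.
    push = 0.0
    if n > 1:
        for k in range(n):
            for j in range(n):
                if j != k:
                    push += max(0.0, margin - abs(centers[k] - centers[j]))
        push /= n * (n - 1)

    return pull, push
```

In this sketch, two objects whose corner embeddings are already well grouped and well separated, for example `pull_push_loss([0.0, 2.0], [0.0, 2.0])`, incur no loss in either term; stronger corner features, as the abstract argues, make such separation easier to achieve.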

Cite

Text

Wei et al. "Cornerformer: Purifying Instances for Corner-Based Detectors." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-20080-9_2

Markdown

[Wei et al. "Cornerformer: Purifying Instances for Corner-Based Detectors." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/wei2022eccv-cornerformer/) doi:10.1007/978-3-031-20080-9_2

BibTeX

@inproceedings{wei2022eccv-cornerformer,
  title     = {{Cornerformer: Purifying Instances for Corner-Based Detectors}},
  author    = {Wei, Haoran and Chen, Xin and Xie, Lingxi and Tian, Qi},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-20080-9_2},
  url       = {https://mlanthology.org/eccv/2022/wei2022eccv-cornerformer/}
}