Semantic Segmentation with Multi Scale Spatial Attention for Self Driving Cars

Abstract

In this paper, we present a novel neural network that uses multi-scale feature fusion for accurate and efficient semantic image segmentation. We use a ResNet-based feature extractor, dilated convolutional layers in the down-sampling part, atrous convolutional layers in the up-sampling part, and a concat operation to merge them. A new attention module is proposed to encode more contextual information and enlarge the receptive field of the network. We present an in-depth theoretical analysis of our network with training and optimization details. Our network is trained and tested on the CamVid and Cityscapes datasets, using mean accuracy per class and Intersection Over Union (IOU) as the evaluation metrics. Our model outperforms previous state-of-the-art methods on semantic segmentation, achieving a mean IOU of 74.12 while running at more than 100 FPS.
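The abstract's claim that dilated (atrous) convolutions enlarge the receptive field can be illustrated with a short sketch. The function and layer configuration below are hypothetical, not taken from the paper; with stride-1 layers, each convolution adds `(kernel_size - 1) * dilation` input positions to the receptive field, so exponentially growing dilations widen coverage without adding parameters.

```python
# Receptive-field growth of stacked dilated (atrous) convolutions.
# Illustrative sketch only: the layer configurations below are hypothetical
# examples, not the exact architecture from the paper.

def receptive_field(layers):
    """Return the 1-D receptive field of a stack of stride-1 conv layers.

    Each layer is a (kernel_size, dilation) pair; with stride 1, every
    layer adds (kernel_size - 1) * dilation to the receptive field.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf

# Three 3x3 convs without dilation cover 7 input positions...
plain = receptive_field([(3, 1), (3, 1), (3, 1)])
# ...while dilations 1, 2, 4 cover 15 with the same parameter count.
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])
print(plain, dilated)  # 7 15
```

This is why the paper can fuse features at multiple scales cheaply: varying the dilation rate changes the spatial context each branch sees while the convolution kernels stay the same size.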

Cite

Text

Sagar and Soundrapandiyan. "Semantic Segmentation with Multi Scale Spatial Attention for Self Driving Cars." IEEE/CVF International Conference on Computer Vision Workshops, 2021. doi:10.1109/ICCVW54120.2021.00299

Markdown

[Sagar and Soundrapandiyan. "Semantic Segmentation with Multi Scale Spatial Attention for Self Driving Cars." IEEE/CVF International Conference on Computer Vision Workshops, 2021.](https://mlanthology.org/iccvw/2021/sagar2021iccvw-semantic/) doi:10.1109/ICCVW54120.2021.00299

BibTeX

@inproceedings{sagar2021iccvw-semantic,
  title     = {{Semantic Segmentation with Multi Scale Spatial Attention for Self Driving Cars}},
  author    = {Sagar, Abhinav and Soundrapandiyan, Rajkumar},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2021},
  pages     = {2650--2656},
  doi       = {10.1109/ICCVW54120.2021.00299},
  url       = {https://mlanthology.org/iccvw/2021/sagar2021iccvw-semantic/}
}