Multi-Patch Learning: Looking More Pixels in the Training Phase

Abstract

Due to the limited computation capability and memory of GPUs, most image restoration models are trained with cropped patches instead of full-size images. Extensive experiments show that a model trained with a larger patch size can achieve better performance, since a larger patch size typically means a larger receptive field. However, this comes at the cost of extremely long training times and significant memory consumption. To alleviate this dilemma, we propose a multi-patch method that expands the receptive field with negligible memory and computation increase (less than 1%). In addition, we collect 100K high-quality images of 1K categories, following ImageNet, from flickr.com for low-level image tasks. Our method improves the quantitative performance by 0.3412dB on the validation set of the “Compressed Input Super-Resolution Challenge - Image Track”.
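The abstract does not give implementation details, but the core idea of sampling several small patches per image in place of one large crop can be sketched as follows. This is a minimal illustration under assumed details (patch size, patch count, and the function name `sample_multi_patch` are all hypothetical, not taken from the paper):

```python
import numpy as np

def sample_multi_patch(image, patch_size=64, num_patches=4, rng=None):
    """Hypothetical multi-patch sampler: crop several small patches from one
    image so each training step sees pixels from more spatial locations,
    approximating the wider receptive field of a single large crop."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    patches = []
    for _ in range(num_patches):
        # Pick a random top-left corner for each crop.
        y = rng.integers(0, h - patch_size + 1)
        x = rng.integers(0, w - patch_size + 1)
        patches.append(image[y:y + patch_size, x:x + patch_size])
    # Stack along a batch-like axis: four 64x64 patches cost roughly the same
    # memory as one 128x128 patch, but cover more distinct image regions.
    return np.stack(patches)

img = np.zeros((256, 256, 3), dtype=np.float32)
batch = sample_multi_patch(img, patch_size=64, num_patches=4)
print(batch.shape)  # (4, 64, 64, 3)
```

The trade-off this sketch illustrates is the one the abstract describes: the total pixel budget per step stays near-constant, so memory and compute grow by well under 1%, while the sampled pixels span a wider area of each training image.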

Cite

Text

Li et al. "Multi-Patch Learning: Looking More Pixels in the Training Phase." European Conference on Computer Vision Workshops, 2022. doi:10.1007/978-3-031-25063-7_34

Markdown

[Li et al. "Multi-Patch Learning: Looking More Pixels in the Training Phase." European Conference on Computer Vision Workshops, 2022.](https://mlanthology.org/eccvw/2022/li2022eccvw-multipatch/) doi:10.1007/978-3-031-25063-7_34

BibTeX

@inproceedings{li2022eccvw-multipatch,
  title     = {{Multi-Patch Learning: Looking More Pixels in the Training Phase}},
  author    = {Li, Lei and Tang, Jingzhu and Chen, Ming and Zhao, Shijie and Li, Junlin and Zhang, Li},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2022},
  pages     = {549--560},
  doi       = {10.1007/978-3-031-25063-7_34},
  url       = {https://mlanthology.org/eccvw/2022/li2022eccvw-multipatch/}
}