Towards Automated Transcription of Label Text from Pinned Insect Collections

Abstract

We present a computer vision system that transcribes the text on the tiny printed labels stacked beneath pinned insects in museum collections. Because each label is often occluded by the pin, the specimen, and other labels, the system images every label from multiple viewpoints. The approach handles both this occlusion and the extreme viewing angles required to image the stacked labels. Automated image analysis identifies the lines of text, then aligns and rectifies the images. Combining the aligned, rectified images from multiple viewpoints yields a composite image that optical character recognition (OCR) tools can read to extract the text. We demonstrate the system experimentally on both museum specimens and experimental test labels.
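
To illustrate the compositing step, here is a minimal sketch of fusing several views that have already been aligned and rectified. The abstract does not state how the views are combined, so the per-pixel median below is an assumption, not the authors' method. The function name `composite_views` and the use of `None` to mark occluded pixels are likewise hypothetical.

```python
from statistics import median


def composite_views(views):
    """Fuse aligned, rectified grayscale views into one composite image.

    Each view is a list of rows. A pixel is an intensity value, or None
    where the label is occluded in that view (hypothetical convention).
    The per-pixel median is an assumed fusion rule, not the paper's.
    """
    height = len(views[0])
    width = len(views[0][0])
    composite = []
    for y in range(height):
        row = []
        for x in range(width):
            # Keep only the views in which this pixel is visible.
            samples = [v[y][x] for v in views if v[y][x] is not None]
            # A pixel occluded in every view stays unknown.
            row.append(median(samples) if samples else None)
        composite.append(row)
    return composite


# Three 1x3 views of the same label, each missing a different pixel.
views = [
    [[255, None, 0]],
    [[250, 10, None]],
    [[None, 12, 4]],
]
result = composite_views(views)
```

In this toy input, every pixel is visible in at least two views, so the composite has no gaps even though no single view is complete. The resulting image would then be passed to an OCR tool.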

Cite

Text

Agarwal et al. "Towards Automated Transcription of Label Text from Pinned Insect Collections." IEEE/CVF Winter Conference on Applications of Computer Vision, 2018. doi:10.1109/WACV.2018.00027

Markdown

[Agarwal et al. "Towards Automated Transcription of Label Text from Pinned Insect Collections." IEEE/CVF Winter Conference on Applications of Computer Vision, 2018.](https://mlanthology.org/wacv/2018/agarwal2018wacv-automated/) doi:10.1109/WACV.2018.00027

BibTeX

@inproceedings{agarwal2018wacv-automated,
  title     = {{Towards Automated Transcription of Label Text from Pinned Insect Collections}},
  author    = {Agarwal, Nitin and Ferrier, Nicola J. and Hereld, Mark},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2018},
  pages     = {189--198},
  doi       = {10.1109/WACV.2018.00027},
  url       = {https://mlanthology.org/wacv/2018/agarwal2018wacv-automated/}
}