Enhanced Optical Character Recognition by Optical Sensor Combined with BERT and Cosine Similarity Scoring (Student Abstract)

Abstract

Optical character recognition (OCR) is the technology for identifying text characters embedded in images. Conventional OCR models degrade in performance on noisy images. To address this problem, we propose a novel model that combines computer vision using an optical sensor with natural language processing based on bidirectional encoder representations from transformers (BERT) and cosine similarity scoring. The proposed model uses a confidence rate to decide whether to use the optical sensor alone or BERT and cosine similarity scoring combined with the optical sensor. Experimental results show that the proposed model performs approximately 4.34 times better than conventional OCR.
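The confidence-based selection described in the abstract could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the 0.8 threshold, and the candidate list are hypothetical, and character-bigram count vectors stand in for BERT embeddings so that the example runs without external dependencies. In the paper's setting, the candidates would presumably be proposed by BERT.

```python
import math
from collections import Counter


def char_bigrams(text):
    """Count character bigrams of a lowercased, space-padded string.

    Stand-in for a BERT embedding (assumption made for this sketch).
    """
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_text(ocr_text, confidence, candidates, threshold=0.8):
    """Return the raw OCR text when confidence is high enough; otherwise
    return the candidate most similar to the OCR text.

    `threshold` and `candidates` are hypothetical parameters.
    """
    if confidence >= threshold or not candidates:
        return ocr_text
    ocr_vec = char_bigrams(ocr_text)
    return max(
        candidates,
        key=lambda c: cosine_similarity(ocr_vec, char_bigrams(c)),
    )
```

For instance, `select_text("he1lo", 0.3, ["hello", "world"])` falls below the threshold and picks the candidate closest to the noisy reading, while a call with confidence 0.95 returns the OCR text unchanged.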

Cite

Text

Moon et al. "Enhanced Optical Character Recognition by Optical Sensor Combined with BERT and Cosine Similarity Scoring (Student Abstract)." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I21.30483

Markdown

[Moon et al. "Enhanced Optical Character Recognition by Optical Sensor Combined with BERT and Cosine Similarity Scoring (Student Abstract)." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/moon2024aaai-enhanced/) doi:10.1609/AAAI.V38I21.30483

BibTeX

@inproceedings{moon2024aaai-enhanced,
  title     = {{Enhanced Optical Character Recognition by Optical Sensor Combined with BERT and Cosine Similarity Scoring (Student Abstract)}},
  author    = {Moon, Woohyeon and Nengroo, Sarvar Hussain and Kim, Taeyoung and Lee, Jihui and Son, Seungah and Har, Dongsoo},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {23585--23586},
  doi       = {10.1609/AAAI.V38I21.30483},
  url       = {https://mlanthology.org/aaai/2024/moon2024aaai-enhanced/}
}