3e-Solver: An Effortless, Easy-to-Update, and End-to-End Solver with Semi-Supervised Learning for Breaking Text-Based Captchas

Abstract

Text-based captchas are currently the most widely used security mechanism. Because segmentation algorithms are limited and scheme-specific, early segmentation-based attacks cannot handle current captchas with newly introduced security features (e.g., occluding lines and overlapping characters). Recent works have designed captcha solvers based on deep learning, whose powerful feature extraction gives them greater generality and higher accuracy. However, these works still suffer from two main intrinsic limitations: (1) labeling the training data requires substantial manual effort, and (2) the solver cannot be updated with unlabeled data to recognize captchas more accurately. In this paper, we present a novel solver that uses an improved FixMatch for semi-supervised captcha recognition to tackle these problems. Specifically, we first build an end-to-end baseline model that effectively breaks text-based captchas by leveraging an encoder-decoder architecture and an attention mechanism. We then train our solver on a few labeled samples and many unlabeled samples using the improved FixMatch, which introduces teacher forcing, adaptive batch normalization, and a consistency loss to achieve more effective training. Experimental results show that our solver outperforms state-of-the-art methods by a large margin on current captcha schemes. We hope that our work can help security experts revisit the design and usability of text-based captchas. The source code of this work is available at https://github.com/SJTU-dxw/3E-Solver-CAPTCHA.
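
For readers unfamiliar with FixMatch, the following is a minimal sketch of the standard (unmodified) FixMatch unlabeled-data loss, applied per character position. It is not the paper's improved variant: teacher forcing, adaptive batch normalization, and the paper's consistency loss are not shown. The function names, the per-character treatment, and the 0.95 confidence threshold are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Standard FixMatch loss for one unlabeled captcha (hypothetical helper).

    weak_logits / strong_logits: one list of class logits per character
    position, predicted from a weakly and a strongly augmented view of
    the same image.
    Returns (masked cross-entropy averaged over all positions,
             fraction of positions that received a pseudo-label).
    """
    n = len(weak_logits)
    if n == 0:
        return 0.0, 0.0
    total, kept = 0.0, 0
    for weak, strong in zip(weak_logits, strong_logits):
        weak_probs = softmax(weak)
        confidence = max(weak_probs)
        if confidence < threshold:
            continue  # low confidence: this position contributes no loss
        pseudo_label = weak_probs.index(confidence)
        # Cross-entropy of the strong-view prediction against the pseudo-label.
        total += -math.log(softmax(strong)[pseudo_label])
        kept += 1
    return total / n, kept / n


# Two character positions, three classes each (toy values).
weak = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
strong = [[5.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
loss, kept_fraction = fixmatch_unlabeled_loss(weak, strong)
print(loss, kept_fraction)
```

In the toy call, only the first position passes the confidence threshold, so only that position contributes to the loss. In a full training loop this term would be added, with a weighting factor, to the ordinary supervised loss on the labeled samples.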

Cite

Text

Deng et al. "3e-Solver: An Effortless, Easy-to-Update, and End-to-End Solver with Semi-Supervised Learning for Breaking Text-Based Captchas." International Joint Conference on Artificial Intelligence, 2022. doi:10.24963/IJCAI.2022/530

Markdown

[Deng et al. "3e-Solver: An Effortless, Easy-to-Update, and End-to-End Solver with Semi-Supervised Learning for Breaking Text-Based Captchas." International Joint Conference on Artificial Intelligence, 2022.](https://mlanthology.org/ijcai/2022/deng2022ijcai-e/) doi:10.24963/IJCAI.2022/530

BibTeX

@inproceedings{deng2022ijcai-e,
  title     = {{3e-Solver: An Effortless, Easy-to-Update, and End-to-End Solver with Semi-Supervised Learning for Breaking Text-Based Captchas}},
  author    = {Deng, Xianwen and Zhao, Ruijie and Wang, Yanhao and Chen, Libo and Wang, Yijun and Xue, Zhi},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {3817--3824},
  doi       = {10.24963/IJCAI.2022/530},
  url       = {https://mlanthology.org/ijcai/2022/deng2022ijcai-e/}
}