TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models. (arXiv:2109.10282v4 [cs.CL] UPDATED)
cs.CL updates on arXiv.org
Text recognition is a long-standing research problem in document
digitization. Existing approaches are usually built on CNNs for image
understanding and RNNs for character-level text generation. In addition, a
separate language model is usually needed as a post-processing step to
improve overall accuracy. In this paper, we propose an end-to-end text
recognition approach based on pre-trained image Transformer and text
Transformer models, namely TrOCR, which leverages the Transformer
architecture for both image understanding and wordpiece-level text
generation. The TrOCR model …
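The abstract contrasts character-level generation with wordpiece-level generation. The sketch below illustrates that difference on a single word, using a greedy longest-match-first split of the kind found in BERT-style WordPiece tokenizers. The vocabulary `TOY_VOCAB` and the helpers `wordpiece_split` and `char_split` are hypothetical names introduced for illustration only; this is not the tokenizer TrOCR itself uses.

```python
# Toy vocabulary (an assumption for illustration); "##" marks a piece
# that continues a word rather than starting one.
TOY_VOCAB = {"re", "##cog", "##ni", "##tion", "text", "[UNK]"}


def wordpiece_split(word, vocab=TOY_VOCAB, unk="[UNK]"):
    """Split one word into subword pieces by greedy longest-match-first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece covers this position: the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


def char_split(word):
    """Character-level units: one output step per character."""
    return list(word)


word = "recognition"
char_units = char_split(word)        # 11 units, one per character
piece_units = wordpiece_split(word)  # ['re', '##cog', '##ni', '##tion']
```

With this toy vocabulary, a decoder would emit 4 units for "recognition" instead of 11, which is the practical appeal of subword-level output: shorter sequences with units that carry more linguistic content than single characters.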