Aug. 18, 2022, 1:12 a.m. | Minghao Li, Tengchao Lv, Jingye Chen, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei

cs.CV updates on arXiv.org

Text recognition is a long-standing research problem for document
digitalization. Existing approaches are usually built on a CNN for image
understanding and an RNN for character-level text generation. In addition, a
separate language model is usually needed as a post-processing step to improve
overall accuracy. In this paper, we propose TrOCR, an end-to-end text
recognition approach with pre-trained image Transformer and text Transformer
models, which leverages the Transformer architecture for both image
understanding and wordpiece-level text generation. The TrOCR model …
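
To make the contrast between char-level and wordpiece-level generation concrete, here is a minimal sketch of a generic greedy longest-match WordPiece-style split. It is an illustration only: the function name `wordpiece_tokenize`, the `##` continuation prefix, and the tiny `TOY_VOCAB` are assumptions made for this example, and this is not necessarily the tokenizer that TrOCR itself uses.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of a single word into wordpieces.

    Pieces that continue a word carry a "##" prefix (a common convention,
    assumed here for illustration).
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No vocabulary entry covers this position: give up on the word.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, not taken from any real model.
TOY_VOCAB = {"recog", "##nition", "text", "##s"}

# A char-level decoder emits one unit per character ...
char_units = list("recognition")
# ... while a wordpiece-level decoder emits one unit per subword.
wordpiece_units = wordpiece_tokenize("recognition", TOY_VOCAB)

print(len(char_units), char_units)
print(len(wordpiece_units), wordpiece_units)
```

With this toy vocabulary the same word needs far fewer decoding steps at the wordpiece level than at the character level, which is one reason a decoder might generate subword units rather than single characters.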

Tags: arxiv, character recognition, optical character recognition, pre-trained models, transformer
