Web: http://arxiv.org/abs/2111.11011

June 23, 2022, 1:13 a.m. | Tianlun Zheng, Zhineng Chen, Shancheng Fang, Hongtao Xie, Yu-Gang Jiang

cs.CV updates on arXiv.org

The Transformer-based encoder-decoder framework is becoming popular in scene
text recognition, largely because it naturally integrates recognition clues
from both visual and semantic domains. However, recent studies show that the
two kinds of clues are not always well registered, and therefore features and
characters might be misaligned in difficult text (e.g., with rare shapes).
As a result, constraints such as character position are introduced to alleviate
this problem. Despite certain success, a content-free positional embedding
hardly associates stably with …

