Web: http://arxiv.org/abs/2111.11011

June 23, 2022, 1:13 a.m. | Tianlun Zheng, Zhineng Chen, Shancheng Fang, Hongtao Xie, Yu-Gang Jiang

cs.CV updates on arXiv.org

The Transformer-based encoder-decoder framework is becoming popular in scene
text recognition, largely because it naturally integrates recognition clues
from both visual and semantic domains. However, recent studies show that the
two kinds of clues are not always well registered, so features and characters
may be misaligned in difficult text (e.g., text with rare shapes).
As a result, constraints such as character position are introduced to alleviate
this problem. Despite some success, a content-free positional embedding
hardly associates stably with …


