Web: http://arxiv.org/abs/2107.06777

Jan. 26, 2022, 2:10 a.m. | Christian Bartz, Hendrik Rätz, Jona Otholt, Christoph Meinel, Haojin Yang

cs.CV updates on arXiv.org

One of the most pressing problems in the automated analysis of historical
documents is the limited availability of annotated training data. Labeling
samples is time-consuming because it requires human expertise and thus cannot
be automated well. In this work, we propose a novel method to
construct synthetic labeled datasets for historical documents where no
annotations are available. We train a StyleGAN model to synthesize document
images that capture the core features of the original documents. …

