July 29, 2022, 1:11 a.m. | Song Tao, Zijian Wang, Tiantian Fan, Canjie Luo, Can Huang

cs.CL updates on arXiv.org arxiv.org

Due to the complex layouts of documents, it is challenging to extract
information from them. Most previous studies develop multimodal pre-trained
models in a self-supervised way. In this paper, we focus on the embedding
learning of word blocks containing text and layout information, and propose
UTel, a language model with Unified TExt and Layout pre-training. Specifically,
we propose two pre-training tasks: Surrounding Word Prediction (SWP) for
layout learning, and Contrastive learning of Word Embeddings (CWE) for
identifying different word …
