July 20, 2022, 1:12 a.m. | Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei

cs.CL updates on arXiv.org

Self-supervised pre-training techniques have achieved remarkable progress in
Document AI. Most multimodal pre-trained models use a masked language modeling
objective to learn bidirectional representations on the text modality, but they
differ in pre-training objectives for the image modality. This discrepancy adds
difficulty to multimodal representation learning. In this paper, we propose
LayoutLMv3 to pre-train multimodal Transformers for Document AI with
unified text and image masking. Additionally, LayoutLMv3 is pre-trained with a
word-patch alignment objective to learn cross-modal alignment by predicting …
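The pre-trained model is available through Hugging Face Transformers. Below is a minimal sketch (not the authors' training code) showing how a released LayoutLMv3 checkpoint can be loaded and run on a document image with user-supplied words and layout boxes; the checkpoint name "microsoft/layoutlmv3-base" and the dummy words/boxes are illustrative assumptions.

```python
# Hedged usage sketch: load a LayoutLMv3 checkpoint and encode a toy document.
from PIL import Image
from transformers import AutoProcessor, AutoModel

# apply_ocr=False means we pass our own words and bounding boxes instead of running OCR.
processor = AutoProcessor.from_pretrained("microsoft/layoutlmv3-base", apply_ocr=False)
model = AutoModel.from_pretrained("microsoft/layoutlmv3-base")

# A dummy white page with two words and their bounding boxes on a 0-1000 normalized scale.
image = Image.new("RGB", (224, 224), color="white")
words = ["Invoice", "Total"]
boxes = [[10, 10, 120, 40], [10, 60, 90, 90]]

encoding = processor(image, words, boxes=boxes, return_tensors="pt")
outputs = model(**encoding)

# Joint sequence of text-token and image-patch representations from the multimodal Transformer.
print(outputs.last_hidden_state.shape)
```

For downstream Document AI tasks (e.g. form understanding or document classification), the same processor is typically paired with a task head such as a token- or sequence-classification variant of the model.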

