June 16, 2022, 1:13 a.m. | Sayna Ebrahimi, Sercan O. Arik, Tomas Pfister

cs.CV updates on arXiv.org

Self-supervised pretraining has produced transferable representations for a
range of visual document understanding (VDU) tasks. However, the ability of
such representations to adapt to distribution shifts at test time has not yet
been studied. We propose DocTTA, a novel test-time adaptation approach for
documents that leverages cross-modality self-supervised learning via masked
visual language modeling, together with pseudo labeling, to adapt models
trained on a source domain to an unlabeled target domain at test time. We also
introduce new …
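To make the adaptation idea concrete, below is a minimal, hypothetical sketch of the pseudo-labeling half of such a test-time loop: predict on an unlabeled target batch, keep only high-confidence predictions as pseudo labels, and take a gradient step on their cross-entropy. The toy linear model, shapes, and thresholds are illustrative assumptions, not the paper's architecture, and the cross-modal masked visual language modeling loss is omitted.

```python
import numpy as np

# Toy sketch of test-time adaptation via pseudo labeling
# (hypothetical model and data; the real DocTTA method also uses a
# masked visual language modeling loss on document images).

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def adapt_step(W, X, lr=0.1, conf_thresh=0.8):
    """One pseudo-labeling step: predict on the unlabeled target batch,
    keep confident predictions as pseudo labels, and take a gradient
    step on the cross-entropy of that confident subset."""
    probs = softmax(X @ W)                  # (n, k) class probabilities
    keep = probs.max(axis=1) >= conf_thresh # confident subset only
    if not keep.any():
        return W
    y = probs[keep].argmax(axis=1)          # pseudo labels
    onehot = np.eye(W.shape[1])[y]
    grad = X[keep].T @ (probs[keep] - onehot) / keep.sum()
    return W - lr * grad

# Unlabeled "target domain" batch: two well-separated clusters.
X = np.vstack([rng.normal(-2, 0.5, size=(50, 2)),
               rng.normal(+2, 0.5, size=(50, 2))])

# Weak "source" model: roughly the right boundary, low confidence.
W = np.array([[0.2, -0.2], [0.2, -0.2]])

for _ in range(20):
    W = adapt_step(W, X)

preds = softmax(X @ W).argmax(axis=1)
```

Because no target labels exist at test time, the loop relies on the source model being confident and mostly correct on part of the target batch; the confidence threshold guards against reinforcing noisy predictions.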

